Actions
Bug #19427
closedcommon/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size())
% Done:
0%
Source:
Tags:
Backport:
jewel, kraken
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
-35> 2017-03-29 22:36:36.972567 7f2538e96700 20 osd.2 756 should_share_map osd.1 172.21.15.117:6814/13151 756 -34> 2017-03-29 22:36:36.972564 7f2546eb2700 1 -- 172.21.15.117:6808/1991 <== mon.2 172.21.15.117:6790/0 3 ==== auth_reply(proto 2 0 (0) Success) v1 ==== 580+0+0 (1929996202 0 0) 0x565217318800 con 0x56521713b3c0 -33> 2017-03-29 22:36:36.972571 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] handle_message: 0x565216b7db00 -32> 2017-03-29 22:36:36.972575 7f2538e96700 15 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] do_sub_op osd_sub_op(unknown.0.0:0 0.28 MIN [scrub-map] v 0'0 snapset=0=[]:[]) v12 -31> 2017-03-29 22:36:36.972579 7f2538e96700 7 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] sub_op_scrub_map -30> 2017-03-29 22:36:36.972582 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] got 1 scrub map -29> 2017-03-29 22:36:36.972586 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] map version is 0'0 -28> 2017-03-29 22:36:36.972589 7f2538e96700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] requeue_scrub: queueing -27> 2017-03-29 22:36:36.972591 7f2538e96700 20 osd.2 op_wq(0) _enqueue 0.28 PGQueueable(PGScrub prio 5 cost 52428800 e756) -26> 2017-03-29 22:36:36.972596 7f2538e96700 10 osd.2 756 dequeue_op 0x565216b7db00 finish -25> 2017-03-29 22:36:36.972600 7f253b69b700 20 osd.2 op_wq(0) _process 0.28 item PGQueueable(PGScrub prio 5 cost 52428800 e756) queued -24> 2017-03-29 22:36:36.972602 7f2538e96700 20 osd.2 op_wq(0) _process empty q, waiting -23> 2017-03-29 22:36:36.972608 7f253b69b700 20 osd.2 op_wq(0) _process 0.28 item PGQueueable(PGScrub prio 5 cost 52428800 e756) pg 0x565215be2000 -22> 2017-03-29 22:36:36.972608 7f2546eb2700 1 -- 172.21.15.117:6808/1991 mark_down 0x565217138000 -- 0x56521562b400 -21> 2017-03-29 22:36:36.972610 7f253b69b700 20 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub state WAIT_REPLICAS -20> 2017-03-29 22:36:36.972614 7f253b69b700 20 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub state COMPARE_MAPS -19> 2017-03-29 22:36:36.972617 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps has maps, analyzing -18> 2017-03-29 22:36:36.972619 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps comparing replica scrub maps -17> 2017-03-29 22:36:36.972622 7f253b69b700 2 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps osd.2 has 0 items -16> 2017-03-29 22:36:36.972625 7f253b69b700 2 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_compare_maps replica 1 has 0 items -15> 2017-03-29 22:36:36.972629 7f253b69b700 2 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] -14> 2017-03-29 22:36:36.972630 7f2546eb2700 1 monclient: found mon.c -13> 2017-03-29 22:36:36.972632 7f2546eb2700 10 monclient: _send_mon_message to mon.c at 172.21.15.117:6790/0 -12> 2017-03-29 22:36:36.972632 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_snapshot_metadata -11> 2017-03-29 22:36:36.972633 7f2546eb2700 1 -- 172.21.15.117:6808/1991 --> 172.21.15.117:6790/0 -- mon_subscribe({mgrmap=0+,monmap=3+,osd_pg_creates=0+,osdmap=757}) v2 -- ?+0 0x565216867400 con 0x56521713b3c0 -10> 2017-03-29 22:36:36.972636 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_snapshot_metadata (scrub) finish -9> 2017-03-29 22:36:36.972639 7f2546eb2700 10 monclient: _send_mon_message to mon.c at 172.21.15.117:6790/0 -8> 2017-03-29 22:36:36.972642 7f2546eb2700 1 -- 172.21.15.117:6808/1991 --> 172.21.15.117:6790/0 -- pg_stats(39 pgs tid 177 v 0) v1 -- ?+0 0x565215f23e40 con 0x56521713b3c0 -7> 2017-03-29 22:36:36.972640 7f253b69b700 20 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub state FINISH -6> 2017-03-29 22:36:36.972645 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub got 0/0 objects, 0/0 clones, 0/0 dirty, 0/0 omap, 0/0 pinned, 0/0 hit_set_archive, 0/0 bytes, 0/0 hit_set_archive bytes. -5> 2017-03-29 22:36:36.972654 7f2546eb2700 10 monclient: _send_mon_message to mon.c at 172.21.15.117:6790/0 -4> 2017-03-29 22:36:36.972652 7f253b69b700 10 osd.2 pg_epoch: 756 pg[0.28( empty local-les=727 n=0 ec=47 les/c/f 727/727/0 725/725/163) [2,1] r=0 lpr=725 crt=0'0 mlcod 0'0 active+clean+scrubbing] scrub_process_inconsistent: checking authoritative -3> 2017-03-29 22:36:36.972656 7f2546eb2700 1 -- 172.21.15.117:6808/1991 --> 172.21.15.117:6790/0 -- pg_stats(39 pgs tid 178 v 0) v1 -- ?+0 0x5652197f19c0 con 0x56521713b3c0 -2> 2017-03-29 22:36:36.972659 7f253b69b700 0 log_channel(cluster) log [INF] : 0.28 scrub ok -1> 2017-03-29 22:36:36.972660 7f2546eb2700 10 log_client log_queue is 3 last_log 600 sent 596 num 3 unsent 4 sending 4 0> 2017-03-29 22:36:36.976280 7f2546eb2700 -1 /build/ceph-12.0.0-2001-gcc51f56/src/common/LogClient.cc: In function 'Message* LogClient::_get_mon_log_message()' thread 7f2546eb2700 time 2017-03-29 22:36:36.972662 /build/ceph-12.0.0-2001-gcc51f56/src/common/LogClient.cc: 310: FAILED assert(num_unsent <= log_queue.size()) ceph version 12.0.0-2001-gcc51f56 (cc51f56bb2c153bf35078eb95311e1d35c118c9f) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x56520b4d28b2] 2: (LogClient::_get_mon_log_message()+0x13e5) [0x56520b482d25] 3: (LogClient::get_mon_log_message()+0x1f) [0x56520b482ecf] 4: (MonClient::send_log()+0x15) [0x56520b4e64a5] 5: (MonClient::handle_auth(MAuthReply*)+0x348) [0x56520b4ef488] 6: (MonClient::ms_dispatch(Message*)+0x39b) [0x56520b4efa9b] 7: (DispatchQueue::entry()+0xf4a) [0x56520b69e94a] 8: (DispatchQueue::DispatchThread::entry()+0xd) [0x56520b55a33d] 9: (()+0x770a) [0x7f25590bd70a] 10: (clone()+0x6d) [0x7f255813482d] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
this was on wip-mgr-commands, but no changes were made the osd or logclient, so I'm guessing this is related to the monclient multiconnection refactor?
/a/sage-2017-03-29_21:10:58-rados-wip-mgr-commands---basic-smithi/962237
Actions