Bug #4837
closed
mon: FAILED assert(!(sync_role & SYNC_ROLE_REQUESTER))
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I just upgraded 3 monitors from 0.56.4 to 0.60 (next branch) and saw a monitor crash when I ran:
$ ceph osd unset noout
I upgraded the three mons to 0.60 first, and once that was done I set noout:
$ ceph osd set noout
After all OSDs were up and running again I wanted to unset the flag, which caused the monitor to crash.
The logs showed:
-11> 2013-04-26 21:33:07.030062 7f52043f5700 2 -- 10.23.24.8:6789/0 >> 10.23.24.58:6812/30099 pipe(0x312ff280 sd=7 :6789 s=2 pgs=117 cs=1 l=1).fault 0: Success
-10> 2013-04-26 21:33:07.056466 7f52043f5700 1 -- 10.23.24.8:6789/0 >> :/0 pipe(0x2c9f8500 sd=7 :6789 s=0 pgs=0 cs=0 l=0).accept sd=7 10.23.24.57:53299/0
-9> 2013-04-26 21:33:07.077770 7f52029e2700 1 -- 10.23.24.8:6789/0 >> :/0 pipe(0x389ab000 sd=30 :6789 s=0 pgs=0 cs=0 l=0).accept sd=30 10.23.24.58:56868/0
-8> 2013-04-26 21:33:07.148088 7f5207315700 1 -- 10.23.24.8:6789/0 --> mon.1 10.23.24.9:6789/0 -- mon_sync( start ) v1 -- ?+0 0x26da580
-7> 2013-04-26 21:33:07.148337 7f5207315700 1 -- 10.23.24.8:6789/0 <== mon.2 10.23.24.10:6789/0 119568 ==== mon_probe(reply 4a5a7836-1d75-4e09-b3dc-80e5ae13d681 name mon3 quorum 0,1,2 paxos( fc 175909 lc 175933 )) v4 ==== 579+0+0 (2730281254 0 0) 0x7a95900 con 0x278f9a0
-6> 2013-04-26 21:33:07.148442 7f5207315700 1 -- 10.23.24.8:6789/0 <== mon.0 10.23.24.8:6789/0 0 ==== mon_sync( start_reply ) v1 ==== 0+0+0 (0 0 0) 0x2e7252c0 con 0x278f2c0
-5> 2013-04-26 21:33:07.148541 7f5207315700 1 -- 10.23.24.8:6789/0 --> mon.0 10.23.24.8:6789/0 -- mon_sync( heartbeat ) v1 -- ?+0 0x2ef7f580
-4> 2013-04-26 21:33:07.148586 7f5207315700 1 -- 10.23.24.8:6789/0 --> mon.0 10.23.24.8:6789/0 -- mon_sync( start_chunks ) v1 -- ?+0 0x2ef7f840
-3> 2013-04-26 21:33:07.148622 7f5207315700 1 -- 10.23.24.8:6789/0 <== mon.0 10.23.24.8:6789/0 0 ==== mon_sync( heartbeat ) v1 ==== 0+0+0 (0 0 0) 0x2ef7f580 con 0x278f2c0
-2> 2013-04-26 21:33:07.148667 7f5207315700 1 mon.mon1@0(synchronizing sync( requester state chunks )) e1 handle_sync_heartbeat ignored stray message mon_sync( heartbeat ) v1
-1> 2013-04-26 21:33:07.148698 7f5207315700 1 -- 10.23.24.8:6789/0 <== mon.0 10.23.24.8:6789/0 0 ==== mon_sync( start_chunks ) v1 ==== 0+0+0 (0 0 0) 0x2ef7f840 con 0x278f2c0
0> 2013-04-26 21:33:07.150441 7f5207315700 -1 mon/Monitor.cc: In function 'void Monitor::handle_sync_start_chunks(MMonSync*)' thread 7f5207315700 time 2013-04-26 21:33:07.148735
mon/Monitor.cc: 1136: FAILED assert(!(sync_role & SYNC_ROLE_REQUESTER))
ceph version 0.60-669-ga2a23cc (a2a23ccd959f6e7ebe1533b27e7320902624523b)
1: (Monitor::handle_sync_start_chunks(MMonSync*)+0x84d) [0x4b060d]
2: (Monitor::handle_sync(MMonSync*)+0x2cb) [0x4c1b4b]
3: (Monitor::_ms_dispatch(Message*)+0xd90) [0x4c2ae0]
4: (Monitor::ms_dispatch(Message*)+0x32) [0x4d96f2]
5: (DispatchQueue::entry()+0x3f1) [0x6aef01]
6: (DispatchQueue::DispatchThread::entry()+0xd) [0x63d55d]
7: (()+0x7e9a) [0x7f520c078e9a]
8: (clone()+0x6d) [0x7f520a628ccd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
The full log is attached.
I set the target version to 0.61, since this happened on the next branch, which will soon become the cuttlefish release.
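For anyone reading the assertion cold: it looks like a bit-flag test on the monitor's sync role. The log shows the monitor in the requester state ("synchronizing sync( requester state chunks )") when mon_sync( start_chunks ) arrives, which is exactly the case the check rejects. Below is a minimal Python sketch of that condition. It is not taken from the Ceph source. The flag values, the LEADER and PROVIDER names, and the can_handle_start_chunks helper are all hypothetical. Only sync_role and SYNC_ROLE_REQUESTER appear in the log.

```python
from enum import IntFlag


class SyncRole(IntFlag):
    # Hypothetical flag values for illustration only; the real
    # values in mon/Monitor.cc may differ.
    NONE = 0
    LEADER = 1
    PROVIDER = 2
    REQUESTER = 4


def can_handle_start_chunks(sync_role: SyncRole) -> bool:
    """Mirror the asserted condition: !(sync_role & SYNC_ROLE_REQUESTER)."""
    return not (sync_role & SyncRole.REQUESTER)


if __name__ == "__main__":
    # A monitor that is itself requesting a sync fails the check,
    # which corresponds to the crash reported here.
    print(can_handle_start_chunks(SyncRole.REQUESTER))
    # A monitor without the requester bit set passes it.
    print(can_handle_start_chunks(SyncRole.PROVIDER))
```

If this reading is right, the interesting question is why a monitor in the requester role ends up receiving start_chunks at all, not the assertion itself.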
Files