Bug #1708
mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending_inc.version)
Status: closed
Description
Running ceph version from git: a3dd5bd67ba19aae51a51318138ef10213a91449
Slaves are all Ubuntu 11.10, kernel 3.0.0-12
Filesystem is ext4
I have a 3-slave cluster, with each slave running an osd, mds, and mon. I had a qemu instance using rbd across the cluster and was testing failover, using /etc/init.d/ceph stop/start to stop and start individual nodes. It worked a few times, but then at one point the mon process on one of the slaves crashed.
The mon.0.log is attached.
The assert and backtrace:
2011-11-10 16:34:46.211114 7ffc6d4d0700 mon.0@0(leader) e1 handle_command mon_command(health v 0) v1
2011-11-10 16:34:52.792849 7ffc6cccf700 log [INF] : mon.0@0 won leader election with quorum 0,1
2011-11-10 16:34:58.979151 7ffc6d4d0700 log [INF] : mds.? 192.168.122.74:6800/9301 up:boot
2011-11-10 16:34:58.983151 7ffc6d4d0700 mon.0@0(leader) e1 handle_command mon_command(health v 0) v1
mon/PGMonitor.cc: In function 'virtual void PGMonitor::encode_pending(ceph::bufferlist&)', in thread '7ffc6d4d0700'
mon/PGMonitor.cc: 218: FAILED assert(paxos->get_version() + 1 == pending_inc.version)
ceph version 0.37-364-ga3dd5bd (commit:a3dd5bd67ba19aae51a51318138ef10213a91449)
1: (PGMonitor::encode_pending(ceph::buffer::list&)+0x108) [0x4c9fd8]
2: (PaxosService::propose_pending()+0xd2) [0x48e532]
3: (PGMonitor::check_osd_map(unsigned int)+0xca0) [0x4d0520]
4: (Context::complete(int)+0xa) [0x478ffa]
5: (finish_contexts(CephContext*, std::list<Context*, std::allocator<Context*> >&, int)+0xca) [0x47a52a]
6: (Paxos::handle_accept(MMonPaxos*)+0x5d8) [0x488948]
7: (Paxos::dispatch(PaxosServiceMessage*)+0x23b) [0x48b66b]
8: (Monitor::_ms_dispatch(Message*)+0xb99) [0x478409]
9: (Monitor::ms_dispatch(Message*)+0x35) [0x4832b5]
10: (SimpleMessenger::dispatch_entry()+0x84b) [0x57612b]
11: (SimpleMessenger::DispatchThread::entry()+0x1c) [0x462e9c]
12: (()+0x7efc) [0x7ffc70d95efc]
13: (clone()+0x6d) [0x7ffc6f7cf89d]
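For anyone reading the assert cold: the condition requires the pending increment's version to be exactly one past the version Paxos last committed. The toy model below is only a sketch of that relationship. ToyPaxos, ToyIncremental, ToyService and their members are hypothetical stand-ins, not the real Ceph classes, and the sketch makes no claim about what actually caused this crash.

```cpp
#include <cstdint>

// Hypothetical stand-in for the Paxos instance: tracks the last committed version.
struct ToyPaxos {
    std::uint64_t committed = 0;
    std::uint64_t get_version() const { return committed; }
};

// Hypothetical stand-in for a pending incremental update.
struct ToyIncremental {
    std::uint64_t version = 0;
};

// Hypothetical service that stages one increment on top of the committed state.
struct ToyService {
    ToyPaxos paxos;
    ToyIncremental pending_inc;

    // Stage a fresh increment targeting the next version.
    void create_pending() { pending_inc.version = paxos.get_version() + 1; }

    // The same comparison that the failed assert performs.
    bool pending_is_consistent() const {
        return paxos.get_version() + 1 == pending_inc.version;
    }

    // Commit the staged increment; the old pending_inc is now stale
    // until create_pending() is called again.
    void commit_pending() { paxos.committed = pending_inc.version; }
};
```

In this toy model, the check holds right after create_pending(), and fails if the committed version moves forward while an older pending increment is still around.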