Bug #5250
closed
ceph-mds 0.61.2 aborts on start
Description
After rebooting the whole cluster using the "shut the breaker off" method, I had some BTRFS corruption, which was fixed using btrfs scrub, and OSD corruption, which was fixed using pg repair. Afterward, the cluster was HEALTH_WARN with all PGs active+clean. I then tried starting the MDS, and it aborts.
2013-06-04 18:07:46.983018 7fadd60617c0 0 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mds, pid 11826
starting mds.1 at :/0
2013-06-04 18:07:47.387643 7fadd0aa6700 1 mds.-1.0 handle_mds_map standby
2013-06-04 18:07:47.704154 7fadd0aa6700 1 mds.0.45 handle_mds_map i am now mds.0.45
2013-06-04 18:07:47.704159 7fadd0aa6700 1 mds.0.45 handle_mds_map state change up:standby --> up:replay
2013-06-04 18:07:47.704171 7fadd0aa6700 1 mds.0.45 replay_start
2013-06-04 18:07:47.704190 7fadd0aa6700 1 mds.0.45 recovery set is
2013-06-04 18:07:47.704193 7fadd0aa6700 1 mds.0.45 need osdmap epoch 1710, have 1709
2013-06-04 18:07:47.704196 7fadd0aa6700 1 mds.0.45 waiting for osdmap 1710 (which blacklists prior instance)
2013-06-04 18:07:47.711616 7fadd0aa6700 0 mds.0.cache creating system inode with ino:100
2013-06-04 18:07:47.712032 7fadd0aa6700 0 mds.0.cache creating system inode with ino:1
mds/journal.cc: In function 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)' thread 7fadcd594700 time 2013-06-04 18:07:49.260303
mds/journal.cc: 1170: FAILED assert(in->first == p->dnfirst || (in->is_multiversion() && in->first > p->dnfirst))
ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)
1: (EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)+0x3af5) [0x523235]
2: (EUpdate::replay(MDS*)+0x3a) [0x52bd9a]
3: (MDLog::_replay_thread()+0x5cf) [0x6f42cf]
4: (MDLog::ReplayThread::entry()+0xd) [0x505cbd]
5: (()+0x7f8e) [0x7fadd5c3ef8e]
6: (clone()+0x6d) [0x7fadd455be1d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2013-06-04 18:07:49.261299 7fadcd594700 -1 mds/journal.cc: In function 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)' thread 7fadcd594700 time 2013-06-04 18:07:49.260303
mds/journal.cc: 1170: FAILED assert(in->first == p->dnfirst || (in->is_multiversion() && in->first > p->dnfirst))
ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)
1: (EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)+0x3af5) [0x523235]
2: (EUpdate::replay(MDS*)+0x3a) [0x52bd9a]
3: (MDLog::_replay_thread()+0x5cf) [0x6f42cf]
4: (MDLog::ReplayThread::entry()+0xd) [0x505cbd]
5: (()+0x7f8e) [0x7fadd5c3ef8e]
6: (clone()+0x6d) [0x7fadd455be1d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
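For anyone reading the trace, here is a rough model of the condition that the failed assert checks during journal replay. It is only a sketch, assuming the first clause is an equality check between `in->first` and `p->dnfirst`. The `Inode` class, the `replay_invariant_holds` function, and the integer values are hypothetical stand-ins for illustration, not Ceph code.

```python
from dataclasses import dataclass


@dataclass
class Inode:
    """Hypothetical stand-in for the in-memory inode seen during replay."""
    first: int          # stands in for in->first
    multiversion: bool  # stands in for in->is_multiversion()


def replay_invariant_holds(inode: Inode, dnfirst: int) -> bool:
    """Model of the asserted condition (assumes '==' in the first clause).

    The check passes when the inode's 'first' matches the journaled
    dentry's 'dnfirst', or when the inode is multiversion and its
    'first' is strictly greater than 'dnfirst'.
    """
    return inode.first == dnfirst or (
        inode.multiversion and inode.first > dnfirst
    )


if __name__ == "__main__":
    # Illustrative values only; none of these come from the cluster.
    print(replay_invariant_holds(Inode(first=2, multiversion=False), 2))
    print(replay_invariant_holds(Inode(first=5, multiversion=True), 2))
    print(replay_invariant_holds(Inode(first=5, multiversion=False), 2))
```

Under this reading, the abort means the journal entry being replayed carried a `dnfirst` that matched neither case for the inode already in cache, which would be consistent with metadata that went out of sync during the unclean shutdown.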
Files