Bug #5250

closed

ceph-mds 0.61.2 aborts on start

Added by Jérôme Poulin almost 11 years ago. Updated almost 8 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After rebooting the whole cluster using the "shut the breaker off" method, I had some BTRFS corruption, which was fixed using btrfs scrub, and OSD corruption, which was fixed using pg repair. Afterward, the cluster was HEALTH_WARN with all PGs active+clean. I then tried starting the MDS, and it aborted.

2013-06-04 18:07:46.983018 7fadd60617c0 0 ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60), process ceph-mds, pid 11826
starting mds.1 at :/0
2013-06-04 18:07:47.387643 7fadd0aa6700 1 mds.-1.0 handle_mds_map standby
2013-06-04 18:07:47.704154 7fadd0aa6700 1 mds.0.45 handle_mds_map i am now mds.0.45
2013-06-04 18:07:47.704159 7fadd0aa6700 1 mds.0.45 handle_mds_map state change up:standby --> up:replay
2013-06-04 18:07:47.704171 7fadd0aa6700 1 mds.0.45 replay_start
2013-06-04 18:07:47.704190 7fadd0aa6700 1 mds.0.45 recovery set is
2013-06-04 18:07:47.704193 7fadd0aa6700 1 mds.0.45 need osdmap epoch 1710, have 1709
2013-06-04 18:07:47.704196 7fadd0aa6700 1 mds.0.45 waiting for osdmap 1710 (which blacklists prior instance)
2013-06-04 18:07:47.711616 7fadd0aa6700 0 mds.0.cache creating system inode with ino:100
2013-06-04 18:07:47.712032 7fadd0aa6700 0 mds.0.cache creating system inode with ino:1
mds/journal.cc: In function 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)' thread 7fadcd594700 time 2013-06-04 18:07:49.260303
mds/journal.cc: 1170: FAILED assert(in->first == p->dnfirst || (in->is_multiversion() && in->first > p->dnfirst))
ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)
1: (EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)+0x3af5) [0x523235]
2: (EUpdate::replay(MDS*)+0x3a) [0x52bd9a]
3: (MDLog::_replay_thread()+0x5cf) [0x6f42cf]
4: (MDLog::ReplayThread::entry()+0xd) [0x505cbd]
5: (()+0x7f8e) [0x7fadd5c3ef8e]
6: (clone()+0x6d) [0x7fadd455be1d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2013-06-04 18:07:49.261299 7fadcd594700 -1 mds/journal.cc: In function 'void EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)' thread 7fadcd594700 time 2013-06-04 18:07:49.260303
mds/journal.cc: 1170: FAILED assert(in->first == p->dnfirst || (in->is_multiversion() && in->first > p->dnfirst))

ceph version 0.61.2 (fea782543a844bb277ae94d3391788b76c5bee60)
1: (EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)+0x3af5) [0x523235]
2: (EUpdate::replay(MDS*)+0x3a) [0x52bd9a]
3: (MDLog::_replay_thread()+0x5cf) [0x6f42cf]
4: (MDLog::ReplayThread::entry()+0xd) [0x505cbd]
5: (()+0x7f8e) [0x7fadd5c3ef8e]
6: (clone()+0x6d) [0x7fadd455be1d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
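
For readers unfamiliar with the failed assertion, the condition it checks can be sketched in isolation as below. This is only an illustration and not the Ceph source: `snapid_t` here is a hypothetical stand-in type, `replay_first_is_consistent` is a made-up helper name, and the `==` comparison between `in->first` and `p->dnfirst` is assumed, since the operator appears to have been lost when the log was pasted.

```cpp
#include <cstdint>

// Hypothetical stand-in for the snapshot id type; not Ceph's real definition.
using snapid_t = std::uint64_t;

// Illustrative helper (name is an assumption) mirroring the asserted condition:
// during journal replay, the cached inode's `first` is expected to equal the
// journaled dentry's `dnfirst`; only a multiversion inode may have a larger
// `first` than the journaled `dnfirst`.
bool replay_first_is_consistent(snapid_t in_first,
                                snapid_t dnfirst,
                                bool is_multiversion) {
    return in_first == dnfirst || (is_multiversion && in_first > dnfirst);
}
```

Under this reading, the abort means replay found an inode whose `first` did not match the journaled `dnfirst` and that did not qualify for the multiversion exception.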

Files

mds.log.xz (7.16 MB) Jérôme Poulin, 06/04/2013 12:34 PM