Bug #4539
include/elist.h: 92: FAILED assert(_head.empty()) from MDLog::standby_trim_segments()
Description
2013-03-24T09:37:07.830 INFO:teuthology.task.ceph.mds.b-s-a.out:starting mds.b-s-a at :/0
2013-03-24T09:37:08.235 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:mds.b-s-a reported in up:standby-replay state
2013-03-24T09:37:08.235 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:waiting for 19 secs before thrashing
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err:./include/elist.h: In function 'elist<T>::~elist() [with T = BacktraceInfo*]' thread 7f3467846700 time 2013-03-24 09:37:20.437604
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err:./include/elist.h: 92: FAILED assert(_head.empty())
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err: ceph version 0.59-478-g8befbca (8befbca77aa50a1188969892aabedaf11d8f8ce7)
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err: 1: (MDLog::standby_trim_segments()+0xce5) [0x6ccec5]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 2: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x4e86b9]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 3: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x6d3210]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 4: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x704a88]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 5: (Objecter::C_Stat::finish(int)+0xc0) [0x705900]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 6: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe38) [0x6f1df8]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 7: (MDS::handle_core_message(Message*)+0xae8) [0x4dc318]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 8: (MDS::_dispatch(Message*)+0x2f) [0x4dc4df]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 9: (MDS::ms_dispatch(Message*)+0x1db) [0x4ddf7b]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 10: (DispatchQueue::entry()+0x341) [0x81f561]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x79c6ad]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 12: (()+0x7e9a) [0x7f346bb9ee9a]
2013-03-24T09:37:24.888 INFO:teuthology.task.ceph.mds.b-s-a.err: 13: (clone()+0x6d) [0x7f346a3574bd]
on job
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: 8befbca77aa50a1188969892aabedaf11d8f8ce7
  s3tests:
    branch: master
  workunit:
    sha1: 8befbca77aa50a1188969892aabedaf11d8f8ce7
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
  - mds.b-s-a
tasks:
- chef: null
- install: null
- ceph: null
- mds_thrash: null
- ceph-fuse: null
- workunit:
    clients:
      all:
      - suites/fsstress.sh
yay, thrasher!
Associated revisions
mds: Clear backtrace updates on standby_trim_seg
If the mds is standby, when a segment is trimmed, we need
to clear the backtrace updates list to avoid the following
assertion when the segment is deleted.
./include/elist.h: 92: FAILED assert(_head.empty())
ceph version 0.59-478-g8befbca (8befbca77aa50a1188969892aabedaf11d8f8ce7)
(MDLog::standby_trim_segments()+0xce5) [0x6ccec5]
(MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x4e86b9]
(Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190)
[0x6d3210]
(Filer::_probed(Filer::Probe*, object_t const&, unsigned long,
utime_t)+0x558) [0x704a88]
(Objecter::C_Stat::finish(int)+0xc0) [0x705900]
(Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe38) [0x6f1df8]
(MDS::handle_core_message(Message*)+0xae8) [0x4dc318]
(MDS::_dispatch(Message*)+0x2f) [0x4dc4df]
(MDS::ms_dispatch(Message*)+0x1db) [0x4ddf7b]
(DispatchQueue::entry()+0x341) [0x81f561]
(DispatchQueue::DispatchThread::entry()+0xd) [0x79c6ad]
(()+0x7e9a) [0x7f346bb9ee9a]
(clone()+0x6d) [0x7f346a3574bd]
Fixes #4539.
Signed-off-by: Sam Lang <sam.lang@inktank.com>
History
#1 Updated by Sage Weil about 11 years ago
also ubuntu@teuthology:/a/sage-2013-03-24_08:29:36-fs-master-testing-basic/2414
#2 Updated by Sage Weil about 11 years ago
ubuntu@teuthology:/a/sage-2013-03-24_08:29:36-fs-master-testing-basic/2410
#3 Updated by Sage Weil about 11 years ago
- Assignee set to Sam Lang
I think this is as simple as
diff --git a/src/mds/MDLog.cc b/src/mds/MDLog.cc
index 7502e68..5389743 100644
--- a/src/mds/MDLog.cc
+++ b/src/mds/MDLog.cc
@@ -622,6 +622,7 @@ void MDLog::standby_trim_segments()
       seg->dirty_dirfrag_dir.clear_list();
       seg->dirty_dirfrag_nest.clear_list();
       seg->dirty_dirfrag_dirfragtree.clear_list();
+      seg->update_backtraces.clear_list();
       remove_oldest_segment();
       removed_segment = true;
     }
unless there is a state bit that goes with membership on that list, in which case that needs to be cleared as well. After this function returns, the cache state should be the same as if we had started replay one segment later in the journal.
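For readers unfamiliar with why the destructor asserts, here is a minimal sketch of the pattern, not the actual Ceph code. ToyItem, ToyList and ToySegment are hypothetical stand-ins for elist<T>::item, elist<T> and the log segment. The only behavior assumed from this report is that the list destructor asserts the head is empty and that clear_list() unlinks every member.

```cpp
#include <cassert>

// Hypothetical intrusive list node: the link lives inside the tracked object.
struct ToyItem {
  ToyItem *prev, *next;
  ToyItem() : prev(this), next(this) {}
  bool empty() const { return next == this; }
  // Unlink this node from whatever list it is on; no-op if already unlinked.
  void remove_myself() {
    if (next == this) return;
    next->prev = prev;
    prev->next = next;
    prev = next = this;
  }
  // Link this node immediately before 'other'.
  void insert_before(ToyItem *other) {
    remove_myself();
    prev = other->prev;
    next = other;
    prev->next = this;
    other->prev = this;
  }
};

// Hypothetical list head: it does not own its members, so destroying it
// while members are still linked would leave them pointing at freed memory.
struct ToyList {
  ToyItem head;
  bool empty() const { return head.empty(); }
  void push_back(ToyItem *i) { i->insert_before(&head); }
  void clear_list() {
    while (!head.empty()) head.next->remove_myself();
  }
  ~ToyList() { assert(head.empty()); }  // same shape as the failing assert
};

// Hypothetical segment with two of the per-segment lists.
struct ToySegment {
  ToyList dirty_items;
  ToyList update_backtraces;
};

// Clearing only some lists leaves the other populated. This returns true,
// then clears the remaining list so the segment can be destroyed safely.
bool backtrace_list_left_populated() {
  ToyItem a, b;
  ToySegment seg;
  seg.dirty_items.push_back(&a);
  seg.update_backtraces.push_back(&b);
  seg.dirty_items.clear_list();
  bool still_linked = !seg.update_backtraces.empty();
  seg.update_backtraces.clear_list();  // omit this and ~ToyList() aborts
  return still_linked;
}

// Trim as in the proposed fix: clear every list, then delete the segment.
bool trim_segment_safely() {
  ToyItem a, b;
  ToySegment *seg = new ToySegment;
  seg->dirty_items.push_back(&a);
  seg->update_backtraces.push_back(&b);
  seg->dirty_items.clear_list();
  seg->update_backtraces.clear_list();
  delete seg;
  return a.empty() && b.empty();
}
```

In this sketch, dropping the second clear_list() call in trim_segment_safely() trips the destructor assert on delete, which mirrors the failure reported above.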
#4 Updated by Sam Lang about 11 years ago
- Status changed from 12 to Fix Under Review
Yep. There's no state bit, and the cache is unchanged by the backtrace updates list. The standby mds is free to clear this list.
I pushed wip-4539 and submitted a pull request.
Sorry - I missed this bug previously, I think because the subject didn't start with 'mds'.
#5 Updated by Sage Weil about 11 years ago
- Status changed from Fix Under Review to Resolved
commit:295c92c
#6 Updated by Greg Farnum over 7 years ago
- Component(FS) MDS added