Bug #4539

include/elist.h: 92: FAILED assert(_head.empty()) from MDLog::standby_trim_segments()

Added by Sage Weil about 11 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2013-03-24T09:37:07.830 INFO:teuthology.task.ceph.mds.b-s-a.out:starting mds.b-s-a at :/0
2013-03-24T09:37:08.235 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:mds.b-s-a reported in up:standby-replay state
2013-03-24T09:37:08.235 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:waiting for 19 secs before thrashing
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err:./include/elist.h: In function 'elist<T>::~elist() [with T = BacktraceInfo*]' thread 7f3467846700 time 2013-03-24 09:37:20.437604
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err:./include/elist.h: 92: FAILED assert(_head.empty())
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err: ceph version 0.59-478-g8befbca (8befbca77aa50a1188969892aabedaf11d8f8ce7)
2013-03-24T09:37:24.885 INFO:teuthology.task.ceph.mds.b-s-a.err: 1: (MDLog::standby_trim_segments()+0xce5) [0x6ccec5]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 2: (MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x4e86b9]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 3: (Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190) [0x6d3210]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 4: (Filer::_probed(Filer::Probe*, object_t const&, unsigned long, utime_t)+0x558) [0x704a88]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 5: (Objecter::C_Stat::finish(int)+0xc0) [0x705900]
2013-03-24T09:37:24.886 INFO:teuthology.task.ceph.mds.b-s-a.err: 6: (Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe38) [0x6f1df8]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 7: (MDS::handle_core_message(Message*)+0xae8) [0x4dc318]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 8: (MDS::_dispatch(Message*)+0x2f) [0x4dc4df]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 9: (MDS::ms_dispatch(Message*)+0x1db) [0x4ddf7b]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 10: (DispatchQueue::entry()+0x341) [0x81f561]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 11: (DispatchQueue::DispatchThread::entry()+0xd) [0x79c6ad]
2013-03-24T09:37:24.887 INFO:teuthology.task.ceph.mds.b-s-a.err: 12: (()+0x7e9a) [0x7f346bb9ee9a]
2013-03-24T09:37:24.888 INFO:teuthology.task.ceph.mds.b-s-a.err: 13: (clone()+0x6d) [0x7f346a3574bd]

on job
machine_type: plana
nuke-on-error: true
overrides:
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 20
        debug paxos: 20
    log-whitelist:
    - slow request
    sha1: 8befbca77aa50a1188969892aabedaf11d8f8ce7
  s3tests:
    branch: master
  workunit:
    sha1: 8befbca77aa50a1188969892aabedaf11d8f8ce7
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
  - mds.b-s-a
tasks:
- chef: null
- install: null
- ceph: null
- mds_thrash: null
- ceph-fuse: null
- workunit:
    clients:
      all:
      - suites/fsstress.sh

yay, thrasher!

Associated revisions

Revision 0e009b1b (diff)
Added by Sam Lang about 11 years ago

mds: Clear backtrace updates on standby_trim_seg

If the mds is standby, when a segment is trimmed, we need
to clear the backtrace updates list to avoid the following
assertion when the segment is deleted.

./include/elist.h: 92: FAILED assert(_head.empty())
ceph version 0.59-478-g8befbca (8befbca77aa50a1188969892aabedaf11d8f8ce7)
(MDLog::standby_trim_segments()+0xce5) [0x6ccec5]
(MDS::C_MDS_StandbyReplayRestartFinish::finish(int)+0x39) [0x4e86b9]
(Journaler::_finish_reprobe(int, unsigned long, Context*)+0x190)
[0x6d3210]
(Filer::_probed(Filer::Probe*, object_t const&, unsigned long,
utime_t)+0x558) [0x704a88]
(Objecter::C_Stat::finish(int)+0xc0) [0x705900]
(Objecter::handle_osd_op_reply(MOSDOpReply*)+0xe38) [0x6f1df8]
(MDS::handle_core_message(Message*)+0xae8) [0x4dc318]
(MDS::_dispatch(Message*)+0x2f) [0x4dc4df]
(MDS::ms_dispatch(Message*)+0x1db) [0x4ddf7b]
(DispatchQueue::entry()+0x341) [0x81f561]
(DispatchQueue::DispatchThread::entry()+0xd) [0x79c6ad]
(()+0x7e9a) [0x7f346bb9ee9a]
(clone()+0x6d) [0x7f346a3574bd]

Fixes #4539.
Signed-off-by: Sam Lang <>

History

#1 Updated by Sage Weil about 11 years ago

also ubuntu@teuthology:/a/sage-2013-03-24_08:29:36-fs-master-testing-basic/2414

#2 Updated by Sage Weil about 11 years ago

ubuntu@teuthology:/a/sage-2013-03-24_08:29:36-fs-master-testing-basic/2410

#3 Updated by Sage Weil about 11 years ago

  • Assignee set to Sam Lang

I think this is as simple as

diff --git a/src/mds/MDLog.cc b/src/mds/MDLog.cc
index 7502e68..5389743 100644
--- a/src/mds/MDLog.cc
+++ b/src/mds/MDLog.cc
@@ -622,6 +622,7 @@ void MDLog::standby_trim_segments()
     seg->dirty_dirfrag_dir.clear_list();
     seg->dirty_dirfrag_nest.clear_list();
     seg->dirty_dirfrag_dirfragtree.clear_list();
+    seg->update_backtraces.clear_list();
     remove_oldest_segment();
     removed_segment = true;
   }

unless there is a state bit that goes with membership on that list, in which case that needs to be cleared as well. After this function returns, the cache state should be the same as if we had started replay one segment later in the journal.

#4 Updated by Sam Lang about 11 years ago

  • Status changed from 12 to Fix Under Review

Yep. There's no state bit, and the cache is unchanged by the backtrace updates list. The standby mds is free to clear this list.
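For readers unfamiliar with why a non-empty list trips the assert, here is a minimal sketch of the mechanism. It is a simplified model, not Ceph's actual elist<T>: the names Link, IntrusiveList, Backtrace, Segment, and trim_segment_demo are hypothetical, and only the ideas taken from this ticket (a destructor that asserts the list head is empty, and a clear_list() call before the segment is destroyed) are modelled.

```cpp
#include <cassert>

// Hypothetical, simplified model of an intrusive list link.
// An unlinked node points at itself.
struct Link {
  Link* prev;
  Link* next;
  Link() : prev(this), next(this) {}
  bool empty() const { return next == this; }
  // Unlink this node from whatever list it is on.
  void remove() {
    prev->next = next;
    next->prev = prev;
    prev = next = this;
  }
  // Link this node immediately before pos.
  void insert_before(Link* pos) {
    remove();
    prev = pos->prev;
    next = pos;
    pos->prev->next = this;
    pos->prev = this;
  }
};

// Hypothetical stand-in for the list type: the list does not own its
// members, so destroying it while members are still linked would leave
// them pointing at freed memory. The destructor asserts against that.
class IntrusiveList {
  Link head_;
public:
  ~IntrusiveList() { assert(head_.empty()); }
  bool empty() const { return head_.empty(); }
  void push_back(Link* item) { item->insert_before(&head_); }
  // Unlink every member without destroying it.
  void clear_list() {
    while (!head_.empty()) head_.next->remove();
  }
};

// Hypothetical stand-ins for the objects named in the ticket.
struct Backtrace { Link item; };
struct Segment { IntrusiveList update_backtraces; };

// Models trimming a segment: the list is cleared before the segment
// goes out of scope, so the destructor assert holds.
bool trim_segment_demo() {
  Backtrace a, b;
  {
    Segment seg;
    seg.update_backtraces.push_back(&a.item);
    seg.update_backtraces.push_back(&b.item);
    seg.update_backtraces.clear_list();  // omit this and ~IntrusiveList asserts
  }
  // Both members survive the segment and are no longer linked.
  return a.item.empty() && b.item.empty();
}
```

In this model, dropping the clear_list() call reproduces the same class of failure as the trace in the description: the segment is destroyed while its list still has members.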

I pushed wip-4539 and submitted a pull request.

Sorry - I missed this bug previously, I think because the subject didn't start with 'mds'.

#5 Updated by Sage Weil about 11 years ago

  • Status changed from Fix Under Review to Resolved

commit:295c92c

#6 Updated by Greg Farnum over 7 years ago

  • Component(FS) MDS added
