Bug #10368: Assertion in _trim_expired_segments
Status: Closed

Description
-20> 2014-12-18 05:15:49.295240 7f642f89e700 10 mds.0.objecter _op_submit oid 200.00000002 @3 @3 [write 1813550~1736] tid 0 osd.0
-19> 2014-12-18 05:15:49.295243 7f64321a4700 5 mds.0.objecter 0 unacked, 4 uncommitted
-18> 2014-12-18 05:15:49.295245 7f64321a4700 6 mds.0.journal LogSegment(5026/10197802).try_to_expire waiting
-17> 2014-12-18 05:15:49.295246 7f64321a4700 5 mds.0.log try_expire expiring segment 5026/10197802
-16> 2014-12-18 05:15:49.295246 7f642f89e700 20 mds.0.objecter get_session s=0x2e9f000 osd=0 4
-15> 2014-12-18 05:15:49.295248 7f642f89e700 15 mds.0.objecter _session_op_assign 0 1577
-14> 2014-12-18 05:15:49.295249 7f64321a4700 6 mds.0.journal LogSegment(5028/10198820).try_to_expire
-13> 2014-12-18 05:15:49.295249 7f642f89e700 15 mds.0.objecter _send_op 1577 to osd.0
-12> 2014-12-18 05:15:49.295250 7f64321a4700 10 mds.0.journal try_to_expire saving sessionmap, need 1009, committed is 1007 (1009)
-11> 2014-12-18 05:15:49.295252 7f64321a4700 10 mds.0.sessionmap save needv 1009, v 1009
-10> 2014-12-18 05:15:49.295252 7f642f89e700 20 mds.0.objecter put_session s=0x2e9f000 osd=0 5
-9> 2014-12-18 05:15:49.295253 7f64321a4700 6 mds.0.journal LogSegment(5028/10198820).try_to_expire waiting
-8> 2014-12-18 05:15:49.295254 7f642f89e700 5 mds.0.objecter 0 unacked, 4 uncommitted
-7> 2014-12-18 05:15:49.295255 7f64321a4700 5 mds.0.log try_expire expiring segment 5028/10198820
-6> 2014-12-18 05:15:49.295257 7f64321a4700 6 mds.0.journal LogSegment(5030/10199909).try_to_expire
-5> 2014-12-18 05:15:49.295258 7f64321a4700 6 mds.0.journal LogSegment(5030/10199909).try_to_expire success
-4> 2014-12-18 05:15:49.295259 7f64321a4700 10 mds.0.log try_expire expired segment 5030/10199909
-3> 2014-12-18 05:15:49.295261 7f64321a4700 5 mds.0.log _expired segment 5030/10199909, 2 events
-2> 2014-12-18 05:15:49.295263 7f64321a4700 10 mds.0.log _trim_expired_segments trimming expired 4022/0x897ea8
-1> 2014-12-18 05:15:49.295266 7f64321a4700 10 mds.0.log _trim_expired_segments trimming expired 4024/0x898570
0> 2014-12-18 05:15:49.296896 7f64321a4700 -1 ./include/elist.h: In function 'elist<T>::~elist() [with T = CDentry*]' thread 7f64321a4700 time 2014-12-18 05:15:49.295269
./include/elist.h: 92: FAILED assert(_head.empty())

ceph version 0.89-755-g2d76bc2 (2d76bc2d989118f14f7b3a98ec11a3c87dc318da)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f) [0x9494df]
2: (MDLog::_trim_expired_segments()+0x63e) [0x7d50ee]
3: (MDLog::trim(int)+0x170) [0x7d7280]
4: (MDS::tick()+0x3dc) [0x5a2f6c]
5: (MDSInternalContextBase::complete(int)+0x1db) [0x7ca7ab]
6: (SafeTimer::timer_thread()+0x3e5) [0x93b025]
7: (SafeTimerThread::entry()+0xd) [0x93bbbd]
8: (()+0x7e9a) [0x7f643a375e9a]
9: (clone()+0x6d) [0x7f6438d3c31d]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
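The failed assertion comes from Ceph's intrusive list type: elist's destructor requires the list to be empty, so destroying a LogSegment that still has a CDentry linked into one of its elists aborts the MDS. A minimal Python model of that invariant (illustrative only, not the actual C++ code):

```python
class Elist:
    """Toy model of include/elist.h: an intrusive list whose
    destruction asserts that no members are still linked."""
    def __init__(self):
        self._members = []

    def push_back(self, item):
        self._members.append(item)

    def remove(self, item):
        self._members.remove(item)

    def empty(self):
        return not self._members

    def destroy(self):
        # mirrors elist::~elist(): assert(_head.empty())
        assert self.empty(), "FAILED assert(_head.empty())"

class LogSegment:
    def __init__(self, seq):
        self.seq = seq
        self.dirty_dentries = Elist()

seg = LogSegment(4024)
seg.dirty_dentries.push_back("CDentry")   # a dentry left linked

crashed = False
try:
    seg.dirty_dentries.destroy()   # trimming with a member still linked
except AssertionError:
    crashed = True                 # this is the abort in the log above

seg.dirty_dentries.remove("CDentry")
seg.dirty_dentries.destroy()       # an empty list destroys cleanly
```

So the question the rest of the thread chases is: how did a dentry stay linked into a segment's dirty list after the segment was deemed expired?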
Happened on mounting a client after the mds-auto-repair test executed; investigating...
Investigating
Files
Updated by John Spray over 9 years ago
Reproduce (took two tries to see the crash again, so it's intermittent)
tasks:
- install:
- ceph:
- ceph-fuse:
- mds-auto-repair:
- mds-auto-repair:
Updated by John Spray over 9 years ago
- File 10368.sh 10368.sh added
- File mds.a.log.short.gz mds.a.log.short.gz added
Had no joy reproducing with mds_auto_repair more recently (on different test nodes), but did manage on a vstart cluster against master with the attached .sh running in a loop, a client with debug output turned off, and the following settings:
mds cache size = 100
mds cache log max segments = 2
mds cache log events per segment = 1
mds cache verify backtrace = 1
The attached log is the last part of a much longer log.
Notes:
Segment that has the unwanted item_dirty is seq 170750:

2015-01-14 14:50:36.996372 7f71e5197700 1 mds.0.1 asok_command: flush journal (starting...)
2015-01-14 14:50:36.998726 7f71e5197700 7 mds.0.log _prepare_new_segment seq 170750

This is the msgr thread, starting a new segment just after the one that the asok thread created (cap release on 1000000548c, which is AKA /testdir2/file57; testdir2 is 10000005452):

2015-01-14 14:50:37.360868 7f71e3994700 10 mds.0.log submit_entry also starting new segment: last = 170750/222043046, event seq = 170961
2015-01-14 14:50:38.702326 7f71e5197700 5 mds.0.1 _command_flush_journal: beginning segment expiry
2015-01-14 14:50:38.702352 7f71e5197700 6 mds.0.journal LogSegment(170750/222043046).try_to_expire
2015-01-14 14:50:38.702360 7f71e5197700 6 mds.0.journal LogSegment(170750/222043046).try_to_expire success
2015-01-14 14:50:38.702363 7f71e5197700 10 mds.0.log try_expire expired segment 170750/222043046
2015-01-14 14:50:38.702367 7f71e5197700 5 mds.0.log _expired segment 170750/222043046, 212 events

The testdir2/file57 gets marked clean here:

2015-01-14 14:50:41.259444 7f71e118f700 15 mds.0.cache.dir(10000005452) dir 1277 >= inode 1208 now clean [inode 1000000548c [2,head] /testdir2/file57 auth v1208 s=0 n(v0 1=1+0) (iversion lock) | ptrwaiter=0 request=0 lock=0 caps=0 dirtyparent=0 dirty=1 authpin=0 0x6d29c20]
2015-01-14 14:50:41.259452 7f71e118f700 10 mds.0.cache.ino(1000000548c) mark_clean [inode 1000000548c [2,head] /testdir2/file57 auth v1208 s=0 n(v0 1=1+0) (iversion lock) | ptrwaiter=0 request=0 lock=0 caps=0 dirtyparent=0 dirty=1 authpin=0 0x6d29c20]
2015-01-14 14:50:41.259464 7f71e118f700 10 mds.0.cache.den(10000005452 file57) mark_clean [dentry #1/testdir2/file57 [2,head] auth (dversion lock) v=1208 inode=0x6d29c20 | request=0 lock=0 inodepin=0 dirty=1 authpin=0 0x78ee240]
2015-01-14 14:50:41.273594 7f71e5197700 5 mds.0.1 _command_flush_journal: expiry complete, expire_pos/trim_pos is now d01beba/d000000
2015-01-14 14:50:41.273622 7f71e5197700 10 mds.0.log _trim_expired_segments trimming expired 170750/0xd3c1ba6
So the weird thing is that as far as I can tell mark_clean is getting called for the only file that was operated on during the problematic segment's lifetime, and yet the item is still in the dirty_dentries list.
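To make the puzzle concrete: dirtying a dentry links it into its segment's dirty_dentries elist, and mark_clean is supposed to unlink it. A hedged Python sketch of that bookkeeping (names modeled loosely on the MDS; this is not the actual Ceph code):

```python
class Dentry:
    def __init__(self, name):
        self.name = name
        self.segment = None   # which segment's dirty list we are on

class Segment:
    def __init__(self, seq):
        self.seq = seq
        self.dirty_dentries = []   # stand-in for elist<CDentry*>

def mark_dirty(dn, seg):
    # first dirtying pins the dentry onto the current segment's list
    if dn.segment is None:
        seg.dirty_dentries.append(dn)
        dn.segment = seg

def mark_clean(dn):
    # must unlink the dentry from whatever segment it was dirtied in
    if dn.segment is not None:
        dn.segment.dirty_dentries.remove(dn)
        dn.segment = None

seg = Segment(170750)
dn = Dentry("file57")
mark_dirty(dn, seg)
mark_clean(dn)
# If mark_clean ran for every dentry dirtied during the segment's
# lifetime, the list must be empty by trim time; the log above shows
# mark_clean running for file57, yet the list was not empty.
assert not seg.dirty_dentries
```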
My wip-10368-repro branch has some extra debugging, but so far I haven't seen the bug reproduce while the extra debugging was live :-/
Updated by Zheng Yan over 9 years ago
One possible cause:

There is more than one context in MDLog's wait_for_safe list. One context is executed after MDS::_command_flush_journal()'s mdlog_flushed; that context marks a dentry dirty, but the corresponding log segment has already been expired by the flush journal command.
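A hypothetical model of that race (names invented for illustration, not Ceph code): a completion queued on the journal's wait_for_safe list runs only after the flush journal command has expired the segment, and then dirties a dentry into it.

```python
wait_for_safe = []   # stand-in for MDLog's queued journal-safe contexts
expired = set()      # seqs of segments already expired
violations = []      # dentries dirtied onto an already-expired segment

def mark_dirty_on(dentry, seg):
    if seg in expired:
        # the dentry is linked into a list nothing will ever clean up,
        # which is exactly what elist's destructor assert later catches
        violations.append((dentry, seg))

def command_flush_journal():
    # runs its own mdlog_flushed completion, then expires the segment
    expired.add(170750)

# a cap-release path queues a completion that will dirty file57
wait_for_safe.append(lambda: mark_dirty_on("/testdir2/file57", 170750))

command_flush_journal()      # segment 170750 is expired first...
for ctx in wait_for_safe:    # ...then the stale completion fires
    ctx()

assert violations == [("/testdir2/file57", 170750)]
```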
Updated by Greg Farnum about 9 years ago
We saw this on our internal lab install as well.
[ubuntu@magna125 ~]$ zless /var/log/ceph/ceph-mds.magna125.log-20150204.gz
Updated by John Spray about 9 years ago
- Status changed from In Progress to Fix Under Review
Updated by Greg Farnum about 9 years ago
- Status changed from Fix Under Review to Pending Backport
This certainly needs to go into hammer as well, but I'm not quite brave enough to do an insta-cherry-pick. :)
Updated by John Spray about 9 years ago
Agree about backport, but we can be leisurely about it -- nobody is likely to hit this bug unless they're hitting the journal flush asok command frequently for some reason.
Updated by John Spray about 9 years ago
- Status changed from Pending Backport to In Progress
By sheer coincidence, I've hit this issue again, while no journal flushes were running at all, and much more reproducibly. I think we may have a more fundamental issue with MDLog completions (that call mark_dirty on things) getting called after MDLog has decided to expire a segment. The decision to expire a segment is made by directly checking the safe_pos of the Journaler, whereas the completions have to travel through a finisher. We might need to send safe_pos updates via the same finisher to ensure ordering.
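A sketch of that fix direction (all names invented; this only illustrates the ordering argument, not the actual patch): if safe_pos advances travel through the same finisher queue as the completions, then any expiry check that reads safe_pos can never run ahead of a completion that still needs to mark something dirty.

```python
from collections import deque

finisher = deque()                       # one ordered completion queue
state = {"safe_pos": 0, "dirty": set()}

def on_journal_safe(pos, completion):
    # the backend acked writes up to `pos`: queue the completion FIRST,
    # then the safe_pos bump, through the single ordered queue
    finisher.append(completion)
    finisher.append(lambda: state.update(safe_pos=pos))

def run_finisher():
    while finisher:
        finisher.popleft()()

on_journal_safe(100, lambda: state["dirty"].add("file57"))
run_finisher()

# whenever the trimmer observes safe_pos == 100, the completion that
# dirtied file57 has already executed, so expiry can account for it
assert state["safe_pos"] == 100 and "file57" in state["dirty"]
```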
Updated by Greg Farnum about 9 years ago
- Status changed from In Progress to Resolved
Finally merged to master in commit b25e9e130bffed82ae5dda08449ea440fe6c485c.