Actions
Bug #16924
closedCrash replaying EExport
% Done:
0%
Source:
other
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
-34> 2016-08-04 17:25:40.600671 7f4c8b13c700 10 mds.0.log _replay_thread start -33> 2016-08-04 17:25:40.600750 7f4c8b13c700 1 -- 192.168.1.13:6813/6779 --> 192.168.1.13:6804/6255 -- osd_op(mds.4118.0:8637 2.1ba6bc1f 200.00000004 [read 1315022~2924 [fadvise_dontneed]] snapc 0=[] ack+read+known_if_redirected+full_force e11) v7 -- ?+0 0xc098840 con 0xba99b00 -32> 2016-08-04 17:25:40.600838 7f4c8b33e700 10 _calc_signature seq 8635 front_crc_ = 346259589 middle_crc = 0 data_crc = 0 sig = 9525467307877532421 -31> 2016-08-04 17:25:40.600845 7f4c8b33e700 20 Putting signature in client message(seq # 8635): sig = 9525467307877532421 -30> 2016-08-04 17:25:40.600903 7f4c8b13c700 1 reusing last freed id 68 -29> 2016-08-04 17:25:40.601893 7f4c8b23d700 10 _calc_signature seq 8635 front_crc_ = 2485487617 middle_crc = 0 data_crc = 504523221 sig = 7175534316790889072 -28> 2016-08-04 17:25:40.601914 7f4c8b23d700 1 -- 192.168.1.13:6813/6779 <== osd.1 192.168.1.13:6804/6255 8635 ==== osd_op_reply(8637 200.00000004 [read 1315022~2924 [fadvise_dontneed]] v0'0 uv136 ondisk = 0) v7 ==== 132+0+2924 (2485487617 0 504523221) 0xc4118c0 con 0xba99b00 -27> 2016-08-04 17:25:40.602105 7f4c8b13c700 10 mds.0.log _replay 18092238~1256 / 18095162 2016-08-04 17:25:39.576059: EExport 10000000000 [metablob 1, 2 dirs] -26> 2016-08-04 17:25:40.602120 7f4c8b13c700 10 mds.0.journal EExport.replay 10000000000 -25> 2016-08-04 17:25:40.602122 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay 2 dirlumps by unknown.0 -24> 2016-08-04 17:25:40.602127 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay dir 1 -23> 2016-08-04 17:25:40.602133 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay updated dir [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000] -22> 2016-08-04 17:25:40.602179 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay for [2,head] had [dentry #1/etc [2,head] auth (dversion lock) v=443 inode=0xbba2ba0 | inodepin=1 dirty=1 0xc392000] -21> 2016-08-04 17:25:40.602197 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay for [2,head] had [inode 10000000000 [...2,head] /etc/ auth v443 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) (inest sync dirty) (iversion lock) | dirtyscattered=1 dirfrag=1 dirtyparent=0 dirty=1 0xbba2ba0] -20> 2016-08-04 17:25:40.602227 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay dir 10000000000 -19> 2016-08-04 17:25:40.602233 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay updated dir [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 dirty=1 0xbbb6610] -18> 2016-08-04 17:25:40.602252 7f4c8b13c700 7 mds.0.cache adjust_bounded_subtree_auth -1,-2 -> -2,-2 on [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 dirty=1 0xbbb6610] bounds -17> 2016-08-04 17:25:40.602265 7f4c8b13c700 15 mds.0.cache show_subtrees -16> 2016-08-04 17:25:40.602277 7f4c8b13c700 10 mds.0.cache |__ 0 auth [dir 100 ~mds0/ [2,head] auth v=1 cv=0/0 dir_auth=0 state=1073741824 f(v0 10=0+10) n(v0 10=0+10) hs=0+0,ss=0+0 | subtree=1 0xbbb6308] -15> 2016-08-04 17:25:40.602287 7f4c8b13c700 10 mds.0.cache |__ 0 auth [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000] -14> 2016-08-04 17:25:40.602300 7f4c8b13c700 7 mds.0.cache current root is [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000] -13> 2016-08-04 17:25:40.602310 7f4c8b13c700 10 mds.0.cache new subtree at [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 dirty=1 0xbbb6610] -12> 2016-08-04 17:25:40.602336 7f4c8b13c700 10 mds.0.cache.dir(10000000000) setting dir_auth=-2,-2 from -1,-2 on [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610] -11> 2016-08-04 17:25:40.602362 7f4c8b13c700 10 mds.0.cache.dir(10000000000) new subtree root, adjusting auth_pins -10> 2016-08-04 17:25:40.602382 7f4c8b13c700 15 mds.0.cache show_subtrees -9> 2016-08-04 17:25:40.602402 7f4c8b13c700 10 mds.0.cache |____ 0 auth [dir 100 ~mds0/ [2,head] auth v=1 cv=0/0 dir_auth=0 state=1073741824 f(v0 10=0+10) n(v0 10=0+10) hs=0+0,ss=0+0 | subtree=1 0xbbb6308] -8> 2016-08-04 17:25:40.602424 7f4c8b13c700 10 mds.0.cache |_.__ 0 auth [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000] -7> 2016-08-04 17:25:40.602458 7f4c8b13c700 10 mds.0.cache |__-2 auth [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 dir_auth=-2 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610] -6> 2016-08-04 17:25:40.602487 7f4c8b13c700 10 mds.0.cache try_trim_nonauth_subtree [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 dir_auth=-2 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610] -5> 2016-08-04 17:25:40.602505 7f4c8b13c700 10 mds.0.cache trim_non_auth_subtree(0xbbb6610) [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 dir_auth=-2 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610] -4> 2016-08-04 17:25:40.602529 7f4c8b13c700 10 mds.0.cache trim_non_auth_subtree(0xbbb6610) Checking dentry 0xc71f690 -3> 2016-08-04 17:25:40.602543 7f4c8b13c700 20 mds.0.cache trim_non_auth_subtree(0xbbb6610) removing inode 0xca3fa70 with dentry0xc71f690 -2> 2016-08-04 17:25:40.602555 7f4c8b13c700 12 mds.0.cache.dir(10000000000) unlink_inode [dentry #1/etc/pcmcia [2,head] auth (dversion lock) v=1532 inode=0xca3fa70 0xc71f690] [inode 100000004c2 [...2,head] /etc/pcmcia/ auth v1532 f(v0 m2016-08-04 17:24:52.963509 1=1+0) n(v0 rc2016-08-04 17:24:52.963509 b1117 2=1+1) (iversion lock) 0xca3fa70] -1> 2016-08-04 17:25:40.602588 7f4c8b13c700 14 mds.0.cache remove_inode [inode 100000004c2 [...2,head] #100000004c2/ auth v1532 f(v0 m2016-08-04 17:24:52.963509 1=1+0) n(v0 rc2016-08-04 17:24:52.963509 b1117 2=1+1) (iversion lock) 0xca3fa70] 0> 2016-08-04 17:25:40.778768 7f4c8b13c700 -1 /home/jspray/git/ceph/src/mds/MDCache.cc: In function 'bool MDCache::trim_non_auth_subtree(CDir*)' thread 7f4c8b13c700 time 2016-08-04 17:25:40.602607 /home/jspray/git/ceph/src/mds/MDCache.cc: 6869: FAILED assert(!dir->has_bloom()) ceph version v11.0.0-1132-gfcdb1b1 (fcdb1b1bc1a20b2c9a7637af68fa55db5233994e) 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x194d292] 2: (MDCache::trim_non_auth_subtree(CDir*)+0x80b) [0x157b00b] 3: (MDCache::try_trim_non_auth_subtree(CDir*)+0x3e8) [0x157b9da] 4: (EExport::replay(MDSRank*)+0x365) [0x18558a9] 5: (MDLog::_replay_thread()+0x17e0) [0x1802524] 6: (MDLog::ReplayThread::entry()+0x1c) [0x1429a4e] 7: (Thread::entry_wrapper()+0xc1) [0x1aee961] 8: (Thread::_entry_func(void*)+0x18) [0x1aee896] 9: (()+0x7555) [0x7f4c986af555] 10: (clone()+0x6d) [0x7f4c97538ded] NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Seeing this in a 2xactive + 2xstandby-replay configuration, while copying in a deep tree (my trusty ol' /etc dir) with the MDS cache size set to 1000.
Seems to reproduce easily.
Actions