Bug #16924

Crash replaying EExport

Added by John Spray about 3 years ago. Updated 6 months ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
Start date: 08/04/2016
Due date:
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): multimds
Pull request ID:

Description

   -34> 2016-08-04 17:25:40.600671 7f4c8b13c700 10 mds.0.log _replay_thread start
   -33> 2016-08-04 17:25:40.600750 7f4c8b13c700  1 -- 192.168.1.13:6813/6779 --> 192.168.1.13:6804/6255 -- osd_op(mds.4118.0:8637 2.1ba6bc1f 200.00000004 [read 1315022~2924 [fadvise_dontneed]] snapc 0=[] ack+read+known_if_redirected+full_force e11) v7 -- ?+0 0xc098840 con 0xba99b00
   -32> 2016-08-04 17:25:40.600838 7f4c8b33e700 10 _calc_signature seq 8635 front_crc_ = 346259589 middle_crc = 0 data_crc = 0 sig = 9525467307877532421
   -31> 2016-08-04 17:25:40.600845 7f4c8b33e700 20 Putting signature in client message(seq # 8635): sig = 9525467307877532421
   -30> 2016-08-04 17:25:40.600903 7f4c8b13c700  1 reusing last freed id 68
   -29> 2016-08-04 17:25:40.601893 7f4c8b23d700 10 _calc_signature seq 8635 front_crc_ = 2485487617 middle_crc = 0 data_crc = 504523221 sig = 7175534316790889072
   -28> 2016-08-04 17:25:40.601914 7f4c8b23d700  1 -- 192.168.1.13:6813/6779 <== osd.1 192.168.1.13:6804/6255 8635 ==== osd_op_reply(8637 200.00000004 [read 1315022~2924 [fadvise_dontneed]] v0'0 uv136 ondisk = 0) v7 ==== 132+0+2924 (2485487617 0 504523221) 0xc4118c0 con 0xba99b00
   -27> 2016-08-04 17:25:40.602105 7f4c8b13c700 10 mds.0.log _replay 18092238~1256 / 18095162 2016-08-04 17:25:39.576059: EExport 10000000000 [metablob 1, 2 dirs]
   -26> 2016-08-04 17:25:40.602120 7f4c8b13c700 10 mds.0.journal EExport.replay 10000000000
   -25> 2016-08-04 17:25:40.602122 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay 2 dirlumps by unknown.0
   -24> 2016-08-04 17:25:40.602127 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay dir 1
   -23> 2016-08-04 17:25:40.602133 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay updated dir [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000]
   -22> 2016-08-04 17:25:40.602179 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay for [2,head] had [dentry #1/etc [2,head] auth (dversion lock) v=443 inode=0xbba2ba0 | inodepin=1 dirty=1 0xc392000]
   -21> 2016-08-04 17:25:40.602197 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay for [2,head] had [inode 10000000000 [...2,head] /etc/ auth v443 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) (inest sync dirty) (iversion lock) | dirtyscattered=1 dirfrag=1 dirtyparent=0 dirty=1 0xbba2ba0]
   -20> 2016-08-04 17:25:40.602227 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay dir 10000000000
   -19> 2016-08-04 17:25:40.602233 7f4c8b13c700 10 mds.0.journal EMetaBlob.replay updated dir [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 dirty=1 0xbbb6610]
   -18> 2016-08-04 17:25:40.602252 7f4c8b13c700  7 mds.0.cache adjust_bounded_subtree_auth -1,-2 -> -2,-2 on [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 dirty=1 0xbbb6610] bounds 
   -17> 2016-08-04 17:25:40.602265 7f4c8b13c700 15 mds.0.cache show_subtrees
   -16> 2016-08-04 17:25:40.602277 7f4c8b13c700 10 mds.0.cache |__ 0    auth [dir 100 ~mds0/ [2,head] auth v=1 cv=0/0 dir_auth=0 state=1073741824 f(v0 10=0+10) n(v0 10=0+10) hs=0+0,ss=0+0 | subtree=1 0xbbb6308]
   -15> 2016-08-04 17:25:40.602287 7f4c8b13c700 10 mds.0.cache |__ 0    auth [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000]
   -14> 2016-08-04 17:25:40.602300 7f4c8b13c700  7 mds.0.cache  current root is [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000]
   -13> 2016-08-04 17:25:40.602310 7f4c8b13c700 10 mds.0.cache   new subtree at [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 dirty=1 0xbbb6610]
   -12> 2016-08-04 17:25:40.602336 7f4c8b13c700 10 mds.0.cache.dir(10000000000) setting dir_auth=-2,-2 from -1,-2 on [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610]
   -11> 2016-08-04 17:25:40.602362 7f4c8b13c700 10 mds.0.cache.dir(10000000000)  new subtree root, adjusting auth_pins
   -10> 2016-08-04 17:25:40.602382 7f4c8b13c700 15 mds.0.cache show_subtrees
    -9> 2016-08-04 17:25:40.602402 7f4c8b13c700 10 mds.0.cache |____ 0    auth [dir 100 ~mds0/ [2,head] auth v=1 cv=0/0 dir_auth=0 state=1073741824 f(v0 10=0+10) n(v0 10=0+10) hs=0+0,ss=0+0 | subtree=1 0xbbb6308]
    -8> 2016-08-04 17:25:40.602424 7f4c8b13c700 10 mds.0.cache |_.__ 0    auth [dir 1 / [2,head] auth v=444 cv=0/0 dir_auth=0 state=1610612736 f(v0 m2016-08-04 17:24:31.430676 1=0+1) n(v6 rc2016-08-04 17:25:33.930024 b25257853 2556=2219+337) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0xbbb6000]
    -7> 2016-08-04 17:25:40.602458 7f4c8b13c700 10 mds.0.cache   |__-2    auth [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 dir_auth=-2 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610]
    -6> 2016-08-04 17:25:40.602487 7f4c8b13c700 10 mds.0.cache try_trim_nonauth_subtree [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 dir_auth=-2 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610]
    -5> 2016-08-04 17:25:40.602505 7f4c8b13c700 10 mds.0.cache trim_non_auth_subtree(0xbbb6610) [dir 10000000000 /etc/ [2,head] auth v=4853 cv=0/0 dir_auth=-2 state=1610612736 f(v0 m2016-08-04 17:25:33.930024 162=34+128) n(v8 rc2016-08-04 17:25:33.930024 b25257853 2555=2219+336) hs=122+0,ss=0+0 dirty=36 | child=1 subtree=1 dirty=1 0xbbb6610]
    -4> 2016-08-04 17:25:40.602529 7f4c8b13c700 10 mds.0.cache trim_non_auth_subtree(0xbbb6610) Checking dentry 0xc71f690
    -3> 2016-08-04 17:25:40.602543 7f4c8b13c700 20 mds.0.cache trim_non_auth_subtree(0xbbb6610) removing inode 0xca3fa70 with dentry0xc71f690
    -2> 2016-08-04 17:25:40.602555 7f4c8b13c700 12 mds.0.cache.dir(10000000000) unlink_inode [dentry #1/etc/pcmcia [2,head] auth (dversion lock) v=1532 inode=0xca3fa70 0xc71f690] [inode 100000004c2 [...2,head] /etc/pcmcia/ auth v1532 f(v0 m2016-08-04 17:24:52.963509 1=1+0) n(v0 rc2016-08-04 17:24:52.963509 b1117 2=1+1) (iversion lock) 0xca3fa70]
    -1> 2016-08-04 17:25:40.602588 7f4c8b13c700 14 mds.0.cache remove_inode [inode 100000004c2 [...2,head] #100000004c2/ auth v1532 f(v0 m2016-08-04 17:24:52.963509 1=1+0) n(v0 rc2016-08-04 17:24:52.963509 b1117 2=1+1) (iversion lock) 0xca3fa70]
     0> 2016-08-04 17:25:40.778768 7f4c8b13c700 -1 /home/jspray/git/ceph/src/mds/MDCache.cc: In function 'bool MDCache::trim_non_auth_subtree(CDir*)' thread 7f4c8b13c700 time 2016-08-04 17:25:40.602607
/home/jspray/git/ceph/src/mds/MDCache.cc: 6869: FAILED assert(!dir->has_bloom())

 ceph version v11.0.0-1132-gfcdb1b1 (fcdb1b1bc1a20b2c9a7637af68fa55db5233994e)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x194d292]
 2: (MDCache::trim_non_auth_subtree(CDir*)+0x80b) [0x157b00b]
 3: (MDCache::try_trim_non_auth_subtree(CDir*)+0x3e8) [0x157b9da]
 4: (EExport::replay(MDSRank*)+0x365) [0x18558a9]
 5: (MDLog::_replay_thread()+0x17e0) [0x1802524]
 6: (MDLog::ReplayThread::entry()+0x1c) [0x1429a4e]
 7: (Thread::entry_wrapper()+0xc1) [0x1aee961]
 8: (Thread::_entry_func(void*)+0x18) [0x1aee896]
 9: (()+0x7555) [0x7f4c986af555]
 10: (clone()+0x6d) [0x7f4c97538ded]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Seeing this in a 2xactive + 2xstandby-replay configuration, while copying in a deep tree (my trusty ol' /etc dir) with the MDS cache size set to 1000.

Seems to reproduce easily.

History

#1 Updated by John Spray about 3 years ago

  • Description updated (diff)

#2 Updated by Greg Farnum about 3 years ago

git blame points at commit b7e698a52bf7838f8e37842074c510a6561f165b from Zheng:

mds: no bloom filter for replica dir
We should delete dir fragment's bloom filter after exporting the dir
fragment to other MDS. Otherwise the residual bloom filter may cause

We'll need to add bloom filter removal to the journal/replay code; this patch didn't touch it. Wouldn't surprise me if we've got a fair number of similar issues from over the years... :/

#3 Updated by John Spray about 3 years ago

  • Assignee set to John Spray

#4 Updated by John Spray almost 3 years ago

  • Priority changed from Normal to High
  • Target version set to v12.0.0

#5 Updated by John Spray almost 3 years ago

  • Status changed from New to Need Review

#6 Updated by John Spray almost 3 years ago

  • Status changed from Need Review to Resolved

Not backporting because it's multi-MDS.

#7 Updated by Patrick Donnelly 6 months ago

  • Category deleted (90)
  • Labels (FS) multimds added
