Bug #9540
Crash during FS upgrade: assert(o->get_num_ref() == 0)
Status:
Rejected
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
other
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
-10> 2014-09-18 07:05:14.008418 7f71ab92d700 10 mds.0.locker mark_updated_scatterlock (ifile sync dirty) - already on list since 2014-09-18 07:05:11.804950
-9> 2014-09-18 07:05:14.008420 7f71ab92d700 10 mds.0.journal EMetaBlob.replay updated dir [dir 600 ~mds0/stray0/ [2,head] auth v=7891 cv=0/0 state=1610612736 f(v1 m2014-09-18 06:58:11.104563 136=2+134)/f(v1 m2014-09-18 06:58:11.104563 139=5+134) n(v2 rc2014-09-18 06:58:11.104563 b8388608 136=2+134)/n(v2 rc2014-09-18 06:58:11.104563 b10194500 139=5+134) hs=115+304,ss=0+0 dirty=419 | child=1 dirty=1 0x6399d90]
-8> 2014-09-18 07:05:14.008431 7f71ab92d700 10 mds.0.journal EMetaBlob.replay unlinking [dentry #100/stray0/10000000f8b [2,head] auth (dversion lock) v=7882 inode=0x76ea338 | inodepin=1 dirty=1 0x7b042e0]
-7> 2014-09-18 07:05:14.008434 7f71ab92d700 12 mds.0.cache.dir(600) unlink_inode [dentry #100/stray0/10000000f8b [2,head] auth (dversion lock) v=7882 inode=0x76ea338 | inodepin=1 dirty=1 0x7b042e0] [inode 10000000f8b [2,head] ~mds0/stray0/10000000f8b auth v7882 dirtyparent s=1805892 nl=0 n(v0 b1805892 1=1+0) (iversion lock) | truncating=1 dirtyparent=1 dirty=1 0x76ea338]
-6> 2014-09-18 07:05:14.008442 7f71ab92d700 10 mds.0.journal EMetaBlob.replay had [dentry #100/stray0/10000000f8b [2,head] auth NULL (dversion lock) v=7890 inode=0 | inodepin=0 dirty=1 0x7b042e0]
-5> 2014-09-18 07:05:14.008445 7f71ab92d700 10 mds.0.journal unlinked set contains {0x76ea338=0x6399d90}
-4> 2014-09-18 07:05:14.008446 7f71ab92d700 10 mds.0.cache remove_inode_recursive [inode 10000000f8b [2,head] #10000000f8b auth v7882 dirtyparent s=1805892 nl=0 n(v0 b1805892 1=1+0) (iversion lock) | truncating=1 dirtyparent=1 dirty=1 0x76ea338]
-3> 2014-09-18 07:05:14.008450 7f71ab92d700 14 mds.0.cache remove_inode [inode 10000000f8b [2,head] #10000000f8b auth v7882 dirtyparent s=1805892 nl=0 n(v0 b1805892 1=1+0) (iversion lock) | truncating=1 dirtyparent=1 dirty=1 0x76ea338]
-2> 2014-09-18 07:05:14.008454 7f71ab92d700 10 mds.0.cache.ino(10000000f8b) mark_clean [inode 10000000f8b [2,head] #10000000f8b auth v7882 dirtyparent s=1805892 nl=0 n(v0 b1805892 1=1+0) (iversion lock) | truncating=1 dirtyparent=1 dirty=1 0x76ea338]
-1> 2014-09-18 07:05:14.008458 7f71ab92d700 10 mds.0.cache.ino(10000000f8b) clear_dirty_parent
0> 2014-09-18 07:05:14.009733 7f71ab92d700 -1 mds/MDCache.cc: In function 'void MDCache::remove_inode(CInode*)' thread 7f71ab92d700 time 2014-09-18 07:05:14.008467
mds/MDCache.cc: 310: FAILED assert(o->get_num_ref() == 0)
ceph version 0.85-723-g83bd343 (83bd3430e3a17b77265e696095904b7a9032d2ee)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x7f) [0x90eaff]
2: (MDCache::remove_inode(CInode*)+0x782) [0x630152]
3: (MDCache::remove_inode_recursive(CInode*)+0x288) [0x63d5c8]
4: (EMetaBlob::replay(MDS*, LogSegment*, MDSlaveUpdate*)+0x43cd) [0x815b7d]
5: (EUpdate::replay(MDS*)+0x3a) [0x81e79a]
6: (MDLog::_replay_thread()+0x698) [0x7a3168]
7: (MDLog::ReplayThread::entry()+0xd) [0x5a099d]
8: (()+0x7e9a) [0x7f71b5d22e9a]
9: (clone()+0x6d) [0x7f71b48d73fd]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
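A rough way to read the inode dumps in the log above, offered as a sketch rather than a diagnosis: it assumes the `name=count` tokens after the `|` separator are the object's pin (reference) counts, and that `mark_clean` and `clear_dirty_parent` drop the `dirty` and `dirtyparent` pins respectively. Neither assumption is stated in this report. The helper `parse_pins` is hypothetical and not part of Ceph.

```python
import re

# Hypothetical helper, not a Ceph API: extract the `name=count` tokens that
# follow the `|` separator in an MDS cache-object dump line.
# Treating these tokens as pin (reference) counts is an assumption.
PIN_TOKEN = re.compile(r"(\w+)=(\d+)")


def parse_pins(dump):
    """Return {pin_name: count} for the section after the last '|'."""
    _, sep, tail = dump.rpartition("|")
    if not sep:
        return {}
    return {name: int(count) for name, count in PIN_TOKEN.findall(tail)}


# Inode dump copied from log entry -2> (the last entry that prints the inode).
line = (
    "[inode 10000000f8b [2,head] #10000000f8b auth v7882 dirtyparent "
    "s=1805892 nl=0 n(v0 b1805892 1=1+0) (iversion lock) | "
    "truncating=1 dirtyparent=1 dirty=1 0x76ea338]"
)

pins = parse_pins(line)

# Assumption: these two pins are released by the calls logged at -2> and -1>.
assumed_released = {"dirty", "dirtyparent"}
remaining = {name: n for name, n in pins.items() if name not in assumed_released}

print("pins in dump:", pins)
print("pins left under the stated assumptions:", remaining)
```

If those assumptions hold, a `truncating` pin would still be set when `remove_inode` checks `get_num_ref() == 0`, which would be consistent with the failed assert. This has not been verified against the MDS source.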
YAML that hit this (once, haven't tried again yet):
interactive-on-error: true
overrides:
  ceph:
    conf:
      mds:
        debug mds: 20
        mds verify scatter: false
      client:
        debug client: 20
      mon:
        mon warn on legacy crush tunables: false
    log-whitelist:
    - scrub
    fs: xfs
roles:
- - mon.a
  - mds.a
  - osd.0
  - osd.1
  - osd.2
  - client.0
- - mon.b
  - mon.c
  - osd.3
  - osd.4
  - osd.5
  - mds.a-s
  - client.1
# Client 0 will remain mounted continuously
# Client 1 will be remounted after each upgrade.
# Both will experience the same workloads
tasks:
- install:
    branch: emperor
- print: "**** done emperor install"
- ceph:
    fs: xfs
- print: "**** done ceph cluster setup"
- ceph-fuse:
- workunit:
    clients:
      all:
      - suites/fsstress.sh
      #- fs/misc/trivial_sync.sh
- print: "**** done workunit on emperor"
- install.upgrade:
    all:
      branch: firefly
- ceph-fuse:
    client.1:
      mounted: false
- ceph.restart:
- ceph-fuse:
    client.1:
      mounted: true
- workunit:
    clients:
      all:
      - suites/fsstress.sh
      #- fs/misc/trivial_sync.sh
- print: "**** done workunit on firefly"
- install.upgrade:
    all:
      sha1: 83bd3430e3a17b77265e696095904b7a9032d2ee
- ceph-fuse:
    client.1:
      mounted: false
- ceph.restart:
- ceph-fuse:
    client.1:
      mounted: true
- workunit:
    clients:
      all:
      - suites/fsstress.sh
      #- fs/misc/trivial_sync.sh
- print: "**** done workunit on latest"
- interactive: