Bug #55170
mds: crash during rejoin (CDir::fetch_keys)
Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
    -5> 2022-04-03T18:21:56.273+0000 7f339054b700 10 mds.1.cache rejoin_gather_finish
    -4> 2022-04-03T18:21:56.273+0000 7f339054b700 10 mds.1.cache open_undef_inodes_dirfrags 21 inodes 0 dirfrags
    -3> 2022-04-03T18:21:56.273+0000 7f339054b700 10 mds.1.cache.dir(0x613.111*) fetch_keys 0 keys on [dir 0x613.111* ~mds1/stray9/ [2,head] auth{0=6} v=21060 cv=0/0 state=1610612736 f(v0 m2022-04-03T18:08:26.278322+0000 22=0+22)/f(v0 m2022-04-03T18:08:26.278322+0000 64=42+22) n(v3 rc2022-04-03T18:08:26.278322+0000 22=0+22) hs=22+406,ss=0+0 dirty=425 | child=1 replicated=1 dirty=1 0x558167b94480]
    -2> 2022-04-03T18:21:56.273+0000 7f339054b700  7 mds.1.cache.dir(0x613.111*) fetch keys, all are already being fetched
    -1> 2022-04-03T18:21:56.275+0000 7f339054b700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-11442-gcfb8f943/rpm/el8/BUILD/ceph-17.0.0-11442-gcfb8f943/src/mds/CDir.cc: In function 'void CDir::fetch_keys(const std::vector<dentry_key_t>&, MDSContext*)' thread 7f339054b700 time 2022-04-03T18:21:56.275271+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-11442-gcfb8f943/rpm/el8/BUILD/ceph-17.0.0-11442-gcfb8f943/src/mds/CDir.cc: 1640: FAILED ceph_assert(!c)

 ceph version 17.0.0-11442-gcfb8f943 (cfb8f943163b374162da0d7b0240f267dd46e4e1) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x7f3398d6a144]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x284365) [0x7f3398d6a365]
 3: (CDir::fetch_keys(std::vector<dentry_key_t, std::allocator<dentry_key_t> > const&, MDSContext*)+0x43c) [0x55816165542c]
 4: (MDCache::open_undef_inodes_dirfrags()+0x6d6) [0x5581615339c6]
 5: (MDCache::rejoin_gather_finish()+0xa8) [0x558161540f78]
 6: (MDCache::handle_cache_rejoin_strong(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0x30f1) [0x55816154cbc1]
 7: (MDCache::handle_cache_rejoin(boost::intrusive_ptr<MMDSCacheRejoin const> const&)+0xdb) [0x558161550d0b]
 8: (MDCache::dispatch(boost::intrusive_ptr<Message const> const&)+0x354) [0x558161551214]
 9: (MDSRank::handle_message(boost::intrusive_ptr<Message const> const&)+0x942) [0x5581613d8502]
 10: (MDSRank::_dispatch(boost::intrusive_ptr<Message const> const&, bool)+0x7cb) [0x5581613db54b]
 11: (MDSRankDispatcher::ms_dispatch(boost::intrusive_ptr<Message const> const&)+0x5c) [0x5581613dbb6c]
 12: (MDSDaemon::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x108) [0x5581613caa68]
 13: (DispatchQueue::entry()+0x14fa) [0x7f3398ff1f7a]
 14: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f33990a9651]
 15: /lib64/libpthread.so.0(+0x814a) [0x7f3397d4314a]
 16: clone()

     0> 2022-04-03T18:21:56.278+0000 7f339054b700 -1 *** Caught signal (Aborted) **
 in thread 7f339054b700 thread_name:ms_dispatch
Test matrix:
Description: fs/thrash/workloads/{begin/{0-install 1-ceph 2-logrotate} clusters/1a5s-mds-1c-client conf/{client mds mon osd} distro/{rhel_8} mount/fuse msgr-failures/osd-mds-delay objectstore-ec/bluestore-comp-ec-root overrides/{frag prefetch_dirfrags/no prefetch_entire_dirfrags/no races session_timeout thrashosds-health whitelist_health whitelist_wrongly_marked_down} ranks/3 tasks/{1-thrash/osd 2-workunit/fs/snaps}}
The crash seems unrelated to the PRs being tested in the branch.
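For readers following the log, here is a toy model of the pattern it suggests: fetch_keys is called with 0 keys and a non-null completion context, takes the "all are already being fetched" path, and trips ceph_assert(!c). This is only a sketch under that reading of the log, not the Ceph code. ToyDir, Result, in_flight and replay_logged_call are hypothetical names, and the real CDir::fetch_keys may differ in its details.

```cpp
#include <functional>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a directory fragment; NOT the Ceph CDir class.
struct ToyDir {
  // Keys that already have a fetch pending (assumed bookkeeping).
  std::set<std::string> in_flight;

  enum class Result { Submitted, NothingToFetch, InvariantViolated };

  // on_done plays the role of the MDSContext* 'c' in the backtrace.
  Result fetch_keys(const std::vector<std::string>& keys,
                    const std::function<void()>* on_done) {
    std::vector<std::string> to_fetch;
    for (const auto& k : keys) {
      // insert() reports whether the key was newly added, i.e. not pending yet.
      if (in_flight.insert(k).second) {
        to_fetch.push_back(k);
      }
    }
    if (to_fetch.empty()) {
      // Mirrors the "all are already being fetched" log line. The log shows
      // ceph_assert(!c) failing right after it, so a non-null waiter here is
      // reported as an invariant violation instead of aborting.
      return on_done ? Result::InvariantViolated : Result::NothingToFetch;
    }
    return Result::Submitted;
  }
};

// Replays the sequence suggested by the log: zero keys plus a waiter.
inline ToyDir::Result replay_logged_call() {
  ToyDir dir;
  std::function<void()> waiter = [] {};
  return dir.fetch_keys({}, &waiter);
}
```

In this model an empty key list is vacuously "already being fetched", so any caller that passes a waiter along with zero keys lands on the violated invariant. Whether the fix belongs in the caller or in the callee is not something this sketch can say.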
Updated by Venky Shankar about 2 years ago
- Status changed from New to Triaged
- Assignee set to Venky Shankar
Updated by Venky Shankar almost 2 years ago
- Status changed from Triaged to Fix Under Review
- Pull request ID set to 46063
Updated by Venky Shankar almost 2 years ago
- Status changed from Fix Under Review to Resolved