Bug #48517
mds: "CDir.cc: 1530: FAILED ceph_assert(!is_complete())"
Status: Resolved
Priority: Urgent
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2020-12-09T04:21:21.274+0000 7fea84b9c700 10 mds.0.openfiles prefetch_inodes
2020-12-09T04:21:21.274+0000 7fea84b9c700 10 mds.0.openfiles _prefetch_inodes state 1
2020-12-09T04:21:21.274+0000 7fea84b9c700 10 mds.0.openfiles _prefetch_dirfrags
2020-12-09T04:21:21.274+0000 7fea84b9c700 10 mds.0.cache.dir(0x1) fetch on [dir 0x1 / [2,head] auth v=488 cv=0/0 dir_auth=0 state=1610874880 f(v0 m2020-12-09T04:13:41.581699+0000 1=0+1) n(v33 rc2020-12-09T04:20:35.914083+0000 b51685884 8790=7766+1024) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 0x556e1f0f1180]
2020-12-09T04:21:21.274+0000 7fea84b9c700 10 mds.0.cache.dir(0x1) auth_pin by 0x556e1f0f1180 on [dir 0x1 / [2,head] auth v=488 cv=0/0 dir_auth=0 ap=1+0 state=1610874880 f(v0 m2020-12-09T04:13:41.581699+0000 1=0+1) n(v33 rc2020-12-09T04:20:35.914083+0000 b51685884 8790=7766+1024) hs=1+0,ss=0+0 dirty=1 | child=1 subtree=1 dirty=1 waiter=1 authpin=1 0x556e1f0f1180] count now 1
2020-12-09T04:21:21.274+0000 7fea84b9c700 1 -- [v2:172.21.15.73:6834/1493964564,v1:172.21.15.73:6835/1493964564] --> [v2:172.21.15.73:6824/16384,v1:172.21.15.73:6825/16384] -- osd_op(unknown.0.63:50 3.7 3:ff5b34d6:::1.00000000:head [omap-get-header,omap-get-vals in=16b,getxattr parent in=6b] snapc 0=[] ondisk+read+known_if_redirected+full_force e36) v8 -- 0x556e236b43c0 con 0x556e1f0f8000
2020-12-09T04:21:21.274+0000 7fea84b9c700 10 mds.0.cache.dir(0x1000000171d.0*) fetch on [dir 0x1000000171d.0* ~mds0/stray7/1000000171d/ [2,head] auth v=851 cv=0/0 state=1610612736 f()/f(v0 m2020-12-09T04:20:31.086183+0000) n(v5)/n(v5 rc2020-12-09T04:20:31.086183+0000) hs=0+244,ss=0+0 dirty=244 | child=1 dirty=1 0x556e241c9b00]
2020-12-09T04:21:21.274+0000 7fea84b9c700 7 mds.0.cache.dir(0x1000000171d.0*) fetch dirfrag for unlinked directory, mark complete
2020-12-09T04:21:21.274+0000 7fea84b9c700 10 mds.0.cache.dir(0x1000000171d.0*) fetch on [dir 0x1000000171d.0* ~mds0/stray7/1000000171d/ [2,head] auth v=851 cv=0/0 state=1610612737|complete f()/f(v0 m2020-12-09T04:20:31.086183+0000) n(v5)/n(v5 rc2020-12-09T04:20:31.086183+0000) hs=0+244,ss=0+0 dirty=244 | child=1 dirty=1 0x556e241c9b00]
2020-12-09T04:21:21.274+0000 7fea87ba2700 1 -- [v2:172.21.15.73:6834/1493964564,v1:172.21.15.73:6835/1493964564] <== osd.3 v2:172.21.15.73:6824/16384 3 ==== osd_op_reply(50 1.00000000 [omap-get-header out=274b,omap-get-vals out=5b,getxattr] v0'0 uv1 ondisk = 0) v8 ==== 238+0+279 (crc 0 0 0) 0x556e22af7cc0 con 0x556e1f0f8000
2020-12-09T04:21:21.274+0000 7fea7eb90700 10 MDSIOContextBase::complete: 21C_IO_Dir_OMAP_Fetched
2020-12-09T04:21:21.278+0000 7fea84b9c700 -1 /build/ceph-16.0.0-7967-ge4d2d676/src/mds/CDir.cc: In function 'void CDir::fetch(MDSContext*, std::string_view, bool)' thread 7fea84b9c700 time 2020-12-09T04:21:21.280738+0000
/build/ceph-16.0.0-7967-ge4d2d676/src/mds/CDir.cc: 1530: FAILED ceph_assert(!is_complete())
ceph version 16.0.0-7967-ge4d2d676 (e4d2d67692ca960325f1d91d5ef9332c7cb9f37d) pacific (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14b) [0x7fea8cea1e2d]
2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7fea8cea2008]
3: (CDir::fetch(MDSContext*, std::basic_string_view<char, std::char_traits<char> >, bool)+0xc29) [0x556e1ddae5b9]
4: (CDir::fetch(MDSContext*, bool)+0x3a) [0x556e1ddae6da]
5: (OpenFileTable::_prefetch_dirfrags()+0x4cc) [0x556e1de766dc]
6: (OpenFileTable::_open_ino_finish(inodeno_t, int)+0x246) [0x556e1de77506]
7: (OpenFileTable::_prefetch_inodes()+0x280) [0x556e1de75db0]
8: (OpenFileTable::prefetch_inodes()+0xf8) [0x556e1de77728]
9: (MDCache::process_imported_caps()+0x184) [0x556e1dcb2634]
10: (MDCache::rejoin_start(MDSContext*)+0x82) [0x556e1dcb40b2]
11: (MDSRank::rejoin_start()+0x162) [0x556e1db5aba2]
...
From: /ceph/teuthology-archive/pdonnell-2020-12-09_02:22:27-fs-wip-pdonnell-testing-20201207.220433-distro-basic-smithi/5694731/remote/smithi073/log/ceph-mds.a.log.gz
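It seems we're doing two fetches on the same dirfrag: in the log above, the first fetch on 0x1000000171d.0* marks it complete, and the second fetch on it then hits the assert. We might need to make the vector of dirfrags to fetch a std::set; however, I'm not yet certain that adding the same dirfrag to the vector twice isn't itself a bug.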
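To make the std::set idea concrete, here is a minimal standalone sketch. It does not use the real MDS classes: FakeDir, fetch_all and the two helper functions are hypothetical stand-ins, and FakeDir::fetch() only imitates the logged behaviour where the first fetch of an unlinked dirfrag marks it complete. Instead of asserting, it returns false on a repeat fetch so the difference can be counted.

```cpp
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for a dirfrag; this is NOT the real CDir class.
struct FakeDir {
  std::uint64_t ino;
  bool complete = false;

  explicit FakeDir(std::uint64_t i) : ino(i) {}

  // Imitates the logged case: the first fetch marks the dirfrag complete.
  // Returns false where the real code would hit ceph_assert(!is_complete()).
  bool fetch() {
    if (complete) return false;
    complete = true;
    return true;
  }
};

// Fetch every queued dirfrag and count the calls that would have asserted.
template <typename Container>
int fetch_all(const Container& queue) {
  int would_assert = 0;
  for (FakeDir* dir : queue) {
    if (!dir->fetch()) ++would_assert;
  }
  return would_assert;
}

// A vector keeps the duplicate entry, so the second fetch is rejected.
int duplicate_fetches_with_vector() {
  FakeDir stray(0x1000000171dULL);
  std::vector<FakeDir*> queue{&stray, &stray};
  return fetch_all(queue);
}

// A set collapses the duplicate pointer, so each dirfrag is fetched once.
int duplicate_fetches_with_set() {
  FakeDir stray(0x1000000171dULL);
  std::set<FakeDir*> queue{&stray, &stray};
  return fetch_all(queue);
}
```

In this sketch the set only hides the duplicate. It says nothing about why the same dirfrag was queued twice, which is the open question above.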