Bug #48148
openmds: Server.cc:6764 FAILED assert(in->filelock.can_read(mdr->get_client()))
% Done:
0%
Source:
Tags:
Backport:
pacific,octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDS
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
In my cluster with a single MDS (ceph version 12.2.13), this assert is hit when a large number of deletion operations are performed. I can't reproduce it now, so I didn't capture more logs.
backtrace:
 0> 2020-11-03 15:32:35.316352 7f47dd5aa700 -1 /share/ceph/rpmbuild/BUILD/ceph-12.2.13/src/mds/Server.cc: In function 'bool Server::_dir_is_nonempty(MDRequestRef&, CInode*)' thread 7f47dd5aa700 time 2020-11-03 15:32:35.311722
/share/ceph/rpmbuild/BUILD/ceph-12.2.13/src/mds/Server.cc: 6783: FAILED assert(in->filelock.can_read(mdr->get_client()))

ceph version 12.2.13-1-560-g87ea0b6 (87ea0b6e94eaa3544572dd676db0e8932f56d7a8) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x110) [0x55c2f7b60640]
 2: (Server::_dir_is_nonempty(boost::intrusive_ptr<MDRequestImpl>&, CInode*)+0x1a8) [0x55c2f7813ab8]
 3: (Server::handle_client_unlink(boost::intrusive_ptr<MDRequestImpl>&)+0x13df) [0x55c2f7843bef]
 4: (Server::dispatch_client_request(boost::intrusive_ptr<MDRequestImpl>&)+0xdb9) [0x55c2f7869479]
 5: (MDSInternalContextBase::complete(int)+0x1fb) [0x55c2f7a9c34b]
 6: (void finish_contexts<MDSInternalContextBase>(CephContext*, std::list<MDSInternalContextBase*, std::allocator<MDSInternalContextBase*> >&, int)+0x16c) [0x55c2f77c10bc]
 7: (MDSCacheObject::finish_waiting(unsigned long, int)+0x46) [0x55c2f7ab66e6]
 8: (Locker::eval_gather(SimpleLock*, bool, bool*, std::list<MDSInternalContextBase*, std::allocator<MDSInternalContextBase*> >*)+0x124f) [0x55c2f798958f]
 9: (Locker::wrlock_finish(SimpleLock*, MutationImpl*, bool*)+0x341) [0x55c2f798b261]
10: (Locker::_drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x22c) [0x55c2f798f14c]
11: (Locker::drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x76) [0x55c2f798f586]
12: (Locker::scatter_writebehind_finish(ScatterLock*, boost::intrusive_ptr<MutationImpl>&)+0xd0) [0x55c2f798f6d0]
13: (MDSIOContextBase::complete(int)+0xa5) [0x55c2f7a9c4e5]
14: (MDSLogContextBase::complete(int)+0x3c) [0x55c2f7a9caec]
15: (Finisher::finisher_thread_entry()+0x198) [0x55c2f7b5f2d8]
16: (()+0x7e65) [0x7f47e9f64e65]
17: (clone()+0x6d) [0x7f47e92588ad]
NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
I have a suspicion, though I'm not sure whether it is related to this problem.
In Server::handle_client_unlink, the MDS first rdlocks the filelock of the inode to be deleted, and then Server::_dir_is_nonempty confirms the filelock is readable via "assert(in->filelock.can_read(mdr->get_client()))". But there are filelock states in which can_rdlock is allowed while can_read is not, such as the following two entries:
[LOCK_EXCL]      = { 0, true, LOCK_LOCK, 0, 0, XCL, XCL, 0, 0, 0,
[LOCK_EXCL_XSYN] = { LOCK_XSYN, false, LOCK_LOCK, 0, 0, XCL, 0, 0, 0, 0,
If the filelock is in one of these two states, could the above assert fire?