Bug #44565

src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())

Added by Jan Fajerski about 4 years ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags: backport_processed
Backport: reef,quincy,pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Seeing frequent MDS daemon crashes in a multi-active setup. The crashes often coincide with inode migrations, but I am not sure that is related.
I have requested coredumps for these crashes. The log of one MDS is attached.


Files

ceph-mds.stmailmds01a.log-20200227.xz (339 KB), MDS log, Jan Fajerski, 03/11/2020 03:25 PM

Related issues 6 (0 open, 6 closed)

Related to CephFS - Bug #54976: mds: Test failure: test_filelock_eviction (tasks.cephfs.test_client_recovery.TestClientRecovery) (Duplicate, Xiubo Li)
Has duplicate CephFS - Bug #57411: mutiple mds crash seen while running db workloads with regular snapshots and journal flush. (Duplicate)
Has duplicate CephFS - Bug #61967: mds: "SimpleLock.h: 417: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())" (Duplicate)
Copied to CephFS - Backport #62522: quincy: src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()) (Resolved, Xiubo Li)
Copied to CephFS - Backport #62523: pacific: src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()) (Resolved, Xiubo Li)
Copied to CephFS - Backport #62524: reef: src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()) (Resolved, Xiubo Li)
Actions #1

Updated by Zheng Yan about 4 years ago

If you have a coredump, please use gdb to print the value of 'state'. Also, this ceph was compiled from commit 8881d33957b54b101eae9c7627b351af10e87ee8; I can't find it in the upstream ceph repository.

Actions #2

Updated by Zheng Yan about 4 years ago

  • Assignee set to Zheng Yan
Actions #3

Updated by Jan Fajerski about 4 years ago

Sorry, that commit comes from github.com/SUSE/ceph.

This build corresponds to commit daf0990c19c89267ea10c40b9c7a48bea49a3d1b upstream.

I will update with coredumps as soon as I get them.

Actions #4

Updated by Jan Fajerski about 4 years ago

(gdb) bt
#0  raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#1  0x000055d7ec73f810 in reraise_fatal (signum=6) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/global/signal_handler.cc:81
#2  handle_fatal_signal (signum=6) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/global/signal_handler.cc:326
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:51
#5  0x00007fdc4a88ab01 in __GI_abort () at abort.c:79
#6  0x00007fdc4be56983 in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=<optimized out>, func=0x55d7ec785150 <SimpleLock::put_xlock()::__PRETTY_FUNCTION__> "void SimpleLock::put_xlock()")
    at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/common/assert.cc:73
#7  0x00007fdc4be56b0d in ceph::__ceph_assert_fail (ctx=...) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/common/assert.cc:78
#8  0x000055d7ec5c16db in SimpleLock::put_xlock (this=0x55d810d29388) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/SimpleLock.h:525
#9  0x000055d7ec5b5fe1 in Locker::xlock_finish (this=this@entry=0x55d7ed6e7220, it=..., mut=mut@entry=0x55d80cb9d800, pneed_issue=pneed_issue@entry=0x7fdc3c6b5b9f)
    at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/Locker.cc:1724
#10 0x000055d7ec5b661b in Locker::_drop_locks (this=this@entry=0x55d7ed6e7220, mut=0x55d80cb9d800, pneed_issue=pneed_issue@entry=0x7fdc3c6b5c50, drop_rdlocks=drop_rdlocks@entry=false)
    at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/Locker.cc:672
#11 0x000055d7ec5b6e8a in Locker::drop_non_rdlocks (this=0x55d7ed6e7220, mut=<optimized out>, pneed_issue=0x7fdc3c6b5c50, pneed_issue@entry=0x0) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/Locker.cc:756
#12 0x000055d7ec506b29 in MDCache::request_drop_non_rdlocks (this=<optimized out>, mdr=...) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/MDCache.cc:9494
#13 0x000055d7ec467b47 in Server::reply_client_request (this=this@entry=0x55d7ed7be640, mdr=..., reply=...) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/Server.cc:1990
#14 0x000055d7ec468639 in Server::respond_to_request (this=this@entry=0x55d7ed7be640, mdr=..., r=r@entry=0) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/Server.cc:1750
#15 0x000055d7ec480390 in Server::_link_local_finish (this=0x55d7ed7be640, mdr=..., dn=0x55d7f8249a20, targeti=0x55d810d28e00, dnpv=84713568, tipv=<optimized out>, adjust_realm=true)
    at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/Server.cc:6087
#16 0x000055d7ec4cc533 in C_MDS_link_local_finish::finish (this=<optimized out>, r=<optimized out>) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/Server.cc:6009
#17 0x000055d7ec6b2d03 in Context::complete (r=0, this=0x55d80f596280) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/include/Context.h:77
#18 MDSContext::complete (this=this@entry=0x55d80f596280, r=r@entry=0) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/MDSContext.cc:29
#19 0x000055d7ec6b2fab in MDSIOContextBase::complete (this=0x55d80f596280, r=r@entry=0) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/MDSContext.cc:106
#20 0x000055d7ec6b3210 in MDSLogContextBase::complete (this=<optimized out>, r=0) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/MDSContext.cc:115
#21 0x00007fdc4be9f20e in Finisher::finisher_thread_entry (this=0x55d7ed744c00) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/common/Finisher.cc:67
#22 0x00007fdc4b7484f9 in start_thread (arg=0x7fdc3c6b8700) at pthread_create.c:465
#23 0x00007fdc4a94beef in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
(gdb) frame 8
#8  0x000055d7ec5c16db in SimpleLock::put_xlock (this=0x55d810d29388) at /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/SimpleLock.h:525
525    in /usr/src/debug/ceph-14.2.5.382+g8881d33957-3.30.1.x86_64/src/mds/SimpleLock.h
(gdb) p state
$6 = 1

Let me know if this helps.

Actions #5

Updated by Zheng Yan about 4 years ago

The lock state is SYNC. I checked the code but can't figure out why. Could you try reproducing this with debug_mds=10?

Actions #6

Updated by Patrick Donnelly almost 4 years ago

  • Status changed from New to Need More Info
  • Assignee deleted (Zheng Yan)
Actions #7

Updated by Dan van der Ster over 3 years ago

  • Affected Versions v14.2.11 added

We have just hit this in 14.2.11, with 3 active mds.
The mds log is at ceph-post-file: b1a56b74-6fbe-41bb-adcd-183695c39ff1

In our case state = 3:

(gdb) bt
#0  0x00007f314f7374fb in raise () from /lib64/libpthread.so.0
#1  0x000055abbc3c4712 in reraise_fatal (signum=6) at /usr/src/debug/ceph-14.2.11/src/global/signal_handler.cc:326
#2  handle_fatal_signal(int) () at /usr/src/debug/ceph-14.2.11/src/global/signal_handler.cc:326
#3  <signal handler called>
#4  0x00007f314e315387 in raise () from /lib64/libc.so.6
#5  0x00007f314e316a78 in abort () from /lib64/libc.so.6
#6  0x00007f3151879074 in ceph::__ceph_assert_fail (assertion=<optimized out>, file=<optimized out>, line=<optimized out>, func=<optimized out>)
    at /usr/src/debug/ceph-14.2.11/src/common/assert.cc:73
#7  0x00007f31518791ed in ceph::__ceph_assert_fail (ctx=...) at /usr/src/debug/ceph-14.2.11/src/common/assert.cc:78
#8  0x000055abbc261c8b in SimpleLock::put_xlock (this=0x55ac2dff2830) at /usr/src/debug/ceph-14.2.11/src/common/TrackedOp.h:391
#9  0x000055abbc2568db in Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*) ()
    at /usr/src/debug/ceph-14.2.11/src/mds/Locker.cc:1725
#10 0x000055abbc256ecb in Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool) ()
    at /usr/src/debug/ceph-14.2.11/src/mds/Locker.cc:672
#11 0x000055abbc2575ea in Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*) ()
    at /usr/src/debug/ceph-14.2.11/src/mds/Locker.cc:756
#12 0x000055abbc118994 in Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&) ()
    at /usr/src/debug/ceph-14.2.11/src/mds/Server.cc:2028
#13 0x000055abbc1192e9 in Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int) () at /usr/src/debug/ceph-14.2.11/src/messages/MClientReply.h:299
#14 0x000055abbc12eafa in Server::_link_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CInode*, unsigned long, unsigned long, bool) ()
    at /usr/src/debug/ceph-14.2.11/src/mds/Server.cc:6132
#15 0x000055abbc17a093 in C_MDS_link_local_finish::finish (this=<optimized out>, r=<optimized out>) at /usr/src/debug/ceph-14.2.11/src/mds/Server.cc:6054
#16 0x000055abbc340064 in complete (r=0, this=0x55ac88d28180) at /usr/src/debug/ceph-14.2.11/src/include/Context.h:77
---Type <return> to continue, or q <return> to quit---q
Quit
(gdb) frame 8
#8  0x000055abbc261c8b in SimpleLock::put_xlock (this=0x55ac2dff2830) at /usr/src/debug/ceph-14.2.11/src/common/TrackedOp.h:391
391        o->put();
(gdb) info locals
__PRETTY_FUNCTION__ = "void SimpleLock::put_xlock()" 
assert_data_ctx = {
  assertion = 0x55abbc4055c0 "state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()", 
  file = 0x55abbc3f87c0 "/builddir/build/BUILD/ceph-14.2.11/src/mds/SimpleLock.h", line = 528, 
  function = 0x55abbc4090f0 <SimpleLock::put_xlock()::__PRETTY_FUNCTION__> "void SimpleLock::put_xlock()"}
(gdb) p state
$1 = 3
Actions #8

Updated by Jan Fajerski over 3 years ago

  • Status changed from Need More Info to In Progress
  • Affected Versions v14.2.5 added
  • Affected Versions deleted (v12.2.5)
Actions #9

Updated by Venky Shankar about 2 years ago

  • Status changed from In Progress to New

Hit this here: https://pulpito.ceph.com/vshankar-2022-03-18_02:56:29-fs:upgrade-wip-vshankar-testing-20220317-101203-testing-default-smithi/6743898/

The fs:upgrade test crashed when the MDS was upgraded to quincy: ceph version 17.0.0-11165-gc7bc6ea9 (c7bc6ea9ed836057fd41cf9929400b2956af6d42) quincy (dev), process ceph-mds, pid 39274

Actions #10

Updated by Xiubo Li about 2 years ago

  • Related to Bug #54976: mds: Test failure: test_filelock_eviction (tasks.cephfs.test_client_recovery.TestClientRecovery) added
Actions #11

Updated by Kotresh Hiremath Ravishankar about 2 years ago

  • Assignee set to Kotresh Hiremath Ravishankar
Actions #12

Updated by Patrick Donnelly over 1 year ago

  • Has duplicate Bug #57411: mutiple mds crash seen while running db workloads with regular snapshots and journal flush. added
Actions #14

Updated by Xiubo Li 10 months ago

https://pulpito.ceph.com/xiubli-2023-06-08_03:32:08-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/7298110/
https://pulpito.ceph.com/xiubli-2023-06-08_03:32:08-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/7298112/

All the failures were caused by generic/088, something like:

2023-06-08T04:43:12.959 DEBUG:teuthology.orchestra.run.smithi062:> (cd /tmp/tmp.9QEZAyoOKCxfstests-dev && exec sudo env DIFF_LENGTH=0 ./check generic/088)
2023-06-08T04:43:13.484 INFO:teuthology.orchestra.run.smithi062.stdout:FSTYP         -- ceph-fuse
2023-06-08T04:43:13.488 INFO:teuthology.orchestra.run.smithi062.stdout:PLATFORM      -- Linux/x86_64 smithi062 5.15.0-73-generic #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023
2023-06-08T04:43:13.488 INFO:teuthology.orchestra.run.smithi062.stdout:
2023-06-08T04:43:13.656 INFO:teuthology.orchestra.run.smithi062.stdout:generic/088        0s
2023-06-08T04:43:13.661 INFO:teuthology.orchestra.run.smithi062.stdout:Ran: generic/088
2023-06-08T04:43:13.661 INFO:teuthology.orchestra.run.smithi062.stdout:Passed all 1 tests
2023-06-08T04:43:13.664 INFO:teuthology.orchestra.run.smithi062.stdout:
2023-06-08T04:43:13.671 INFO:tasks.cephfs.xfstests_dev:Command return value: 0
2023-06-08T04:43:13.672 DEBUG:teuthology.orchestra.run.smithi062:> (cd /tmp/tmp.9QEZAyoOKCxfstests-dev && exec sudo env DIFF_LENGTH=0 ./check generic/089)
2023-06-08T04:43:14.176 INFO:teuthology.orchestra.run.smithi062.stdout:FSTYP         -- ceph-fuse
2023-06-08T04:43:14.180 INFO:teuthology.orchestra.run.smithi062.stdout:PLATFORM      -- Linux/x86_64 smithi062 5.15.0-73-generic #80-Ubuntu SMP Mon May 15 15:18:26 UTC 2023
2023-06-08T04:43:14.180 INFO:teuthology.orchestra.run.smithi062.stdout:
2023-06-08T04:43:20.579 INFO:tasks.ceph.mds.c.smithi062.stderr:./src/mds/SimpleLock.h: In function 'void SimpleLock::put_xlock()' thread 7fa3a03ed640 time 2023-06-08T04:43:20.578563+0000
2023-06-08T04:43:20.579 INFO:tasks.ceph.mds.c.smithi062.stderr:./src/mds/SimpleLock.h: 417: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())
2023-06-08T04:43:20.586 INFO:tasks.ceph.mds.c.smithi062.stderr: ceph version 18.0.0-4313-ga02638f8 (a02638f808b8fe11bccdd01222c0e6e2eb71120b) reef (dev)
2023-06-08T04:43:20.586 INFO:tasks.ceph.mds.c.smithi062.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x121) [0x7fa3ac5376e1]
2023-06-08T04:43:20.586 INFO:tasks.ceph.mds.c.smithi062.stderr: 2: /usr/lib/ceph/libceph-common.so.2(+0x161895) [0x7fa3ac537895]
2023-06-08T04:43:20.586 INFO:tasks.ceph.mds.c.smithi062.stderr: 3: ceph-mds(+0x21b10b) [0x55f41b20210b]
2023-06-08T04:43:20.586 INFO:tasks.ceph.mds.c.smithi062.stderr: 4: (Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x8d) [0x55f41b2bb73d]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 5: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x47d) [0x55f41b2ada9d]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 6: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x63) [0x55f41b2ae193]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 7: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x428) [0x55f41b18fb18]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 8: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x11e) [0x55f41b184cfe]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 9: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x310) [0x55f41b1bc620]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 10: (MDSContext::complete(int)+0x5c) [0x55f41b3aa66c]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 11: (MDSIOContextBase::complete(int)+0x30c) [0x55f41b3aaa9c]
2023-06-08T04:43:20.587 INFO:tasks.ceph.mds.c.smithi062.stderr: 12: (MDSLogContextBase::complete(int)+0x53) [0x55f41b3aabd3]
2023-06-08T04:43:20.588 INFO:tasks.ceph.mds.c.smithi062.stderr: 13: (Finisher::finisher_thread_entry()+0x175) [0x7fa3ac5ea725]
2023-06-08T04:43:20.588 INFO:tasks.ceph.mds.c.smithi062.stderr: 14: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7fa3abeb1b43]
2023-06-08T04:43:20.588 INFO:tasks.ceph.mds.c.smithi062.stderr: 15: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7fa3abf43a00]
2023-06-08T04:43:20.588 INFO:tasks.ceph.mds.c.smithi062.stderr:*** Caught signal (Aborted) **
Actions #15

Updated by Xiubo Li 10 months ago

Almost all the tests failed with this same issue. Recently it can be reproduced very easily with the xfstests tool, please see:

https://pulpito.ceph.com/xiubli-2023-06-09_04:16:27-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/

Actions #16

Updated by Xiubo Li 10 months ago

I talked with Kotresh; I will work on this.

Actions #17

Updated by Xiubo Li 10 months ago

  • Assignee changed from Kotresh Hiremath Ravishankar to Xiubo Li
Actions #18

Updated by Xiubo Li 10 months ago

Xiubo Li wrote:

https://pulpito.ceph.com/xiubli-2023-06-08_03:32:08-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/7298110/
https://pulpito.ceph.com/xiubli-2023-06-08_03:32:08-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/7298112/

For the inode 0x10000009c8c:

First, a link request came in:

2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.server dispatch_client_request client_request(client.6366:5165 link #0x1000000004a/t_mtab~ #0x10000009c8c 2023-06-08T04:35:31.723602+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.server handle_client_link #0x1000000004a/t_mtab~ to #0x10000009c8c
2023-06-08T04:35:31.730+0000 7f7ff82dc700 10 mds.0.server rdlock_path_xlock_dentry request(client.6366:5165 nref=4 cr=0x564f2ec69200) #0x1000000004a/t_mtab~
2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.server handle_client_link link t_mtab~ in [dir 0x1000000004a /test/ [2,head] auth pv=141136 v=141119 cv=0/0 ap=9+14 state=1610874881|complete f(v0 m2023-06-08T04:35:31.648959+0000 2=2+0)->f(v0 m2023-06-08T04:35:31.716381+0000 2=2+0) n(v0 rc2023-06-08T04:35:31.648959+0000 b3909 1=1+0)->n(v0 rc2023-06-08T04:35:31.716381+0000 b3909 2=2+0) hs=2+6,ss=0+0 dirty=7 | child=1 dirty=1 authpin=1 0x564f2ecfd200]
2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.server target is [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=3 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ifile excl w=1) (iversion lock) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=2 caps=1 waiter=0 authpin=1 0x564f2ae6b080]

At that point, the inode 0x10000009c8c's ilink lock was in a stable state, as shown above.

Then the link request acquired the xlock on the ilink lock, which became [ilink xlock x=1 by 0x564f2f0f0800]:

2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.locker xlock_start on (ilink sync) on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=3 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ifile excl w=1) (iversion lock) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=2 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.locker simple_lock on (ilink sync) on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=3 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ifile excl w=1) (iversion lock) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=2 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.locker simple_xlock on (ilink lock) on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=3 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ilink lock) (ifile excl w=1) (iversion lock) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=2 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.730+0000 7f7ff82dc700 10 mds.0.cache.ino(0x10000009c8c) auth_pin by 0x564f2ae6b338 on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=4 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ilink lock) (ifile excl w=1) (iversion lock) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=2 caps=1 waiter=0 authpin=1 0x564f2ae6b080] now 4
2023-06-08T04:35:31.730+0000 7f7ff82dc700 10 mds.0.locker  got xlock on (ilink xlock x=1 by 0x564f2f0f0800) [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=4 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ilink xlock x=1 by 0x564f2f0f0800) (ifile excl w=1) (iversion lock) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=3 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.730+0000 7f7ff82dc700  7 mds.0.locker local_wrlock_start  on (iversion lock) on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=4 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ilink xlock x=1 by 0x564f2f0f0800) (ifile excl w=1) (iversion lock) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=3 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.730+0000 7f7ff82dc700 10 mds.0.locker  got wrlock on (iversion lock w=1 last_client=6366) [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141135 ap=4 s=0 n(v0 1=1+0)->n(v0 1=1+0) (ilink xlock x=1 by 0x564f2f0f0800) (ifile excl w=1) (iversion lock w=1 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=4 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.730+0000 7f7ff82dc700 20 Session check_access path /test/t_mtab~207208

Just after the link request submitted its journal entry, it set the ilink lock state to LOCK_XLOCKDONE, but the xlock count was still 1, which prevents the ilink lock from transitioning to the next stable state:

2023-06-08T04:35:31.731+0000 7f7ff82dc700 10 mds.0.server journal_and_reply tracei 0x564f2ae6b080 tracedn 0x564f2b35f400
2023-06-08T04:35:31.731+0000 7f7ff82dc700 10 mds.0.locker set_xlocks_done on (ilink xlock x=1 by 0x564f2f0f0800) [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141138 ap=4 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0) (ilink xlock x=1 by 0x564f2f0f0800) (ifile excl w=1) (iversion lock w=1 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=4 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.731+0000 7f7ff82dc700 10 mds.0.locker set_xlocks_done on (isnap xlock x=1 by 0x564f2f0f0800) [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141138 ap=4 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0) (ilink xlockdone x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFs/-@2},l=6366(-1) | ptrwaiter=1 request=2 lock=4 caps=1 waiter=0 authpin=1 0x564f2ae6b080]

Usually, once the journal entry has been successfully flushed to RADOS, the MDS transitions the ilink lock to the next stable state, but just before that an unlink request came in:

2023-06-08T04:35:31.732+0000 7f7ff82dc700  1 -- [v2:172.21.15.79:6833/2809764197,v1:172.21.15.79:6835/2809764197] <== client.6366 172.21.15.79:0/2942588486 6691 ==== client_request(client.6366:5166 unlink #0x1000000004a/t_mtab~207208 2023-06-08T04:35:31.733597+0000 caller_uid=0, caller_gid=0{0,}) v6 ==== 265+0+0 (crc 0 0 0) 0x564f2e1c7800 con 0x564f2b967c00
2023-06-08T04:35:31.732+0000 7f7ff82dc700  4 mds.0.server handle_client_request client_request(client.6366:5166 unlink #0x1000000004a/t_mtab~207208 2023-06-08T04:35:31.733597+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-08T04:35:31.732+0000 7f7ff82dc700 20 mds.0.13 get_session have 0x564f2f07af00 client.6366 172.21.15.79:0/2942588486 state open
...
2023-06-08T04:35:31.732+0000 7f7ff82dc700  4 mds.0.server handle_client_request client_request(client.6366:5166 unlink #0x1000000004a/t_mtab~207208 2023-06-08T04:35:31.733597+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-08T04:35:31.732+0000 7f7ff82dc700 20 mds.0.13 get_session have 0x564f2f07af00 client.6366 172.21.15.79:0/2942588486 state open
2023-06-08T04:35:31.732+0000 7f7ff82dc700 15 mds.0.server  oldest_client_tid=5161
2023-06-08T04:35:31.732+0000 7f7ff82dc700  7 mds.0.cache request_start request(client.6366:5166 nref=2 cr=0x564f2e1c7800)
2023-06-08T04:35:31.732+0000 7f7ff82dc700 10 mds.0.locker process_request_cap_release client.6366 pAsXsFsxcrwb on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141138 ap=4 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0) (ilink xlockdone x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsLsXsFsxcrwb/-@3},l=6366(-1) | ptrwaiter=1 request=2 lock=4 caps=1 waiter=0 authpin=1 0x564f2ae6b080]

Then the unlink request also acquired the xlock on the ilink lock, setting the lock state to LOCK_XLOCK and increasing the xlock count to 2, i.e. [ilink xlock x=2 by 0x564f2b90f800]:

2023-06-08T04:35:31.732+0000 7f7ff82dc700  7 mds.0.locker xlock_start on (ilink xlockdone x=1) on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141138 ap=5 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0) (ilink xlockdone x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFsxcrwb/-@3},l=6366(-1) | ptrwaiter=1 request=3 lock=5 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.732+0000 7f7ff82dc700  7 mds.0.locker simple_xlock on (ilink xlockdone x=1) on [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141138 ap=5 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0) (ilink xlockdone x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFsxcrwb/-@3},l=6366(-1) | ptrwaiter=1 request=3 lock=5 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.732+0000 7f7ff82dc700 10 mds.0.locker  got xlock on (ilink xlock x=2 by 0x564f2b90f800) [inode 0x10000009c8c [2,head] /test/t_mtab~207208 auth v141133 pv141138 ap=5 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0) (ilink xlock x=2 by 0x564f2b90f800) (ifile excl w=1) (iversion lock w=1 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFsxcrwb/-@3},l=6366(-1) | ptrwaiter=1 request=3 lock=6 caps=1 waiter=0 authpin=1 0x564f2ae6b080]

And just after the unlink request's journal entry was submitted, it set the ilink lock's state back to LOCK_XLOCKDONE:

2023-06-08T04:35:31.733+0000 7f7ff82dc700 10 mds.0.server journal_and_reply tracei 0 tracedn 0x564f2b503900
2023-06-08T04:35:31.733+0000 7f7ff82dc700 10 mds.0.locker set_xlocks_done on (ilink xlock x=2 by 0x564f2b90f800) [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141139 ap=5 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink xlock x=2 by 0x564f2b90f800) (ifile excl w=1) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFsxcrwb/-@3},l=6366(-1) | ptrwaiter=1 request=3 lock=6 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.733+0000 7f7ff82dc700 10 mds.0.locker set_xlocks_done on (isnap xlock x=2 by 0x564f2b90f800) [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141139 ap=5 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink xlockdone x=2) (ifile excl w=1) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFsxcrwb/-@3},l=6366(-1) | ptrwaiter=1 request=3 lock=6 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.733+0000 7f7ff82dc700 10 mds.0.locker set_xlocks_done on (dn xlock x=2 by 0x564f2b90f800) [dentry #0x1/test/t_mtab~207208 [2,head] auth NULL (dn xlock x=2 by 0x564f2b90f800) (dversion lock w=2 last_client=6366) pv=141141 v=141116 ap=3 ino=(nil) state=1610612864|unlinking | request=2 lock=3 inodepin=0 dirty=1 authpin=1 0x564f2b503900]
2023-06-08T04:35:31.733+0000 7f7ff82dc700 10 mds.0.locker set_xlocks_done on (dn xlock x=1 by 0x564f2b90f800) [dentry #0x100/stray4/10000009c8c [2,head] auth NULL (dn xlock x=1 by 0x564f2b90f800) (dversion lock w=1 last_client=6366) pv=141139 v=141128 ap=2 ino=(nil) state=1342177281|new | request=1 lock=2 authpin=1 0x564f2cdffb80]
2023-06-08T04:35:31.733+0000 7f7ff82dc700 10 mds.0.server early_reply 0 ((0) Success) client_request(client.6366:5166 unlink #0x1000000004a/t_mtab~207208 2023-06-08T04:35:31.733597+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-08T04:35:31.733+0000 7f7ff82dc700 20 mds.0.server set_trace_dist snapid head

But later, when a second unlink request came in, it failed to get the xlock because the issued caps needed to be gathered first, so the state was set to LOCK_LOCK_XLOCK instead:

2023-06-08T04:35:31.758+0000 7f7ff82dc700  1 -- [v2:172.21.15.79:6833/2809764197,v1:172.21.15.79:6835/2809764197] <== client.6366 172.21.15.79:0/2942588486 6701 ==== client_request(client.6366:5175 unlink #0x1000000004a/t_mtab~ 2023-06-08T04:35:31.759474+0000 caller_uid=0, caller_gid=0{0,}) v6 ==== 259+0+0 (crc 0 0 0) 0x564f2d6f6600 con 0x564f2b967c00
2023-06-08T04:35:31.758+0000 7f7ff22d0700 12 mds.0.cache.dir(0x1000000004a) link_primary_inode [dentry #0x1/test/t_mtab~207208 [2,head] auth NULL (dn xlockdone x=2) (dversion lock w=2 last_client=6366) pv=141141 v=141116 ap=3 ino=(nil) state=1610612864|unlinking | request=2 lock=3 inodepin=0 dirty=1 authpin=1 0x564f2b503900] [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141141 ap=6 s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (iauth excl) (ilink xlockdone x=2) (ifile excl w=2) (ixattr excl) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@5},l=6366 | ptrwaiter=2 request=3 lock=6 caps=1 waiter=0 authpin=1 0x564f2ae6b080]
...
2023-06-08T04:35:31.758+0000 7f7ff82dc700  4 mds.0.server handle_client_request client_request(client.6366:5175 unlink #0x1000000004a/t_mtab~ 2023-06-08T04:35:31.759474+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-08T04:35:31.758+0000 7f7ff82dc700 20 mds.0.13 get_session have 0x564f2f07af00 client.6366 172.21.15.79:0/2942588486 state open
2023-06-08T04:35:31.758+0000 7f7ff82dc700 15 mds.0.server  oldest_client_tid=5164
2023-06-08T04:35:31.758+0000 7f7ff82dc700  7 mds.0.cache request_start request(client.6366:5175 nref=2 cr=0x564f2d6f6600)
2023-06-08T04:35:31.758+0000 7f7ff82dc700 10 mds.0.locker process_request_cap_release client.6366 pAsxXsxFsxcrwb on [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141141 ap=6 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (iauth excl) (ilink xlockdone x=2) (ifile excl w=2) (ixattr excl) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsxLsXsxFsxcrwb/pAsxXsxFxwb@6},l=6366 | ptrwaiter=2 request=2 lock=6 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x564f2ae6b080]
...
2023-06-08T04:35:31.759+0000 7f7ff82dc700  7 mds.0.locker xlock_start on (ilink xlockdone x=2) on [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141141 ap=9 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (iauth excl->sync) (ilink xlockdone x=2) (ifile excl w=2) (ixattr excl->sync) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsLsXsFsxcrwb/pAsxLsXsxFsxcrwb/-@7},l=6366(-1) | ptrwaiter=2 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.759+0000 7f7ff82dc700  7 mds.0.locker simple_xlock on (ilink xlockdone x=2) on [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141141 ap=9 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (iauth excl->sync) (ilink xlockdone x=2) (ifile excl w=2) (ixattr excl->sync) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsLsXsFsxcrwb/pAsxLsXsxFsxcrwb/-@7},l=6366(-1) | ptrwaiter=2 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.759+0000 7f7ff82dc700  7 mds.0.locker get_allowed_caps loner client.6366 allowed=pAsXsFsxcrwb, xlocker allowed=pAsXsFsxcrwb, others allowed=p on [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141141 ap=9 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (iauth excl->sync) (ilink lock->xlock x=2) (ifile excl w=2) (ixattr excl->sync) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsLsXsFsxcrwb/pAsxLsXsxFsxcrwb/-@7},l=6366(-1) | ptrwaiter=2 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.759+0000 7f7ff82dc700 20 mds.0.locker  client.6366 pending pAsLsXsFsxcrwb allowed pAsXsFsxcrwb wanted -
2023-06-08T04:35:31.759+0000 7f7ff82dc700  7 mds.0.locker    sending MClientCaps to client.6366 seq 8 new pending pAsXsFsxcrwb was pAsLsXsFsxcrwb
2023-06-08T04:35:31.759+0000 7f7ff82dc700 20 mds.0.cache.ino(0x10000009c8c) encode_cap_message pfile 1 pauth 0 plink 1 pxattr 0 mtime 2023-06-08T04:35:31.716381+0000 ctime 2023-06-08T04:35:31.733597+0000 change_attr 2
2023-06-08T04:35:31.759+0000 7f7ff82dc700 10 mds.0.13 send_message_client_counted client.6366 seq 3911 client_caps(revoke ino 0x10000009c8c 1 seq 8 caps=pAsXsFsxcrwb dirty=- wanted=- follows 0 size 0/4194304 ts 1/18446744073709551615 mtime 2023-06-08T04:35:31.716381+0000 ctime 2023-06-08T04:35:31.733597+0000 change_attr 2) v12
2023-06-08T04:35:31.759+0000 7f7ff82dc700  1 -- [v2:172.21.15.79:6833/2809764197,v1:172.21.15.79:6835/2809764197] --> 172.21.15.79:0/2942588486 -- client_caps(revoke ino 0x10000009c8c 1 seq 8 caps=pAsXsFsxcrwb dirty=- wanted=- follows 0 size 0/4194304 ts 1/18446744073709551615 mtime 2023-06-08T04:35:31.716381+0000 ctime 2023-06-08T04:35:31.733597+0000 change_attr 2) v12 -- 0x564f2e05e700 con 0x564f2b967c00
2023-06-08T04:35:31.759+0000 7f7ff82dc700 10 mds.0.cache.ino(0x10000009c8c) add_waiter tag 2000000000600000 0x564f2a3c4760 !ambig 1 !frozen 1 !freezing 1
2023-06-08T04:35:31.759+0000 7f7ff82dc700 15 mds.0.cache.ino(0x10000009c8c) taking waiter here
2023-06-08T04:35:31.759+0000 7f7ff82dc700 10 mds.0.locker nudge_log (ilink lock->xlock x=2) on [inode 0x10000009c8c [2,head] ~mds0/stray4/10000009c8c auth v141133 pv141141 ap=9 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (iauth excl->sync) (ilink lock->xlock x=2) (ifile excl w=2) (ixattr excl->sync) (iversion lock w=2 last_client=6366) cr={6366=0-4194304@1} caps={6366=pAsXsFsxcrwb/pAsxLsXsxFsxcrwb/-@8},l=6366(-1) | ptrwaiter=2 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]

So later, when the journal entries were successfully flushed, the MDS tried to finish the xlocks; the first decrement of the xlock count succeeded:

2023-06-08T04:35:31.760+0000 7f7ff22d0700 10 MDSIOContextBase::complete: 25C_MDS_unlink_local_finish
2023-06-08T04:35:31.760+0000 7f7ff82dc700  7 mds.0.locker handle_client_caps  on 0x10000009c8c tid 0 follows 0 op update flags 0x0
2023-06-08T04:35:31.760+0000 7f7ff82dc700 20 mds.0.13 get_session have 0x564f2f07af00 client.6366 172.21.15.79:0/2942588486 state open
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker  head inode [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/pAsLsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker  follows 0 retains pAsXsFs dirty - on [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker _do_cap_update dirty - issued pAsXsFsxcrwb wanted - on [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 20 mds.0.locker inode is file
2023-06-08T04:35:31.760+0000 7f7ff82dc700 20 mds.0.locker client has write caps; m->get_max_size=4194304; old_max=4194304
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker eval 3648 [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker eval want loner: client.-1 but failed to set it
2023-06-08T04:35:31.760+0000 7f7ff82dc700  7 mds.0.locker file_eval wanted= loner_wanted= other_wanted=  filelock=(ifile excl w=1) on [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 20 mds.0.locker  is excl
2023-06-08T04:35:31.760+0000 7f7ff82dc700  7 mds.0.locker file_eval loner_issued=sxcrwb other_issued= xlocker_issued=
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker simple_eval (iauth sync) on [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker eval_gather (ilink lock->xlock x=1) on [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker  next state is prexlock issued/allows loner / xlocker / other /
2023-06-08T04:35:31.760+0000 7f7ff82dc700  7 mds.0.locker eval_gather finished gather on (ilink lock->xlock x=1) on [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink lock->xlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.760+0000 7f7ff82dc700 10 mds.0.locker   trying to drop loner
2023-06-08T04:35:31.761+0000 7f7ff82dc700 10 mds.0.locker simple_eval (ixattr sync) on [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141138 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 nl=2 n(v0 rc2023-06-08T04:35:31.723602+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink prexlock x=1) (ifile excl w=1) (iversion lock w=1 last_client=6366) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x564f2ae6b080]

But the second decrement, taking the count to 0, failed:

2023-06-08T04:35:31.761+0000 7f7ff22d0700 10 mds.0.locker xlock_finish on (ilink prexlock x=1) [inode 0x10000009c8c [...2,head] ~mds0/stray4/10000009c8c auth v141139 pv141141 ap=5 snaprealm=0x564f2d268fc0 DIRTYPARENT s=0 n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0)->n(v0 rc2023-06-08T04:35:31.733597+0000 1=1+0) (ilink prexlock x=1) (ifile excl w=1) (iversion lock) caps={6366=pAsXsFsxcrwb/-@9},l=6366(-1) | ptrwaiter=1 request=2 lock=3 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x564f2ae6b080]
2023-06-08T04:35:31.762+0000 7f7ffbae3700  1 -- [v2:172.21.15.79:6833/2809764197,v1:172.21.15.79:6835/2809764197] <== osd.2 v2:172.21.15.79:6808/4041064949 88233 ==== osd_op_reply(740970 500.00000000 [writefull 0~90] v40'1451 uv1451 ondisk = 0) v8 ==== 156+0+0 (crc 0 0 0) 0x564f2d57fb00 con 0x564f2a1f6c00
2023-06-08T04:35:31.787+0000 7f7ff22d0700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4313-ga02638f8/rpm/el8/BUILD/ceph-18.0.0-4313-ga02638f8/src/mds/SimpleLock.h: In function 'void SimpleLock::put_xlock()' thread 7f7ff22d0700 time 2023-06-08T04:35:31.762850+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4313-ga02638f8/rpm/el8/BUILD/ceph-18.0.0-4313-ga02638f8/src/mds/SimpleLock.h: 417: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())

 ceph version 18.0.0-4313-ga02638f8 (a02638f808b8fe11bccdd01222c0e6e2eb71120b) reef (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f8003b94dbf]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9f85) [0x7f8003b94f85]
 3: (SimpleLock::put_xlock()+0x332) [0x564f27a5d432]
 4: (Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x8c) [0x564f27a4534c]
 5: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x407) [0x564f27a4e187]
 6: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5f) [0x564f27a529ff]
 7: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x497) [0x564f278a4bd7]
 8: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x350) [0x564f278a5a80]
 9: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x2ea) [0x564f278c3c3a]
 10: (MDSContext::complete(int)+0x5f) [0x564f27b8c18f]
 11: (MDSIOContextBase::complete(int)+0x524) [0x564f27b8c914]
 12: (MDSLogContextBase::complete(int)+0x5b) [0x564f27b8cccb]
 13: (Finisher::finisher_thread_entry()+0x18d) [0x7f8003c387cd]
 14: /lib64/libpthread.so.0(+0x81cf) [0x7f800293d1cf]
 15: clone()

This is because put_xlock() currently does not allow releasing an xlock while the lock is in the LOCK_PREXLOCK state:

416   void put_xlock() {
417     ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE ||
418            state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK ||
419            state == LOCK_LOCK  || /* if we are a leader of a peer */
420            state == LOCK_PREXLOCK ||
421            is_locallock());
422     --more()->num_xlock;
423     parent->put(MDSCacheObject::PIN_LOCK);
424     if (more()->num_xlock == 0) {
425       more()->xlock_by.reset();
426       more()->xlock_by_client = -1;
427       try_clear_more();
428     }
429   }

IMO we should allow it here.
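
To make the interleaving easier to follow, below is a minimal, self-contained sketch of the sequence described above. ToyLock and its state names are simplified, hypothetical stand-ins, not the real SimpleLock, and its assertion only mirrors the shape of the one in put_xlock():

// Toy model of the race: two xlockers finish after the lock has already
// been parked in a prexlock-like state for a third, waiting xlocker.
#include <cassert>
#include <cstdio>

enum ToyState { SYNC, LOCK, XLOCK, XLOCKDONE, LOCK_XLOCK, PREXLOCK };

struct ToyLock {
  ToyState state = SYNC;
  int num_xlock = 0;

  void get_xlock()       { state = XLOCK; ++num_xlock; }  // xlock_start succeeded
  void set_xlocks_done() { state = XLOCKDONE; }           // early reply after journaling

  // Same shape as the failed assertion: PREXLOCK is not in the allowed set.
  void put_xlock() {
    assert(state == XLOCK || state == XLOCKDONE ||
           state == LOCK_XLOCK || state == LOCK);
    --num_xlock;
  }
};

int main() {
  ToyLock ilink;

  ilink.get_xlock();        // link request:         XLOCK,     x=1
  ilink.set_xlocks_done();  //                       XLOCKDONE, x=1
  ilink.get_xlock();        // first unlink request: XLOCK,     x=2
  ilink.set_xlocks_done();  //                       XLOCKDONE, x=2

  // Second unlink request cannot take the xlock yet (caps must be revoked),
  // so the lock heads back towards xlock: lock->xlock, then prexlock.
  ilink.state = LOCK_XLOCK;

  ilink.put_xlock();        // journal flush of the link request: x=1, allowed
  ilink.state = PREXLOCK;   // eval_gather: "next state is prexlock"
  ilink.put_xlock();        // journal flush of the first unlink: assertion fires

  std::printf("never reached, x=%d\n", ilink.num_xlock);
  return 0;
}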

Actions #19

Updated by Xiubo Li 10 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 52111
Actions #20

Updated by Xiubo Li 10 months ago

There is another, different case that also causes the same ceph_assert failure:

http://qa-proxy.ceph.com/teuthology/xiubli-2023-06-08_03:32:08-fs:functional-wip-lxb-fscrypt-20230607-0901-distro-default-smithi/7298110/

The ilink lock state was LOCK_SYNC instead:

    -9> 2023-06-08T04:18:14.166+0000 7f1b6a5ea700 10 mds.0.locker xlock_finish on (ilink sync x=1) [inode 0x1000000a2f3 [...2,head] ~mds0/stray0/1000000a2f3 auth v119629 ap=3 snaprealm=0x562bd7f7d440 DIRTYPARENT s=0 nl=4 n(v0 rc2023-06-08T04:18:14.142623+0000 1=1+0) (iauth excl) (ilink sync x=1) (ifile excl) (ixattr excl) (iversion lock) cr={6231=0-4194304@1} caps={6231=pAsxXsxFsxcrwb/pAsxXsxFxwb@14},l=6231 | ptrwaiter=0 request=2 lock=2 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x562bd769a580]
    -8> 2023-06-08T04:18:14.166+0000 7f1b735fc700  1 -- [v2:172.21.15.16:6832/3183010605,v1:172.21.15.16:6833/3183010605] <== osd.0 v2:172.21.15.16:6800/704029040 98092 ==== osd_op_reply(784588 1000000a146.00000172 [delete] v37'11475 uv11255 ondisk = -2 ((2) No such file or directory)) v8 ==== 164+0+0 (crc 0 0 0) 0x562bd73b4fc0 con 0x562bd7408000
    -7> 2023-06-08T04:18:14.166+0000 7f1b735fc700  1 -- [v2:172.21.15.16:6832/3183010605,v1:172.21.15.16:6833/3183010605] <== osd.0 v2:172.21.15.16:6800/704029040 98093 ==== osd_op_reply(784589 1000000a146.00000173 [delete] v37'11476 uv11255 ondisk = -2 ((2) No such file or directory)) v8 ==== 164+0+0 (crc 0 0 0) 0x562bd73b4fc0 con 0x562bd7408000
    -6> 2023-06-08T04:18:14.167+0000 7f1b6b5ec700  1 -- [v2:172.21.15.16:6832/3183010605,v1:172.21.15.16:6833/3183010605] --> [v2:172.21.15.16:6800/704029040,v1:172.21.15.16:6801/704029040] -- osd_op(unknown.0.12:784601 6.18 6:19beb7ff:::1000000a146.0000017e:head [delete] snapc 1=[] ondisk+write+known_if_redirected+full_force+supports_pool_eio e37) v8 -- 0x562bdac01500 con 0x562bd7408000
    -5> 2023-06-08T04:18:14.167+0000 7f1b6b5ec700  1 -- [v2:172.21.15.16:6832/3183010605,v1:172.21.15.16:6833/3183010605] --> [v2:172.21.15.169:6824/3479195367,v1:172.21.15.169:6825/3479195367] -- osd_op(unknown.0.12:784602 6.30 6:0eba4749:::1000000a146.0000017f:head [delete] snapc 1=[] ondisk+write+known_if_redirected+full_force+supports_pool_eio e37) v8 -- 0x562bd8495880 con 0x562bd73b6c00
    -4> 2023-06-08T04:18:14.167+0000 7f1b6b5ec700  1 -- [v2:172.21.15.16:6832/3183010605,v1:172.21.15.16:6833/3183010605] --> [v2:172.21.15.169:6808/2490026649,v1:172.21.15.169:6809/2490026649] -- osd_op(unknown.0.12:784603 6.2e 6:75701c74:::1000000a146.00000180:head [delete] snapc 1=[] ondisk+write+known_if_redirected+full_force+supports_pool_eio e37) v8 -- 0x562bda008380 con 0x562bd73b6800
    -3> 2023-06-08T04:18:14.168+0000 7f1b6a5ea700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4313-ga02638f8/rpm/el8/BUILD/ceph-18.0.0-4313-ga02638f8/src/mds/SimpleLock.h: In function 'void SimpleLock::put_xlock()' thread 7f1b6a5ea700 time 2023-06-08T04:18:14.167263+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-4313-ga02638f8/rpm/el8/BUILD/ceph-18.0.0-4313-ga02638f8/src/mds/SimpleLock.h: 417: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())

 ceph version 18.0.0-4313-ga02638f8 (a02638f808b8fe11bccdd01222c0e6e2eb71120b) reef (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x135) [0x7f1b80aa9dbf]
 2: /usr/lib64/ceph/libceph-common.so.2(+0x2a9f85) [0x7f1b80aa9f85]
 3: (SimpleLock::put_xlock()+0x332) [0x562bd4e49432]
 4: (Locker::xlock_finish(std::_Rb_tree_const_iterator<MutationImpl::LockOp> const&, MutationImpl*, bool*)+0x8c) [0x562bd4e3134c]
 5: (Locker::_drop_locks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*, bool)+0x407) [0x562bd4e3a187]
 6: (Locker::drop_non_rdlocks(MutationImpl*, std::set<CInode*, std::less<CInode*>, std::allocator<CInode*> >*)+0x5f) [0x562bd4e3e9ff]
 7: (Server::reply_client_request(boost::intrusive_ptr<MDRequestImpl>&, boost::intrusive_ptr<MClientReply> const&)+0x497) [0x562bd4c90bd7]
 8: (Server::respond_to_request(boost::intrusive_ptr<MDRequestImpl>&, int)+0x350) [0x562bd4c91a80]
 9: (Server::_unlink_local_finish(boost::intrusive_ptr<MDRequestImpl>&, CDentry*, CDentry*, unsigned long)+0x2ea) [0x562bd4cafc3a]
 10: (MDSContext::complete(int)+0x5f) [0x562bd4f7818f]
 11: (MDSIOContextBase::complete(int)+0x524) [0x562bd4f78914]
 12: (MDSLogContextBase::complete(int)+0x5b) [0x562bd4f78ccb]
 13: (Finisher::finisher_thread_entry()+0x18d) [0x7f1b80b4d7cd]
 14: /lib64/libpthread.so.0(+0x81ca) [0x7f1b7ee081ca]
 15: clone()

This can happen when dropping locks in some cases; here cancel_locking and simple_sync moved the ilink lock back to sync while an xlock was still held:

  -297> 2023-06-08T04:18:14.156+0000 7f1b705f6700  7 mds.0.server dispatch_client_request client_request(client.6231:307 unlink #0x10000000001/084.target.link.7.11 2023-06-08T04:18:14.152623+0000 caller_uid=0, caller_gid=0{0,}) v6
  -296> 2023-06-08T04:18:14.156+0000 7f1b705f6700 10 mds.0.server rdlock_path_xlock_dentry request(client.6231:307 nref=4 cr=0x562bdb5dcf00) #0x10000000001/084.target.link.7.11
  -295> 2023-06-08T04:18:14.156+0000 7f1b705f6700 20 mds.0.server wait_for_pending_reintegrate dn [dentry #0x1/scratch/084.target.link.7.11 [2,head] auth REMOTE(reg) (dn xlock x=1 by 0x562bda70c800) (dversion lock w=1 last_client=6231) v=119608 ap=2 ino=0x1000000a2f3 state=1610613120|unlinking|reintegrating | request=1 lock=2 purging=1 dirty=1 waiter=1 authpin=1 0x562bdb7fdb80]
  -294> 2023-06-08T04:18:14.156+0000 7f1b705f6700 10 mds.0.locker cancel_locking (ilink prexlock x=1) on request(client.6231:307 nref=4 cr=0x562bdb5dcf00)
  -293> 2023-06-08T04:18:14.156+0000 7f1b705f6700 10 mds.0.cache.ino(0x1000000a2f3) auth_unpin by 0x562bd769a838 on [inode 0x1000000a2f3 [...2,head] ~mds0/stray0/1000000a2f3 auth v119625 pv119629 ap=4 snaprealm=0x562bd7f7d440 DIRTYPARENT s=0 nl=5 n(v0 rc2023-06-08T04:18:14.135623+0000 1=1+0)->n(v0 rc2023-06-08T04:18:14.142623+0000 1=1+0) (iauth excl) (ilink excl x=1) (ifile excl w=1) (ixattr excl) (iversion lock w=1 last_client=6231) cr={6231=0-4194304@1} caps={6231=pAsxXsxFsxcrwb/pAsxXsxFxwb@14},l=6231 | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x562bd769a580] now 4
  -292> 2023-06-08T04:18:14.156+0000 7f1b705f6700 15 mds.0.cache.dir(0x10000000001) adjust_nested_auth_pins -1 on [dir 0x10000000001 /scratch/ [2,head] auth pv=119631 v=119627 cv=1/1 ap=4+8 state=1610874881|complete f(v0 m2023-06-08T04:18:14.135623+0000 5=5+0)->f(v0 m2023-06-08T04:18:14.142623+0000 4=4+0) n(v1 rc2023-06-08T04:18:14.135623+0000 1=1+0)->n(v1 rc2023-06-08T04:18:14.142623+0000) hs=5+124,ss=0+0 dirty=63 | child=1 dirty=1 authpin=1 0x562bd731a400] by 0x562bd769a838 count now 4/8
  -291> 2023-06-08T04:18:14.156+0000 7f1b705f6700 10 mds.0.locker simple_eval (ilink excl x=1) on [inode 0x1000000a2f3 [...2,head] ~mds0/stray0/1000000a2f3 auth v119625 pv119629 ap=4 snaprealm=0x562bd7f7d440 DIRTYPARENT s=0 nl=5 n(v0 rc2023-06-08T04:18:14.135623+0000 1=1+0)->n(v0 rc2023-06-08T04:18:14.142623+0000 1=1+0) (iauth excl) (ilink excl x=1) (ifile excl w=1) (ixattr excl) (iversion lock w=1 last_client=6231) cr={6231=0-4194304@1} caps={6231=pAsxXsxFsxcrwb/pAsxXsxFxwb@14},l=6231 | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x562bd769a580]
  -290> 2023-06-08T04:18:14.156+0000 7f1b705f6700  7 mds.0.locker simple_eval stable, syncing (ilink excl x=1) on [inode 0x1000000a2f3 [...2,head] ~mds0/stray0/1000000a2f3 auth v119625 pv119629 ap=4 snaprealm=0x562bd7f7d440 DIRTYPARENT s=0 nl=5 n(v0 rc2023-06-08T04:18:14.135623+0000 1=1+0)->n(v0 rc2023-06-08T04:18:14.142623+0000 1=1+0) (iauth excl) (ilink excl x=1) (ifile excl w=1) (ixattr excl) (iversion lock w=1 last_client=6231) cr={6231=0-4194304@1} caps={6231=pAsxXsxFsxcrwb/pAsxXsxFxwb@14},l=6231 | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x562bd769a580]
  -289> 2023-06-08T04:18:14.156+0000 7f1b705f6700  7 mds.0.locker simple_sync on (ilink excl x=1) on [inode 0x1000000a2f3 [...2,head] ~mds0/stray0/1000000a2f3 auth v119625 pv119629 ap=4 snaprealm=0x562bd7f7d440 DIRTYPARENT s=0 nl=5 n(v0 rc2023-06-08T04:18:14.135623+0000 1=1+0)->n(v0 rc2023-06-08T04:18:14.142623+0000 1=1+0) (iauth excl) (ilink excl x=1) (ifile excl w=1) (ixattr excl) (iversion lock w=1 last_client=6231) cr={6231=0-4194304@1} caps={6231=pAsxXsxFsxcrwb/pAsxXsxFxwb@14},l=6231 | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x562bd769a580]
  -288> 2023-06-08T04:18:14.156+0000 7f1b705f6700 10 mds.0.locker xlock_finish on (isnap xlock x=2 by 0x562bda70c800) [inode 0x1000000a2f3 [...2,head] ~mds0/stray0/1000000a2f3 auth v119625 pv119629 ap=4 snaprealm=0x562bd7f7d440 DIRTYPARENT s=0 nl=5 n(v0 rc2023-06-08T04:18:14.135623+0000 1=1+0)->n(v0 rc2023-06-08T04:18:14.142623+0000 1=1+0) (iauth excl) (ilink sync x=1) (ifile excl w=1) (ixattr excl) (iversion lock w=1 last_client=6231) cr={6231=0-4194304@1} caps={6231=pAsxXsxFsxcrwb/pAsxXsxFxwb@14},l=6231 | ptrwaiter=1 request=2 lock=5 caps=1 remoteparent=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x562bd769a580]
...

This should be fixed too.
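
For reference, a rough sketch of the direction these two comments argue for, i.e. letting put_xlock() tolerate the extra states seen above. This is only an illustration built on the snippet quoted in #18; the actual change for this ticket is tracked in PR 52111 and may be structured differently:

  void put_xlock() {
    ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE ||
           state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK ||
           state == LOCK_LOCK ||      /* if we are a leader of a peer */
           state == LOCK_PREXLOCK ||  /* case in #18: lock already heading back to xlock */
           state == LOCK_SYNC ||      /* case in #20: lock dropped to sync while x > 0 */
           is_locallock());
    --more()->num_xlock;
    parent->put(MDSCacheObject::PIN_LOCK);
    if (more()->num_xlock == 0) {
      more()->xlock_by.reset();
      more()->xlock_by_client = -1;
      try_clear_more();
    }
  }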

Actions #21

Updated by Patrick Donnelly 9 months ago

  • Has duplicate Bug #61967: mds: "SimpleLock.h: 417: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock())" added
Actions #22

Updated by Venky Shankar 8 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Target version set to v19.0.0
  • Source set to Development
  • Backport set to reef,quincy,pacific
Actions #23

Updated by Backport Bot 8 months ago

  • Copied to Backport #62522: quincy: src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()) added
Actions #24

Updated by Backport Bot 8 months ago

  • Copied to Backport #62523: pacific: src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()) added
Actions #25

Updated by Backport Bot 8 months ago

  • Copied to Backport #62524: reef: src/mds/SimpleLock.h: 528: FAILED ceph_assert(state == LOCK_XLOCK || state == LOCK_XLOCKDONE || state == LOCK_XLOCKSNAP || state == LOCK_LOCK_XLOCK || state == LOCK_LOCK || is_locallock()) added
Actions #26

Updated by Backport Bot 8 months ago

  • Tags set to backport_processed
Actions #27

Updated by Xiubo Li 5 months ago

  • Status changed from Pending Backport to Resolved