Bug #61818

mds: deadlock between unlink and linkmerge

Added by Xiubo Li 10 months ago. Updated 8 months ago.

Status:
Pending Backport
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
backport_processed
Backport:
pacific, quincy, reef
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

https://pulpito.ceph.com/xiubli-2023-06-26_02:38:43-fs:functional-wip-lxb-xlock-20230619-0716-distro-default-smithi/7316098/

When the unlink request is dispatched the first time, it sets the dentry's state to STATE_UNLINKING, then waits for the locks and goes to sleep.

Then a previous link request finishes: just after its journal log is flushed and the safe reply is sent out, it triggers a linkmerge by setting the reintegrating state on the dentry. The linkmerge request, which is an internal rename request, detects that the dentry has been set to STATE_UNLINKING, so the linkmerge waits for the unlink to finish.

The unlink request is then retried a second time, but it detects that the dentry is under linkmerge, so it waits for the linkmerge to finish.

Each request is now waiting for the other, which is the deadlock.
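
The cycle is easy to model in isolation. Below is a hedged, self-contained C++ sketch of the two waiters, with Dentry, the boolean flags, and the condition variable standing in for the real dentry state bits and the MDS request retry machinery (this illustrates the deadlock pattern; it is not the MDS implementation). Each side sets its own bit and then blocks until the other's bit clears, so both time out:

// Minimal standalone model of the deadlock (NOT the MDS code): two
// requests each publish a state bit on a shared dentry, then block
// until the other request's bit clears. Neither bit ever clears.
#include <chrono>
#include <condition_variable>
#include <iostream>
#include <mutex>
#include <thread>

struct Dentry {
  std::mutex m;
  std::condition_variable cv;
  bool unlinking = false;      // stand-in for CDentry STATE_UNLINKING
  bool reintegrating = false;  // stand-in for the reintegrating state bit
};

int main() {
  using namespace std::chrono_literals;
  Dentry dn;

  // The unlink request: marks the dentry, sleeps, then on retry finds
  // the dentry reintegrating and waits for the linkmerge to finish.
  std::thread unlink([&] {
    std::unique_lock lk(dn.m);
    dn.unlinking = true;  // first dispatch: set state, go to sleep on locks
    dn.cv.notify_all();
    dn.cv.wait(lk, [&] { return dn.reintegrating; });  // woken up for retry
    bool done = dn.cv.wait_for(lk, 2s, [&] { return !dn.reintegrating; });
    std::cout << (done ? "unlink proceeded\n"
                       : "unlink stuck waiting for linkmerge\n");
  });

  // The linkmerge (internal rename): triggered once the link request
  // finishes, it finds the dentry unlinking and waits for the unlink.
  std::thread linkmerge([&] {
    std::unique_lock lk(dn.m);
    dn.cv.wait(lk, [&] { return dn.unlinking; });
    dn.reintegrating = true;  // dentry is now unlinking|reintegrating
    dn.cv.notify_all();
    bool done = dn.cv.wait_for(lk, 2s, [&] { return !dn.unlinking; });
    std::cout << (done ? "linkmerge proceeded\n"
                       : "linkmerge stuck waiting for unlink\n");
  });

  unlink.join();
  linkmerge.join();
}

Run as-is, both threads report being stuck after the 2-second timeout; in the MDS there is no timeout, so both requests sleep forever.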


Related issues 6 (1 open, 5 closed)

Related to CephFS - Bug #62096: mds: infinite rename recursion on itself (Duplicate, Patrick Donnelly)

Related to CephFS - Bug #58340: mds: fsstress.sh hangs with multimds (deadlock between unlink and reintegrate straydn(rename)) (Resolved, Xiubo Li)

Related to CephFS - Bug #56695: [RHEL stock] pjd test failures(a bug that need to wait the unlink to finish) (Resolved, Xiubo Li)

Copied to CephFS - Backport #62858: quincy: mds: deadlock between unlink and linkmerge (In Progress, Xiubo Li)
Copied to CephFS - Backport #62859: pacific: mds: deadlock between unlink and linkmerge (Resolved, Xiubo Li)
Copied to CephFS - Backport #62860: reef: mds: deadlock between unlink and linkmerge (Resolved, Xiubo Li)
#1

Updated by Xiubo Li 10 months ago

  • Description updated (diff)
#2

Updated by Xiubo Li 10 months ago

Xiubo Li wrote:

https://pulpito.ceph.com/xiubli-2023-06-26_02:38:43-fs:functional-wip-lxb-xlock-20230619-0716-distro-default-smithi/7316098/

When the unlink request is dispatched the first time, it sets the dentry's state to STATE_UNLINKING, then waits for the locks and goes to sleep.

Then a previous link request finishes: just after its journal log is flushed and the safe reply is sent out, it triggers a linkmerge by setting the reintegrating state on the dentry. The linkmerge request, which is an internal rename request, detects that the dentry has been set to STATE_UNLINKING, so the linkmerge waits for the unlink to finish.

The unlink request is then retried a second time, but it detects that the dentry is under linkmerge, so it waits for the linkmerge to finish.

Each request is now waiting for the other, which is the deadlock.

The unlink request came in and was dispatched:

2023-06-26T03:06:41.257+0000 7faa7d764640  1 -- [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] <== client.5652 172.21.15.81:0/443081920 26701 ==== client_request(client.5652:13729 unlink #0x100000096da/f124 2023-06-26T03:06:41.257367+0000 caller_uid=0, caller_gid=0{0,}) v6 ==== 256+0+0 (crc 0 0 0) 0x558018a47800 con 0x55801add3400
...
2023-06-26T03:06:41.257+0000 7faa77758640 10 MDSIOContextBase::complete: 26C_Locker_FileUpdate_finish
2023-06-26T03:06:41.261+0000 7faa7d764640  4 mds.0.server handle_client_request client_request(client.5652:13729 unlink #0x100000096da/f124 2023-06-26T03:06:41.257367+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-26T03:06:41.261+0000 7faa7d764640 20 mds.0.11 get_session have 0x558016efa300 client.5652 172.21.15.81:0/443081920 state open
2023-06-26T03:06:41.261+0000 7faa7d764640 15 mds.0.server  oldest_client_tid=13703
2023-06-26T03:06:41.261+0000 7faa7d764640  7 mds.0.cache request_start request(client.5652:13729 nref=2 cr=0x558018a47800)
2023-06-26T03:06:41.261+0000 7faa7d764640 10 mds.0.locker process_request_cap_release client.5652 pAsXsFsxcrwb on [inode 0x10000009700 [2,head] ~mds0/stray3/10000009700 auth v2 pv78966 ap=5 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-26T03:06:41.242613+0000 1=1+0) (ilink xlockdone x=2) (ifile excl w=1) (iversion lock w=2 last_client=5652) cr={5652=0-4194304@1} caps={5652=pAsLsXsFsxcrwb/-@5},l=5652(-1) | ptrwaiter=1 request=2 lock=6 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x558019881080]
...
2023-06-26T03:06:41.261+0000 7faa7d764640  7 mds.0.server dispatch_client_request client_request(client.5652:13729 unlink #0x100000096da/f124 2023-06-26T03:06:41.257367+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-26T03:06:41.261+0000 7faa7d764640 10 mds.0.server rdlock_path_xlock_dentry request(client.5652:13729 nref=2 cr=0x558018a47800) #0x100000096da/f124
2023-06-26T03:06:41.261+0000 7faa7d764640  7 mds.0.cache traverse: opening base ino 0x100000096da snap head
2023-06-26T03:06:41.261+0000 7faa7d764640 10 mds.0.locker found lock cache for unlink on [inode 0x100000096da [...2,head] /fsstress.85886.3/p3/d0/d69/d96/def/de9/dd0/d10a/d109/ auth v78461 pv78467 ap=3 DIRTYPARENT f(v0 m2023-06-26T03:06:40.967583+0000 1=0+1)->f(v0 m2023-06-26T03:06:41.234705+0000 2=1+1) n(v0 rc2023-06-26T03:06:40.967583+0000 2=0+2)->n(v0 rc2023-06-26T03:06:41.234705+0000 2=0+2) (iauth sync r=1) (isnap sync r=3) (inest lock w=3) (ipolicy sync r=1) (ifile excl w=3) (iversion lock w=1 last_client=5652) caps={5652=pAsLsXsFsx/-@7},l=5652 | request=0 lock=6 dirfrag=1 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55801adbe100]
2023-06-26T03:06:41.261+0000 7faa7d764640 12 mds.0.cache traverse: path seg depth 0 'f124' snapid head
2023-06-26T03:06:41.261+0000 7faa7d764640 20 mds.0.cache.dir(0x100000096da) lookup (f124, 'head')
2023-06-26T03:06:41.261+0000 7faa7d764640 20 mds.0.cache.dir(0x100000096da)   hit -> (f124,head)
...

But it then went to sleep while acquiring the locks, having already set the STATE_UNLINKING state on dentry #0x100000096da/f124:

2023-06-26T03:06:41.261+0000 7faa7d764640  7 mds.0.locker xlock_start on (ilink xlockdone x=2) on [inode 0x10000009700 [2,head] ~mds0/stray3/10000009700 auth v2 pv78966 ap=6 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-26T03:06:41.242613+0000 1=1+0) (ilink xlockdone x=2) (ifile excl w=1) (iversion lock w=2 last_client=5652) cr={5652=0-4194304@1} caps={5652=pAsLsXsFsxcrwb/-@5},l=5652(-1) | ptrwaiter=1 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x558019881080]
2023-06-26T03:06:41.261+0000 7faa7d764640  7 mds.0.locker simple_xlock on (ilink xlockdone x=2) on [inode 0x10000009700 [2,head] ~mds0/stray3/10000009700 auth v2 pv78966 ap=6 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-26T03:06:41.242613+0000 1=1+0) (ilink xlockdone x=2) (ifile excl w=1) (iversion lock w=2 last_client=5652) cr={5652=0-4194304@1} caps={5652=pAsLsXsFsxcrwb/-@5},l=5652(-1) | ptrwaiter=1 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x558019881080]
2023-06-26T03:06:41.261+0000 7faa7d764640  7 mds.0.locker get_allowed_caps loner client.5652 allowed=pAsXsFsxcrwb, xlocker allowed=pAsXsFsxcrwb, others allowed=pAsXs on [inode 0x10000009700 [2,head] ~mds0/stray3/10000009700 auth v2 pv78966 ap=6 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-26T03:06:41.242613+0000 1=1+0) (ilink lock->xlock x=2) (ifile excl w=1) (iversion lock w=2 last_client=5652) cr={5652=0-4194304@1} caps={5652=pAsLsXsFsxcrwb/-@5},l=5652(-1) | ptrwaiter=1 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x558019881080]
2023-06-26T03:06:41.261+0000 7faa7d764640 20 mds.0.locker  client.5652 pending pAsLsXsFsxcrwb allowed pAsXsFsxcrwb wanted -
2023-06-26T03:06:41.261+0000 7faa7d764640  7 mds.0.locker    sending MClientCaps to client.5652 seq 6 new pending pAsXsFsxcrwb was pAsLsXsFsxcrwb
2023-06-26T03:06:41.261+0000 7faa7d764640 20 mds.0.cache.ino(0x10000009700) encode_cap_message pfile 1 pauth 0 plink 1 pxattr 0 mtime 2023-06-26T03:06:41.225116+0000 ctime 2023-06-26T03:06:41.242613+0000 change_attr 2
2023-06-26T03:06:41.261+0000 7faa7d764640 10 mds.0.11 send_message_client_counted client.5652 seq 17064 client_caps(revoke ino 0x10000009700 1 seq 6 caps=pAsXsFsxcrwb dirty=- wanted=- follows 0 size 0/0 ts 1/18446744073709551615 mtime 2023-06-26T03:06:41.225116+0000 ctime 2023-06-26T03:06:41.242613+0000 change_attr 2) v12
2023-06-26T03:06:41.261+0000 7faa7d764640  1 -- [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] --> 172.21.15.81:0/443081920 -- client_caps(revoke ino 0x10000009700 1 seq 6 caps=pAsXsFsxcrwb dirty=- wanted=- follows 0 size 0/0 ts 1/18446744073709551615 mtime 2023-06-26T03:06:41.225116+0000 ctime 2023-06-26T03:06:41.242613+0000 change_attr 2) v12 -- 0x558019dd7c00 con 0x55801add3400
2023-06-26T03:06:41.261+0000 7faa7d764640 10 mds.0.cache.ino(0x10000009700) add_waiter tag 2000000000600000 0x55801b3086a0 !ambig 1 !frozen 1 !freezing 1
2023-06-26T03:06:41.261+0000 7faa7d764640 15 mds.0.cache.ino(0x10000009700) taking waiter here
2023-06-26T03:06:41.261+0000 7faa7d764640 10 mds.0.locker nudge_log (ilink lock->xlock x=2) on [inode 0x10000009700 [2,head] ~mds0/stray3/10000009700 auth v2 pv78966 ap=6 DIRTYPARENT s=0 n(v0 1=1+0)->n(v0 rc2023-06-26T03:06:41.242613+0000 1=1+0) (ilink lock->xlock x=2) (ifile excl w=1) (iversion lock w=2 last_client=5652) cr={5652=0-4194304@1} caps={5652=pAsXsFsxcrwb/pAsLsXsFsxcrwb/-@6},l=5652(-1) | ptrwaiter=1 request=3 lock=7 caps=1 dirtyparent=1 dirty=1 waiter=1 authpin=1 0x558019881080]
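
The add_waiter / "taking waiter here" lines show the generic mechanism by which a blocked request goes to sleep: it queues a retry callback on the object it is blocked on, and a later state change re-dispatches it (the "MDSContext::complete: ... C_MDS_RetryRequest" line further down is that callback firing). A minimal stand-in for this pattern, with stub types rather than the real MDSCacheObject API:

#include <functional>
#include <iostream>
#include <utility>
#include <vector>

// Stub of the MDS waiter-list pattern: a blocked request queues a
// retry callback on the object; when the object's state changes,
// the waiters are taken and re-dispatched.
struct CacheObject {
  std::vector<std::function<void()>> waiters;

  void add_waiter(std::function<void()> retry) {
    waiters.push_back(std::move(retry));  // request goes to sleep here
  }
  void finish_waiters() {
    auto taken = std::move(waiters);
    waiters.clear();
    for (auto& retry : taken)
      retry();  // re-dispatch each previously blocked request
  }
};

int main() {
  CacheObject ino;
  ino.add_waiter([] { std::cout << "retrying unlink request\n"; });
  // ... the lock state changes ...
  ino.finish_waiters();
}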

Then the previous link request's journal log was successfully flushed, the safe reply was sent out to the client, and the linkmerge was triggered:

2023-06-26T03:06:41.261+0000 7faa77758640 20 mds.0.bal hit_dir 1 pop is 2.99945, frag * size 0 [pop IRD:[C 0.00e+00] IWR:[C 3.00e+00] RDR:[C 0.00e+00] FET:[C 0.00e+00] STR:[C 0.00e+00] *LOAD:6.0]
2023-06-26T03:06:41.261+0000 7faa77758640 20 mds.0.bal hit_dir 1 pop is 2.9196, frag * size 2 [pop IRD:[C 0.00e+00] IWR:[C 2.92e+00] RDR:[C 0.00e+00] FET:[C 0.00e+00] STR:[C 0.00e+00] *LOAD:5.8]
2023-06-26T03:06:41.261+0000 7faa77758640  7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.5652:13708 link #0x100000096da/f124 #0x10000009700 2023-06-26T03:06:41.234705+0000 caller_uid=0, caller_gid=0{0,}) v6
...
2023-06-26T03:06:41.261+0000 7faa77758640 10 mds.0.11 send_message_client client.5652 172.21.15.81:0/443081920 client_reply(???:13708 = 0 (0) Success safe) v1
2023-06-26T03:06:41.261+0000 7faa77758640  1 -- [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] --> 172.21.15.81:0/443081920 -- client_reply(???:13708 = 0 (0) Success safe) v1 -- 0x5580183be700 con 0x55801add3400
2023-06-26T03:06:41.261+0000 7faa77758640  7 mds.0.cache request_finish request(client.5652:13708 nref=2 cr=0x55801c5a7b00)
2023-06-26T03:06:41.261+0000 7faa77758640 15 mds.0.cache request_cleanup request(client.5652:13708 nref=2 cr=0x55801c5a7b00)
...
2023-06-26T03:06:41.261+0000 7faa77758640 10 mds.0.cache.strays eval_remote [dentry #0x1/fsstress.85886.3/p3/d0/d69/d96/def/de9/dd0/d10a/d109/f124 [2,head] auth REMOTE(reg) (dn xlock x=1 by 0x558018667800) (dversion lock w=1 last_client=5652) v=8 ap=2 ino=0x10000009700 state=1610612864|unlinking | request=1 lock=2 dirty=1 authpin=1 0x5580194a7680]
2023-06-26T03:06:41.261+0000 7faa77758640 20 mds.0.cache.strays _eval_stray_remote [dentry #0x100/stray3/10000009700 [2,head] auth NULL (dn xlockdone x=1) (dversion lock w=1 last_client=5652) pv=78966 v=78965 ap=2 ino=(nil) state=1342177281|new | request=1 lock=2 authpin=1 0x55801a48b900]
2023-06-26T03:06:41.261+0000 7faa77758640 10 mds.0.cache.strays reintegrate_stray [dentry #0x100/stray3/10000009700 [2,head] auth NULL (dn xlockdone x=1) (dversion lock w=1 last_client=5652) pv=78966 v=78965 ap=2 ino=(nil) state=1342177281|new | request=1 lock=2 authpin=1 0x55801a48b900] to [dentry #0x1/fsstress.85886.3/p3/d0/d69/d96/def/de9/dd0/d10a/d109/f124 [2,head] auth REMOTE(reg) (dn xlock x=1 by 0x558018667800) (dversion lock w=1 last_client=5652) v=8 ap=2 ino=0x10000009700 state=1610612864|unlinking | request=1 lock=2 dirty=1 authpin=1 0x5580194a7680]
2023-06-26T03:06:41.261+0000 7faa77758640  1 -- [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] send_to--> mds [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] -- client_request(unknown.0:166 rename #0x100000096da/f124 #0x603/10000009700 caller_uid=0, caller_gid=0{}) v6 -- ?+0 0x55801b0b2900
2023-06-26T03:06:41.261+0000 7faa77758640  1 -- [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] --> [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] -- client_request(unknown.0:166 rename #0x100000096da/f124 #0x603/10000009700 caller_uid=0, caller_gid=0{}) v6 -- 0x55801b0b2900 con 0x558016edbc00
...
2023-06-26T03:06:41.261+0000 7faa7d764640  1 -- [v2:172.21.15.81:6836/1044602586,v1:172.21.15.81:6837/1044602586] <== client.5652 172.21.15.81:0/443081920 26704 ==== client_request(client.5652:13731 rename #0x10000009641/dea #0x10000009670/dc7 2023-06-26T03:06:41.262717+0000 caller_uid=0, caller_gid=0{0,}) v6 ==== 302+0+0 (crc 0 0 0) 0x55801c9f7200 con 0x55801add3400
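
One detail that helps when reading the internal rename line above: in MDS rename requests the first path is the destination and the second is the source, so this linkmerge renames the stray dentry #0x603/10000009700 onto the remote link #0x100000096da/f124. A tiny sketch of just that request shape (an illustrative struct, not the real MClientRequest API):

#include <iostream>
#include <string>

// Illustrative only: models the shape of the internal rename seen in
// the log above. filepath is the destination dentry and filepath2 the
// source, matching how the MDS interprets rename requests.
struct InternalRenameReq {
  std::string filepath;   // destination: the remote link being merged onto
  std::string filepath2;  // source: the stray (primary) dentry
};

int main() {
  // values copied from the client_request(... rename ...) log line above
  InternalRenameReq linkmerge{"#0x100000096da/f124", "#0x603/10000009700"};
  std::cout << "linkmerge = rename " << linkmerge.filepath
            << " <- " << linkmerge.filepath2 << "\n";
}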

Then the unlink request was retried a second time, but went to sleep waiting for the linkmerge to finish:

2023-06-26T03:06:41.265+0000 7faa7d764640  7 mds.0.server dispatch_client_request client_request(client.5652:13729 unlink #0x100000096da/f124 2023-06-26T03:06:41.257367+0000 caller_uid=0, caller_gid=0{0,}) v6
2023-06-26T03:06:41.265+0000 7faa7d764640 10 mds.0.server rdlock_path_xlock_dentry request(client.5652:13729 nref=4 cr=0x558018a47800) #0x100000096da/f124
2023-06-26T03:06:41.265+0000 7faa7d764640 20 mds.0.server wait_for_pending_reintegrate dn [dentry #0x1/fsstress.85886.3/p3/d0/d69/d96/def/de9/dd0/d10a/d109/f124 [2,head] auth REMOTE(reg) (dn xlock x=1 by 0x558018667800) (dversion lock w=1 last_client=5652) v=8 ap=2 ino=0x10000009700 state=1610613120|unlinking|reintegrating | request=1 lock=2 purging=1 dirty=1 waiter=1 authpin=1 0x5580194a7680]
...
2023-06-26T03:06:41.273+0000 7faa7d764640 10 MDSContext::complete: 18C_MDS_RetryRequest
2023-06-26T03:06:41.273+0000 7faa7d764640  7 mds.0.server dispatch_client_request client_request(mds.0:166 rename #0x100000096da/f124 #0x603/10000009700 caller_uid=0, caller_gid=0{}) v6
2023-06-26T03:06:41.273+0000 7faa7d764640  7 mds.0.server handle_client_rename client_request(mds.0:166 rename #0x100000096da/f124 #0x603/10000009700 caller_uid=0, caller_gid=0{}) v6
2023-06-26T03:06:41.273+0000 7faa7d764640 10 mds.0.server rdlock_two_paths_xlock_destdn request(mds.0:166 nref=2 cr=0x55801b0b2900) #0x100000096da/f124 #0x603/10000009700
2023-06-26T03:06:41.273+0000 7faa7d764640  7 mds.0.cache traverse: opening base ino 0x100000096da snap head
2023-06-26T03:06:41.273+0000 7faa7d764640 12 mds.0.cache traverse: path seg depth 0 'f124' snapid head
2023-06-26T03:06:41.273+0000 7faa7d764640 20 mds.0.cache.dir(0x100000096da) lookup (f124, 'head')
2023-06-26T03:06:41.273+0000 7faa7d764640 20 mds.0.cache.dir(0x100000096da)   hit -> (f124,head)
...
2023-06-26T03:06:41.273+0000 7faa7d764640 20 mds.0.server wait_for_pending_unlink dn [dentry #0x1/fsstress.85886.3/p3/d0/d69/d96/def/de9/dd0/d10a/d109/f124 [2,head] auth REMOTE(reg) (dn xlock x=1 by 0x5580185fe000) (dversion lock w=1) v=8 ap=3 ino=0x10000009700 state=1610613120|unlinking|reintegrating | request=2 lock=2 purging=3 dirty=1 waiter=1 authpin=1 0x5580194a7680]
...

The unlink request and the linkmerge were now deadlocked, each waiting for the other:

2023-06-26T03:09:44.332+0000 7faa7bf61640  0 log_channel(cluster) log [WRN] : 3 slow requests, 3 included below; oldest blocked for > 183.079131 secs
2023-06-26T03:09:44.332+0000 7faa7bf61640  0 log_channel(cluster) log [WRN] : slow request 183.072394 seconds old, received at 2023-06-26T03:06:41.264179+0000: client_request(mds.0:166 rename #0x100000096da/f124 #0x603/10000009700 caller_uid=0, caller_gid=0{}) currently acquired locks
2023-06-26T03:09:44.332+0000 7faa7bf61640  0 log_channel(cluster) log [WRN] : slow request 183.062359 seconds old, received at 2023-06-26T03:06:41.274214+0000: client_request(mds.0:167 rename #0x100000096da/f124 #0x603/10000009700 caller_uid=0, caller_gid=0{}) currently acquired locks
2023-06-26T03:09:44.332+0000 7faa7bf61640  0 log_channel(cluster) log [WRN] : slow request 183.079130 seconds old, received at 2023-06-26T03:06:41.257443+0000: client_request(client.5652:13729 unlink #0x100000096da/f124 2023-06-26T03:06:41.257367+0000 caller_uid=0, caller_gid=0{0,}) currently failed to xlock, waiting
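
Putting the two halves together: the circular wait lives in the two dispatch-side checks whose names appear in the logs (wait_for_pending_reintegrate on the unlink path in rdlock_path_xlock_dentry, wait_for_pending_unlink on the internal-rename path in rdlock_two_paths_xlock_destdn). A simplified, self-contained replay of that logic follows; the types are stubs, and only the function names and state bits are taken from the logs above:

#include <cstdio>

struct Dentry {
  bool unlinking = false;      // CDentry STATE_UNLINKING
  bool reintegrating = false;  // the "reintegrating" bit in the dumps above
};

// unlink retry (cf. rdlock_path_xlock_dentry + wait_for_pending_reintegrate)
bool dispatch_unlink(Dentry& dn) {
  if (dn.reintegrating) {
    std::puts("unlink: wait_for_pending_reintegrate -> C_MDS_RetryRequest");
    return false;  // requeued; only runs again when reintegrating clears
  }
  dn.unlinking = true;
  return true;
}

// linkmerge (cf. rdlock_two_paths_xlock_destdn + wait_for_pending_unlink)
bool dispatch_linkmerge(Dentry& dn) {
  if (dn.unlinking) {
    std::puts("linkmerge: wait_for_pending_unlink -> C_MDS_RetryRequest");
    return false;  // requeued; only runs again when unlinking clears
  }
  dn.reintegrating = true;
  return true;
}

int main() {
  Dentry dn;
  dispatch_unlink(dn);      // 1st dispatch: sets unlinking, then sleeps on locks
  dn.reintegrating = true;  // finished link request triggers the linkmerge
  dispatch_unlink(dn);      // retry: sees reintegrating, waits for linkmerge
  dispatch_linkmerge(dn);   // sees unlinking, waits for unlink -> deadlock
}
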
#3

Updated by Xiubo Li 10 months ago

  • Status changed from New to Fix Under Review
  • Priority changed from Normal to Urgent
  • Pull request ID set to 52199
#4

Updated by Patrick Donnelly 9 months ago

  • Related to Bug #62096: mds: infinite rename recursion on itself added
#5

Updated by Patrick Donnelly 8 months ago

  • Related to Bug #58340: mds: fsstress.sh hangs with multimds (deadlock between unlink and reintegrate straydn(rename)) added
#6

Updated by Patrick Donnelly 8 months ago

  • Related to Bug #56695: [RHEL stock] pjd test failures(a bug that need to wait the unlink to finish) added
#7

Updated by Xiubo Li 8 months ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to pacific, quincy, reef
#8

Updated by Backport Bot 8 months ago

  • Copied to Backport #62858: quincy: mds: deadlock between unlink and linkmerge added
#9

Updated by Backport Bot 8 months ago

  • Copied to Backport #62859: pacific: mds: deadlock between unlink and linkmerge added
#10

Updated by Backport Bot 8 months ago

  • Copied to Backport #62860: reef: mds: deadlock between unlink and linkmerge added
#11

Updated by Backport Bot 8 months ago

  • Tags set to backport_processed