Bug #58340

closed

mds: fsstress.sh hangs with multimds (deadlock between unlink and reintegrate straydn(rename))

Added by Venky Shankar over 1 year ago. Updated 9 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Q/A
Tags: backport_processed
Backport: reef,pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDS
Labels (FS): multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/vshankar-2022-12-21_14:01:01-fs-wip-vshankar-testing-20221215.112736-testing-default-smithi/7123518

The MDS appears to be stuck in a lock state transition and is no longer making progress on client requests. The last log entries from mds.1 before the hang:

2022-12-21T15:08:32.659+0000 1950e700  7 mds.1.server reply_client_request 0 ((0) Success) client_request(client.4536:15443 unlink #0x40000000206/c24 2022-12-21T15:08:32.520905+0000 caller_uid=1000, caller_gid=1265{6,36,1000,1265,}) v6
2022-12-21T15:08:32.659+0000 1950e700 10 mds.1.server apply_allocated_inos 0x0 / [] / 0x0
2022-12-21T15:08:32.659+0000 1950e700  7 mds.1.locker local_wrlock_finish  on (iversion lock w=1 last_client=4536) on [inode 0x101 [...2,head] ~mds1/ auth v21 ap=1 snaprealm=0xf46a930 DIRTYPARENT f(v0 10=0+10) n(v5 rc2022-12-21T15:08:32.520905+0000 b4194304 14=3+11)/n(v0 rc2022-12-21T14:44:28.849089+0000 11=0+11) (inest lock w=1) (iversion lock w=1 last_client=4536) | dirtyscattered=0 lock=2 dirfrag=1 dirtyparent=1 dirty=1 authpin=1 0xf46a0e0]
2022-12-21T15:08:32.659+0000 1950e700  7 mds.1.locker wrlock_finish on (inest lock w=1) on [inode 0x101 [...2,head] ~mds1/ auth v21 ap=1 snaprealm=0xf46a930 DIRTYPARENT f(v0 10=0+10) n(v5 rc2022-12-21T15:08:32.520905+0000 b4194304 14=3+11)/n(v0 rc2022-12-21T14:44:28.849089+0000 11=0+11) (inest lock w=1) (iversion lock) | dirtyscattered=0 lock=1 dirfrag=1 dirtyparent=1 dirty=1 authpin=1 0xf46a0e0]
2022-12-21T15:08:32.659+0000 1950e700 10 mds.1.locker scatter_eval (inest lock) on [inode 0x101 [...2,head] ~mds1/ auth v21 ap=1 snaprealm=0xf46a930 DIRTYPARENT f(v0 10=0+10) n(v5 rc2022-12-21T15:08:32.520905+0000 b4194304 14=3+11)/n(v0 rc2022-12-21T14:44:28.849089+0000 11=0+11) (inest lock) (iversion lock) | dirtyscattered=0 lock=0 dirfrag=1 dirtyparent=1 dirty=1 authpin=1 0xf46a0e0]
2022-12-21T15:08:32.660+0000 1950e700  7 mds.1.locker local_wrlock_finish  on (iversion lock w=1 last_client=4536) on [inode 0x60b [...2,head] ~mds1/stray1/ auth v56 ap=1 DIRTYPARENT f(v0 m2022-12-21T15:08:32.520905+0000 2=2+0) n(v0 rc2022-12-21T15:08:32.520905+0000 b4194304 3=2+1) (inest lock w=1) (ifile lock w=1) (iversion lock w=1 last_client=4536) | lock=3 dirfrag=1 stickydirs=1 stray=1 dirtyparent=1 dirty=1 authpin=1 0xf475be0]
2022-12-21T15:08:32.660+0000 1950e700  7 mds.1.locker wrlock_finish on (ifile lock w=1) on [inode 0x60b [...2,head] ~mds1/stray1/ auth v56 ap=1 DIRTYPARENT f(v0 m2022-12-21T15:08:32.520905+0000 2=2+0) n(v0 rc2022-12-21T15:08:32.520905+0000 b4194304 3=2+1) (inest lock w=1) (ifile lock w=1) (iversion lock) | lock=2 dirfrag=1 stickydirs=1 stray=1 dirtyparent=1 dirty=1 authpin=1 0xf475be0]
2022-12-21T15:08:32.660+0000 1950e700  7 mds.1.locker file_eval wanted= loner_wanted= other_wanted=  filelock=(ifile lock) on [inode 0x60b [...2,head] ~mds1/stray1/ auth v56 ap=1 DIRTYPARENT f(v0 m2022-12-21T15:08:32.520905+0000 2=2+0) n(v0 rc2022-12-21T15:08:32.520905+0000 b4194304 3=2+1) (inest lock w=1) (ifile lock) (iversion lock) | lock=1 dirfrag=1 stickydirs=1 stray=1 dirtyparent=1 dirty=1 authpin=1 0xf475be0]
2022-12-21T15:08:32.660+0000 1950e700  7 mds.1.locker file_eval stable, bump to sync (ifile lock) on [inode 0x60b [...2,head] ~mds1/stray1/ auth v56 ap=1 DIRTYPARENT f(v0 m2022-12-21T15:08:32.520905+0000 2=2+0) n(v0 rc2022-12-21T15:08:32.520905+0000 b4194304 3=2+1) (inest lock w=1) (ifile lock) (iversion lock) | lock=1 dirfrag=1 stickydirs=1 stray=1 dirtyparent=1 dirty=1 authpin=1 0xf475be0]
2022-12-21T15:08:32.660+0000 1950e700  7 mds.1.locker simple_sync on (ifile lock) on [inode 0x60b [...2,head] ~mds1/stray1/ auth v56 ap=1 DIRTYPARENT f(v0 m2022-12-21T15:08:32.520905+0000 2=2+0) n(v0 rc2022-12-21T15:08:32.520905+0000 b4194304 3=2+1) (inest lock w=1) (ifile lock) (iversion lock) | lock=1 dirfrag=1 stickydirs=1 stray=1 dirtyparent=1 dirty=1 authpin=1 0xf475be0]
2022-12-21T15:08:32.661+0000 1950e700  7 mds.1.locker wrlock_finish on (inest lock w=1) on [inode 0x60b [...2,head] ~mds1/stray1/ auth v56 ap=1 DIRTYPARENT f(v0 m2022-12-21T15:08:32.520905+0000 2=2+0) n(v0 rc2022-12-21T15:08:32.520905+0000 b4194304 3=2+1) (inest lock w=1) (iversion lock) | lock=1 dirfrag=1 stickydirs=1 stray=1 dirtyparent=1 dirty=1 authpin=1 0xf475be0]
2022-12-21T15:08:32.661+0000 1950e700 10 mds.1.locker scatter_eval (inest lock) on [inode 0x60b [...2,head] ~mds1/stray1/ auth v56 ap=1 DIRTYPARENT f(v0 m2022-12-21T15:08:32.520905+0000 2=2+0) n(v0 rc2022-12-21T15:08:32.520905+0000 b4194304 3=2+1) (inest lock) (iversion lock) | lock=0 dirfrag=1 stickydirs=1 stray=1 dirtyparent=1 dirty=1 authpin=1 0xf475be0]
2022-12-21T15:08:32.661+0000 1950e700 10 mds.1.locker xlock_finish on (dn xlockdone l x=1) [dentry #0x1/client.0/tmp/tmp.UQxwie74AM/p5/d0/d2/d14/d37/c24 [2,head] auth NULL (dn xlockdone l x=1) (dversion lock w=1 last_client=4536) v=350 ap=2 ino=(nil) state=1610612736 | request=1 lock=2 inodepin=0 replicated=0 dirty=1 authpin=1 clientlease=1 0x1acdffd0]
2022-12-21T15:08:32.661+0000 1950e700 10 mds.1.locker eval_gather (dn xlockdone l) on [dentry #0x1/client.0/tmp/tmp.UQxwie74AM/p5/d0/d2/d14/d37/c24 [2,head] auth NULL (dn xlockdone l) (dversion lock w=1 last_client=4536) v=350 ap=2 ino=(nil) state=1610612736 | request=1 lock=1 inodepin=0 replicated=0 dirty=1 authpin=1 clientlease=1 0x1acdffd0]
2022-12-21T15:08:32.661+0000 1950e700  7 mds.1.locker eval_gather finished gather on (dn xlockdone l) on [dentry #0x1/client.0/tmp/tmp.UQxwie74AM/p5/d0/d2/d14/d37/c24 [2,head] auth NULL (dn xlockdone l) (dversion lock w=1 last_client=4536) v=350 ap=2 ino=(nil) state=1610612736 | request=1 lock=1 inodepin=0 replicated=0 dirty=1 authpin=1 clientlease=1 0x1acdffd0]
2022-12-21T15:08:32.662+0000 1950e700 10 mds.1.cache.den(0x40000000206 c24) auth_unpin by 0x1ace00b0 on [dentry #0x1/client.0/tmp/tmp.UQxwie74AM/p5/d0/d2/d14/d37/c24 [2,head] auth NULL (dn sync l) (dversion lock w=1 last_client=4536) v=350 ap=1 ino=(nil) state=1610612736 | request=1 lock=1 inodepin=0 replicated=0 dirty=1 authpin=1 clientlease=1 0x1acdffd0] now 1
2022-12-21T15:08:32.662+0000 1950e700 15 mds.1.cache.dir(0x40000000206) adjust_nested_auth_pins -1 on [dir 0x40000000206 /client.0/tmp/tmp.UQxwie74AM/p5/d0/d2/d14/d37/ [2,head] auth{0=1,3=1} v=351 cv=0/0 dir_auth=1 ap=1+1 state=1610874881|complete f(v0 m2022-12-21T15:08:32.520905+0000 11=8+3)/f(v0 m2022-12-21T15:08:10.789215+0000 12=9+3) n(v12 rc2022-12-21T15:08:32.520905+0000 b31204703 35=26+9)/n(v12 rc2022-12-21T15:08:23.108040+0000 b30920189 35=27+8) hs=11+4,ss=0+0 dirty=15 | ptrwaiter=0 child=1 frozen=0 subtree=1 importing=0 replicated=1 dirty=1 authpin=1 0x1cd27040] by 0x1ace00b0 count now 1/1
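
To narrow down which client requests are stalled in a log like the one above, one option is to pair each request's dispatch line with its reply line and list whatever is left over. The following is only a rough sketch: the `reply_client_request` line format is taken from the excerpt above, while the `handle_client_request` dispatch marker and the helper name `unanswered_requests` are assumptions made for illustration and may need adjusting to the actual log level and MDS version.

```python
import re

# Request id and op as they appear in the excerpt, e.g.
# "client_request(client.4536:15443 unlink ...".
REQ_RE = re.compile(r"client_request\((client\.\d+:\d+) (\w+)")


def unanswered_requests(lines,
                        start_marker="handle_client_request",  # assumed dispatch marker
                        reply_marker="reply_client_request"):  # seen in the excerpt
    """Return {request_id: op} for requests that were dispatched but never replied to."""
    pending = {}
    for line in lines:
        m = REQ_RE.search(line)
        if not m:
            continue
        req_id, op = m.groups()
        if reply_marker in line:
            # A reply was logged, so this request is no longer pending.
            pending.pop(req_id, None)
        elif start_marker in line:
            # First sighting of the request; remember which op it carries.
            pending.setdefault(req_id, op)
    return pending
```

Passing an open log file (for example `unanswered_requests(open(path))`, with `path` pointing at the MDS log) would return the requests that never received a reply; in a hang like this one, those are the candidates to trace through the locker messages.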

Related issues 7 (1 open, 6 closed)

Related to CephFS - Bug #59134: mds: deadlock during unlink with multimds (postgres) (Status: Duplicate)
Related to CephFS - Bug #62096: mds: infinite rename recursion on itself (Status: Duplicate, Assignee: Patrick Donnelly)
Related to CephFS - Bug #61818: mds: deadlock between unlink and linkmerge (Status: Pending Backport, Assignee: Xiubo Li)
Related to CephFS - Bug #56695: [RHEL stock] pjd test failures(a bug that need to wait the unlink to finish) (Status: Resolved, Assignee: Xiubo Li)
Copied to CephFS - Backport #61346: pacific: mds: fsstress.sh hangs with multimds (deadlock between unlink and reintegrate straydn(rename)) (Status: Resolved, Assignee: Xiubo Li)
Copied to CephFS - Backport #61347: reef: mds: fsstress.sh hangs with multimds (deadlock between unlink and reintegrate straydn(rename)) (Status: Resolved, Assignee: Xiubo Li)
Copied to CephFS - Backport #61348: quincy: mds: fsstress.sh hangs with multimds (deadlock between unlink and reintegrate straydn(rename)) (Status: Resolved, Assignee: Xiubo Li)
