Project

General

Profile

Actions

Bug #41759

closed

mgr/volumes: test_async_subvolume_rm fails since purge threads did not cleanup trash directory

Added by Venky Shankar over 4 years ago. Updated over 4 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
Category:
Correctness/Safety
Target version:
% Done:

0%

Source:
Tags:
Backport:
nautilus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
mgr/volumes
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Patrick saw this recently here: http://qa-proxy.ceph.com/teuthology/pdonnell-2019-09-11_00:33:51-fs-wip-pdonnell-testing-20190910.212946-distro-basic-smithi/4297026/teuthology.log

From MDS log:

2019-09-11T03:06:23.745+0000 7fd8a3f61700  7 mds.0.server reply_client_request 0 ((0) Success) client_request(client.4673:7 rename #0x10000000013/aa40b188-8269-40fd-a029-0909c6ba664a #0x10000000001/subvolume_2815 2019-09-11T03:06:23.740171+0000 caller_uid=0, caller_gid=0{}) v4
2019-09-11T03:06:23.749+0000 7fd8a3f61700  7 mds.0.locker local_wrlock_finish  on (dversion lock w=1 last_client=4673) on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dn xlock x=1 by 0x55bffd4f6c00) (dversion lock w=1 last_client=4673) v=4129 ap=2 ino=0x10000000002 state=1610612737|new | request=1 lock=2 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker xlock_finish on (dn xlock x=1 by 0x55bffd4f6c00) [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dn xlock x=1 by 0x55bffd4f6c00) (dversion lock) v=4129 ap=2 ino=0x10000000002 state=1610612737|new | request=1 lock=1 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker eval_gather (dn xlock) on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dn xlock) (dversion lock) v=4129 ap=2 ino=0x10000000002 state=1610612737|new | request=1 lock=0 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8a3f61700  7 mds.0.locker eval_gather finished gather on (dn xlock) on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dn xlock) (dversion lock) v=4129 ap=2 ino=0x10000000002 state=1610612737|new | request=1 lock=0 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.cache.den(0x10000000013 aa40b188-8269-40fd-a029-0909c6ba664a) auth_unpin by 0x55bffdc17380 on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dversion lock) v=4129 ap=1 ino=0x10000000002 state=1610612737|new | request=1 lock=0 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0] now 1
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker simple_eval (dn sync) on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dversion lock) v=4129 ap=1 ino=0x10000000002 state=1610612737|new | request=1 lock=0 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8a3f61700  7 mds.0.locker local_wrlock_finish  on (iversion lock w=1 last_client=4673) on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=3 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink xlockdone x=1) (isnap xlockdone x=1) (inest lock) (ifile excl) (iversion lock w=1 last_client=4673) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=3 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker xlock_finish on (isnap xlockdone x=1) [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=3 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink xlockdone x=1) (isnap xlockdone x=1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=2 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker eval_gather (isnap xlockdone) on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=3 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink xlockdone x=1) (isnap xlockdone) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=1 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700  7 mds.0.locker eval_gather finished gather on (isnap xlockdone) on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=3 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink xlockdone x=1) (isnap xlockdone) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=1 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.cache.ino(0x10000000002) auth_unpin by 0x55bffd452f00 on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=2 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink xlockdone x=1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=1 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00] now 2
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker simple_eval (isnap sync) on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=2 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink xlockdone x=1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=1 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker xlock_finish on (ilink xlockdone x=1) [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=2 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink xlockdone x=1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=1 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.cache.ino(0x10000000002) auth_unpin by 0x55bffd452e50 on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=1 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink excl) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00] now 1
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.locker simple_eval (ilink excl) on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=1 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink excl) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700  7 mds.0.locker simple_eval stable, syncing (ilink excl) on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=1 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink excl) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700  7 mds.0.locker simple_sync on (ilink excl) on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=1 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (ilink excl) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 12 mds.0.server including    dn [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dversion lock) v=4129 ap=1 ino=0x10000000002 state=1610612737|new | request=1 lock=0 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 20 mds.0.locker issue_client_lease no/null lease on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dversion lock) v=4129 ap=1 ino=0x10000000002 state=1610612737|new | request=1 lock=0 inodepin=1 dirty=1 authpin=1 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 12 mds.0.server including inode [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=1 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@2},l=4673 | request=1 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700  7 mds.0.locker issue_caps loner client.4673 allowed=pAsLsXsFsxcrwb, xlocker allowed=pAsLsXsFsxcrwb, others allowed=pAsLsXs on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 ap=1 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@3},l=4673 | request=1 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=1 0x55bffd452a00]
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.cache.ino(0x10000000002) auth_unpin by 0x55bffd4f6c00 on [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@3},l=4673 | request=1 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55bffd452a00] now 0
2019-09-11T03:06:23.749+0000 7fd8a3f61700 10 mds.0.cache.den(0x10000000013 aa40b188-8269-40fd-a029-0909c6ba664a) auth_unpin by 0x55bffd4f6c00 on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dversion lock) v=4129 ino=0x10000000002 state=1610612737|new | request=1 lock=0 inodepin=1 dirty=1 authpin=0 0x55bffdc172c0] now 0
2019-09-11T03:06:23.749+0000 7fd8aa76e700 12 mds.0.server including    dn [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dversion lock) v=4129 ino=0x10000000002 state=1610612737|new | request=0 lock=0 inodepin=1 dirty=1 authpin=0 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8aa76e700 20 mds.0.locker issue_client_lease no/null lease on [dentry #0x1/volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a [2,head] auth (dversion lock) v=4129 ino=0x10000000002 state=1610612737|new | request=0 lock=0 inodepin=1 dirty=1 authpin=0 0x55bffdc172c0]
2019-09-11T03:06:23.749+0000 7fd8aa76e700 12 mds.0.server including inode [inode 0x10000000002 [...2,head] /volumes/_deleting/aa40b188-8269-40fd-a029-0909c6ba664a/ auth fragtree_t(*^6) v4129 dirtyparent f(v0 m2019-09-11T03:06:17.067189+0000 1024=1024+0) n(v0 rc2019-09-11T03:06:23.740171+0000 b1073741824 1025=1024+1) (inest lock) (ifile excl) (iversion lock) caps={4673=pAsLsXsFsx/-@3},l=4673 | request=0 lock=0 dirfrag=64 caps=1 dirtyrstat=0 dirtyparent=1 dirty=1 waiter=0 authpin=0 0x55bffd452a00]

`aa40b188-8269-40fd-a029-0909c6ba664a` is the trash directory that's being removed by a purge thread.

When fixing this it seems that we need more debugability w/ purge threads. This could be as simple as logging queue of pending purges before starting a new purge or an interface to query ongoing and pending purges (even if the purge queue encounter an exception and bailout, we could still query pending purges). As simple as logging thread name/id in purge thread would help in following thread execution. Changing log level to log a bit more frequently would help too (candidate: rmtree() in src/pybing/mgr/volumes/fs/subvolume.py).

Actions

Also available in: Atom PDF