Bug #54044
closed
intermittent hangs waiting for caps
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
I've seen a problem in my own testing that only crops up intermittently. Occasionally, a process gets stuck waiting for caps in order to do an operation (a read, in this stack):
[root@centos8 ceph]# cat /proc/3685/stack
[<0>] wait_woken+0x2c/0x60
[<0>] ceph_get_caps+0x3f2/0x610 [ceph]
[<0>] ceph_read_iter+0x140/0xbe0 [ceph]
[<0>] new_sync_read+0x10f/0x150
[<0>] vfs_read+0x91/0x140
[<0>] ksys_pread64+0x61/0xa0
[<0>] do_syscall_64+0x5b/0x1a0
[<0>] entry_SYSCALL_64_after_hwframe+0x65/0xca
When I look at the caps list, I can see that the task is waiting for Fr caps, but it appears to already have them! Inode 0x10000035a22 shows pAsxLsXsxFsxcrwb as both issued and implemented, which includes Fr:
[root@centos8 31eed9b8-785f-11ec-bfe7-52540031ba78.client144465]# cat caps
total     20
avail     15
used      5
reserved  0
min       1024

ino              mds  issued            implemented
--------------------------------------------------
0x10000035a1e      0  pAsLsXs           pAsLsXs
0x10000035a1c      0  pAsLsXsFs         pAsLsXsFs
0x100000357aa      0  pAsLsXsFs         pAsLsXsFs
0x10000035a21      0  pAsLsXs           pAsLsXs
0x10000035a22      2  pAsxLsXsxFsxcrwb  pAsxLsXsxFsxcrwb

Waiters:
--------
tgid         ino                need             want
-----------------------------------------------------
3685         0x10000035a22      Fr               Fc
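To double-check the claim that the waiter already holds what it needs, here is a minimal Python sketch that compares the printed cap strings. It assumes the usual notation, where a leading "p" is the pin cap and each uppercase group letter (A, L, X, F) is followed by its lowercase bits. The helper names parse_caps and has_caps are hypothetical and exist only for this illustration. It inspects only the text in the debugfs output, not the additional state the kernel consults in ceph_get_caps.

```python
def parse_caps(cap_string):
    """Split a cap string such as 'pAsxLsXsxFsxcrwb' into (group, bit) pairs."""
    caps = set()
    group = None
    for ch in cap_string:
        if ch.isupper():
            # An uppercase letter starts a new group (A, L, X or F).
            group = ch
        elif group is None:
            # A lowercase letter before any group is the pin cap ('p').
            caps.add(("pin", ch))
        else:
            caps.add((group, ch))
    return caps


def has_caps(issued, need):
    """Return True if every bit in 'need' is present in 'issued'."""
    return parse_caps(need) <= parse_caps(issued)


issued = "pAsxLsXsxFsxcrwb"        # inode 0x10000035a22 in the listing above
print(has_caps(issued, "Fr"))      # the waiter's 'need' column
print(has_caps(issued, "Fc"))      # the waiter's 'want' column
print(has_caps("pAsLsXsFs", "Fr")) # a different inode, for contrast
```

For the values in this report, both the need (Fr) and want (Fc) masks are covered by the issued mask on 0x10000035a22, while an inode holding only pAsLsXsFs would not satisfy Fr.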
This is either a race in how we're handling this wait, or we're missing some wakeups.