Actions
Bug #46905
closedclient: cluster [WRN] evicting unresponsive client smithi122:0 (34373), after 304.762 seconds
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
libcephfs
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
"2020-08-12T01:33:01.526130+0000 mds.a (mds.0) 1 : cluster [WRN] evicting unresponsive client smithi122:0 (34373), after 304.762 seconds" in cluster log
From the code in _unmount() the inode ref will be increased by
anchor.emplace_back(in); _flush(in, new C_Client_FlushComplete(this, in));
But from https://pulpito.ceph.com/pdonnell-2020-08-08_02\:21\:01-multimds-wip-pdonnell-testing-20200808.001303-distro-basic-smithi/5319314/smithi157/log/ceph-client.0.29855.log.gz, we can see that the ref only decreased once before the client was blacklisted by MDS, and during this time the related osd op reply keeps going on:
2020-08-10T15:22:28.081+0000 7f54882511c0 15 inode.get on 0x7f545c062670 0x10000000352.head now 4 2020-08-10T15:22:28.081+0000 7f54882511c0 20 client.4593 _release 0x10000000352.head(faked_ino=0 ref=4 ll_ref=0 cap_refs={4=0,1024=1,4096=0,8192=1} open={2=0,3=0} mode=100700 size=104857600/213909504 nlink=0 btime=0.000000 mtime=2020-08-10T15:16:36.423961+0000 ctime=2020-08-10T15:22:08.870067+0000 caps=pAsLsXsFsxcrwb(0=pAsLsXsFsxcrwb) objectset[0x10000000352 ts 0/0 objects 25 dirty_or_tx 1347584] 0x7f545c062670) 2020-08-10T15:22:28.081+0000 7f54882511c0 15 inode.get on 0x7f545c062670 0x10000000352.head now 5 2020-08-10T15:22:28.081+0000 7f54882511c0 10 client.4593 _flush 0x10000000352.head(faked_ino=0 ref=5 ll_ref=0 cap_refs={4=0,1024=1,4096=0,8192=1} open={2=0,3=0} mode=100700 size=104857600/213909504 nlink=0 btime=0.000000 mtime=2020-08-10T15:16:36.423961+0000 ctime=2020-08-10T15:22:08.870067+0000 caps=pAsLsXsFsxcrwb(0=pAsLsXsFsxcrwb) objectset[0x10000000352 ts 0/0 objects 25 dirty_or_tx 1347584] 0x7f545c062670) ... 096=0,8192=1} open={2=0,3=0} mode=100700 size=104857600/213909504 nlink=0 btime=0.000000 mtime=2020-08-10T15:16:36.423961+0000 ctime=2020-08-10T15:22:08.870067+0000 caps=pAsLsXsFsxcrwb(0=pAsLsXsFsxcrwb) objectset[0x10000000352 ts 0/0 objects 25 dirty_or_tx 1347584] 0x7f545c062670) 2020-08-10T15:22:28.087+0000 7f54882511c0 15 inode.put on 0x7f545c062670 0x10000000352.head now 4 ... 2020-08-10T15:26:08.112+0000 7f54882511c0 1 client.4593 dump_cache 2020-08-10T15:26:08.112+0000 7f54882511c0 1 client.4593 dump_inode: DISCONNECTED inode 0x10000000352 #0x10000000352 ref 40x10000000352.head(faked_ino=0 ref=4 ll_ref=0 cap_refs={4=0,1024=1,4096=0,8192=1} open={2=0,3=0} mode=100700 size=104857600/213909504 nlink=0 btime=0.000000 mtime=2020-08-10T15:16:36.423961+0000 ctime=2020-08-10T15:22:08.870067+0000 caps=-(0=pAsLsXsFsxcrwb) objectset[0x10000000352 ts 0/0 objects 25 dirty_or_tx 24576] 0x7f545c062670)
Updated by Xiubo Li over 3 years ago
- Status changed from In Progress to Fix Under Review
Let's just defer cancelling the event timer.
Updated by Patrick Donnelly over 3 years ago
- Target version set to v16.0.0
- Source set to Q/A
Updated by Patrick Donnelly over 3 years ago
- Status changed from Fix Under Review to Resolved
- Pull request ID changed from 36574 to 36580
Actions