Bug #44947
Hung ops for evicted CephFS clients do not get cleaned up fully
Status: open
Description
Hello,
After noticing some hung CephFS operations on my client, I rebooted the client. Ceph evicted and blacklisted this client, and the hung operations progressed to the "cleaned up request" event, but they are still listed by dump_ops_in_flight and are preventing the rebooted client (which was assigned a new client ID on remounting) from accessing the same inode. New attempts to access this inode result in additional hung operations. The only way I found to clear the hung ops completely and restore access to the inode was to restart my MDS.
I would have expected Ceph to terminate all operations for a client when that client is evicted. Is this behaviour configurable? Are there additional diagnostics I can collect if this recurs?
Details of the current setup:
• ceph version 14.2.5 (ad5bd132e1492173c85fda2cc863152730b16a92) nautilus (stable)
• We're using the ceph kernel driver, kernel: 5.5.7-1.el7.elrepo.x86_64
• The client server has 38 separate directories mounted, all from the same CephFS filesystem.
• All 38 directories are mounted with the same config by three separate clients.
• Mount config (configured via fstab, shown here as reported by `mount`): 10.225.44.236,10.225.44.237,10.225.44.238:6789:/albacore/system/deploy on /opt/dcl/deploy type ceph (rw,noatime,name=albacore,secret=<hidden>,acl,wsize=32768,rsize=32768,_netdev)
Timeline:
1) 2020-03-28 21:38:58 - a CephFS op from client:366380 on inode .tmp_depl_license_status.svr01 gets stuck at "failed to wrlock, waiting" (see dump_ops_in_flight). Over the next few days, other ops for the same inode get stuck in a "dispatched" state (again, see dump_ops_in_flight). Ceph health reports multiple slow ops.
2) 2020-03-30 11:11:44.582 - the client server is rebooted (with a "reboot" command from the shell). Ceph MDS logs show us evicting client session 366380. The client no longer appears in the output of `ceph tell mds.0 client ls`.
3) 2020-03-30 11:11:44.664068 onwards - all the existing ops_in_flight for this client progress through the events "failed to wrlock, waiting", "killing request", and "cleaned up request", but the ops remain in the ops_in_flight list and still count towards Ceph's slow ops count. The client no longer records these ops under /sys/kernel/debug/ceph/*/mdsc.
4) 2020-03-30 11:18:37 - the client server comes back online, remounts the directory from CephFS, getting a new client session ID: 877605
5) 2020-03-30 11:21:06 - client:877605 tries to access the inode in question (.tmp_depl_license_status.svr01) and gets stuck in "failed to wrlock, waiting". More ops get stuck behind it in the "dispatched" state, as before. These ops appear under /sys/kernel/debug/ceph/*/mdsc.
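For what it's worth, a small script like the one below is how I would count the leftover ops for an evicted session from the `ceph tell mds.0 dump_ops_in_flight` output. It is a minimal sketch, assuming the JSON shape seen on Nautilus: a top-level "ops" array whose entries carry a "description" string such as "client_request(client.366380:42 ...)". The sample dump at the bottom is illustrative only, not a capture from my cluster.

```python
import json

def ops_for_client(dump_json: str, client_id: int) -> list:
    """Return in-flight ops whose description names the given client.

    Assumes the dump_ops_in_flight shape seen on Nautilus: a top-level
    "ops" array, each entry with a "description" string containing
    e.g. "client_request(client.366380:42 ...)".
    """
    ops = json.loads(dump_json).get("ops", [])
    needle = f"client.{client_id}:"
    return [op for op in ops if needle in op.get("description", "")]

# Illustrative dump with the assumed shape (not a real capture):
sample = json.dumps({
    "ops": [
        {"description": "client_request(client.366380:42 setattr ...)"},
        {"description": "client_request(client.877605:7 getattr ...)"},
    ],
    "num_ops": 2,
})

# Count ops still attributed to the evicted session 366380
print(len(ops_for_client(sample, 366380)))
```

In this incident, the evicted session's ops stayed in that list even after the "cleaned up request" event, which is exactly what a check like this would surface.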
Kind regards,
Dave