Backport #41000
Updated by Xiaoxi Chen over 4 years ago
https://github.com/ceph/ceph/pull/29321

When a client is notified by the MDS that a file has been deleted (by being granted the CEPH_CAP_LINK_SHARED cap for an inode with nlink = 0, see handle_cap_grant), ll_ref will be zero if the client has never touched the inode. The previous code called Client::unlink only when ll_ref > 0. That is wrong: it leaves the dn in the cache, so the client keeps its caps and the inode stays in stray until the kernel drops the dn cache. Under certain workloads (write-intensive and rotate-intensive), this issue can let strays pile up to several million and "leak" a huge amount of space.

<pre>
2019-07-24 02:09:03.527 7eff12ffd700 5 client.231690 handle_cap_grant on in 0x1000000279f mds.0 seq 3 caps now pAsLsXsFscr was pAsXsFscr (stale)
2019-07-24 02:09:03.527 7eff12ffd700 10 client.231690 update_inode_file_time 0x1000000279f.head(faked_ino=0 ref=1 ll_ref=0 cap_refs={} open={} mode=100644 size=0/0 nlink=0 btime=0.000000 mtime=2019-07-24 02:04:01.319475 ctime=2019-07-24 02:04:01.326440 caps=pAsXsFscr(0=pAsXsFscr) objectset[0x1000000279f ts 0/0 objects 0 dirty_or_tx 0] parents=0x1000000279d.head["b"] 0x7eff18002940) pAsXsFscr ctime 2019-07-24 02:09:02.074122 mtime 2019-07-24 02:04:01.319475
2019-07-24 02:09:03.527 7eff12ffd700 10 client.231690 grant, new caps are Ls
</pre>

Reproduce:

I have two clients (14.2.2, fuse) mounting the same CephFS. /mnt/xiaoxi contains 3 files: a, b and c.

Client A: ls /mnt/xiaoxi
Client B: ls /mnt/xiaoxi
Client A: rm /mnt/xiaoxi/b
Client B (right after the rm): ls /mnt/xiaoxi

After that, b stays in stray forever because client B holds pAsLsXsFscr, even though client A does release all its caps.
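
The ll_ref condition described above can be illustrated with a small toy model. This is only a sketch in Python, not the Ceph client code: the `Inode` class, the `on_link_shared_grant` function and the `require_ll_ref` flag are hypothetical names invented for illustration. It assumes that dropping the cached dentries stands in for what Client::unlink does, and it says nothing about how the actual patch is written.

```python
from dataclasses import dataclass, field


@dataclass
class Inode:
    """Hypothetical stand-in for the client-side inode state."""
    nlink: int                      # link count reported by the MDS
    ll_ref: int                     # low-level references held on the inode
    dentries: list = field(default_factory=list)  # cached dentry names


def on_link_shared_grant(inode: Inode, require_ll_ref: bool) -> bool:
    """Toy model of reacting to a grant of the shared link cap.

    require_ll_ref=True models the old condition (act only if ll_ref > 0);
    require_ll_ref=False models acting whenever nlink is 0.
    Returns True if the cached dentries were dropped.
    """
    if inode.nlink != 0:
        return False  # file still linked, nothing to drop
    if require_ll_ref and inode.ll_ref <= 0:
        return False  # old behavior: dentry stays cached, caps stay held
    inode.dentries.clear()  # assumption: models unlinking the cached dn
    return True


# Client B in the reproduce steps: the inode was listed but never referenced.
old = Inode(nlink=0, ll_ref=0, dentries=["b"])
print(on_link_shared_grant(old, require_ll_ref=True), old.dentries)

new = Inode(nlink=0, ll_ref=0, dentries=["b"])
print(on_link_shared_grant(new, require_ll_ref=False), new.dentries)
```

With the old condition the dentry for "b" survives the grant, which matches the log above where ll_ref=0 and nlink=0; without the ll_ref check the dentry is dropped.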