Backport #41000

Updated by Xiaoxi Chen over 4 years ago

https://github.com/ceph/ceph/pull/29321
 When a client is notified by the MDS that a file has been deleted (by being 
 granted the CEPH_CAP_LINK_SHARED cap for an inode with nlink = 0, see handle_cap_grant), 
 ll_ref will be zero if the client has not touched the inode before. 

 The previous code only calls Client::unlink when ll_ref > 0. This is wrong: 
 the dn is left in the cache, the client keeps its caps, and the inode stays 
 in stray until the dn cache is dropped by the kernel. 
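
 The condition described above can be illustrated with a small toy model. This is only a sketch under stated assumptions, not Ceph's actual client code: ToyInode, toy_unlink_all, on_deleted_old, on_deleted_fixed and pins_stray are hypothetical names, and "dropping the dentries releases the caps" is a deliberate simplification of the real cap handling.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Toy stand-in for the client-side inode state; NOT Ceph's real Inode type.
struct ToyInode {
    int nlink = 1;                      // link count as reported by the MDS
    int ll_ref = 0;                     // low-level refs taken by the kernel/FUSE layer
    bool has_caps = true;               // whether this client still holds caps
    std::vector<std::string> dentries;  // cached names pointing at this inode
};

// Assumption: once no cached dentry references the inode, its caps are released.
inline void toy_unlink_all(ToyInode& in) {
    in.dentries.clear();
    in.has_caps = false;
}

// Behaviour the ticket attributes to the previous code:
// the dentry is unlinked only when ll_ref > 0.
inline void on_deleted_old(ToyInode& in) {
    assert(in.nlink == 0);  // only meaningful for a deleted inode
    if (in.ll_ref > 0) {
        toy_unlink_all(in);
    }
}

// Behaviour the ticket argues for: unlink regardless of ll_ref.
inline void on_deleted_fixed(ToyInode& in) {
    assert(in.nlink == 0);
    toy_unlink_all(in);
}

// True while this client would keep the deleted inode pinned in stray.
inline bool pins_stray(const ToyInode& in) {
    return in.nlink == 0 && in.has_caps;
}
```

 In this model, an inode with nlink = 0 and ll_ref = 0 keeps its dentry and caps after on_deleted_old, so pins_stray stays true, while on_deleted_fixed clears both.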

 Under certain workloads (write-intensive and rotate-intensive), this issue can cause 
 strays to pile up to several million, resulting in a huge space "leak". 


 <pre> 
 2019-07-24 02:09:03.527 7eff12ffd700    5 client.231690 handle_cap_grant on in 0x1000000279f mds.0 seq 3 caps now pAsLsXsFscr was pAsXsFscr (stale) 

 2019-07-24 02:09:03.527 7eff12ffd700 10 client.231690 update_inode_file_time 0x1000000279f.head(faked_ino=0 ref=1 ll_ref=0 cap_refs={} open={} mode=100644 size=0/0 nlink=0 btime=0.000000 mtime=2019-07-24 02:04:01.319475 ctime=2019-07-24 02:04:01.326440 caps=pAsXsFscr(0=pAsXsFscr) objectset[0x1000000279f ts 0/0 objects 0 dirty_or_tx 0] parents=0x1000000279d.head["b"] 0x7eff18002940) pAsXsFscr ctime 2019-07-24 02:09:02.074122 mtime 2019-07-24 02:04:01.319475 

 2019-07-24 02:09:03.527 7eff12ffd700 10 client.231690     grant, new caps are Ls 

 </pre> 

 Steps to reproduce: 


 I have two clients (14.2.2, FUSE) mounting the same CephFS file system. 
 /mnt/xiaoxi contains three files: a, b and c. 

 Client A: 
       ls /mnt/xiaoxi 
 Client B: 
       ls /mnt/xiaoxi 


 Client A: 
       rm /mnt/xiaoxi/b 
 Client B (right after the rm): 
       ls /mnt/xiaoxi 


 After that, b stays in stray forever because client B still holds pAsLsXsFscr, even though client A has released all its caps.
