Bug #40960
client: failed to drop dn and release caps causing mds stray stacking.
Description
When the client is notified by the MDS that a file has been deleted (by being
granted the CEPH_CAP_LINK_SHARED cap for an inode with nlink = 0, see handle_cap_grant),
ll_ref will be zero if the client has not touched the inode before.
The previous code only calls Client::unlink when ll_ref > 0, which is wrong:
it leaves the dn in the cache, so the client keeps the caps and the inode stays
in stray until the dn cache is dropped by the kernel.
Under certain workloads (write-intensive and rotate-intensive), this issue can cause
strays to stack up to several million and "leak" a huge amount of space.
2019-07-24 02:09:03.527 7eff12ffd700 5 client.231690 handle_cap_grant on in 0x1000000279f mds.0 seq 3 caps now pAsLsXsFscr was pAsXsFscr (stale)
2019-07-24 02:09:03.527 7eff12ffd700 10 client.231690 update_inode_file_time 0x1000000279f.head(faked_ino=0 ref=1 ll_ref=0 cap_refs={} open={} mode=100644 size=0/0 nlink=0 btime=0.000000 mtime=2019-07-24 02:04:01.319475 ctime=2019-07-24 02:04:01.326440 caps=pAsXsFscr(0=pAsXsFscr) objectset[0x1000000279f ts 0/0 objects 0 dirty_or_tx 0] parents=0x1000000279d.head["b"] 0x7eff18002940) pAsXsFscr ctime 2019-07-24 02:09:02.074122 mtime 2019-07-24 02:04:01.319475
2019-07-24 02:09:03.527 7eff12ffd700 10 client.231690 grant, new caps are Ls
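The ll_ref condition described above can be illustrated with a toy model. This is only a sketch under stated assumptions, not the Ceph client code and not the patch itself: ToyInode, toy_unlink, handle_deleted_before_fix, handle_deleted_after_fix, pins_stray and stray_pinned_after are hypothetical names, and the real Client::unlink and handle_cap_grant do considerably more work.

```cpp
#include <cassert>

// Toy stand-in for a client-side inode. NOT the real Ceph Inode class;
// only the fields needed to show the ll_ref condition are modelled.
struct ToyInode {
    int nlink = 1;          // link count as last reported by the MDS
    int ll_ref = 0;         // low-level references; 0 if never touched
    bool dn_cached = true;  // dentry still sits in the client cache
    bool caps_held = true;  // client still holds caps on the inode
};

// Hypothetical stand-in for Client::unlink: drop the dentry, which in
// this model also lets the caps go.
void toy_unlink(ToyInode& in) {
    in.dn_cached = false;
    in.caps_held = false;
}

// Behaviour as described in the report: unlink only when ll_ref > 0.
void handle_deleted_before_fix(ToyInode& in) {
    if (in.nlink == 0 && in.ll_ref > 0) {
        toy_unlink(in);
    }
}

// Assumed shape of the corrected behaviour: unlink regardless of ll_ref.
void handle_deleted_after_fix(ToyInode& in) {
    if (in.nlink == 0) {
        toy_unlink(in);
    }
}

// In this model the MDS cannot purge the stray while caps are held.
bool pins_stray(const ToyInode& in) {
    return in.nlink == 0 && in.caps_held;
}

// Replays "MDS reports nlink = 0" for an inode with the given ll_ref and
// returns whether the stray is still pinned afterwards.
bool stray_pinned_after(void (*handler)(ToyInode&), int ll_ref) {
    ToyInode in;
    in.ll_ref = ll_ref;
    in.nlink = 0;  // deletion notification arrives
    handler(in);
    return pins_stray(in);
}

// Sanity check of the model itself.
void self_check() {
    assert(stray_pinned_after(handle_deleted_before_fix, 0));
    assert(!stray_pinned_after(handle_deleted_after_fix, 0));
}
```

In this model, an inode with ll_ref = 0 (as in the log above) stays pinned with the old condition and is released with the new one. An inode with ll_ref > 0 is released in both cases.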
Reproduce:
I have two clients (14.2.2, fuse) mounting the same CephFS;
/mnt/xiaoxi contains 3 files: a, b and c.
Client A:
ls /mnt/xiaoxi
Client B:
ls /mnt/xiaoxi
Client A:
rm /mnt/xiaoxi/b
Client B (right after the rm):
ls /mnt/xiaoxi
After that, b stays in stray forever because client B holds pAsLsXsFscr, even though client A does release all its caps.
Related issues
History
#1 Updated by Xiaoxi Chen over 4 years ago
- File Screen Shot 2019-07-25 at 10.30.22 PM.png added
#2 Updated by Patrick Donnelly over 4 years ago
- Status changed from New to Fix Under Review
- Target version set to v15.0.0
- Start date deleted (07/25/2019)
- Backport changed from nautilus,mimic to nautilus,mimic,luminous
- Pull request ID set to 29321
#3 Updated by Patrick Donnelly over 4 years ago
- Subject changed from client failed to drop dn and release caps causing mds stary stacking. to client: failed to drop dn and release caps causing mds stary stacking.
- Status changed from Fix Under Review to Pending Backport
- Component(FS) deleted (ceph-fuse)
#4 Updated by Xiaoxi Chen over 4 years ago
- Copied to Backport #41000: luminous: client: failed to drop dn and release caps causing mds stary stacking. added
#5 Updated by Xiaoxi Chen over 4 years ago
- Copied to Backport #41001: mimic: client: failed to drop dn and release caps causing mds stary stacking. added
#6 Updated by Xiaoxi Chen over 4 years ago
- Copied to Backport #41002: nautilus:client: failed to drop dn and release caps causing mds stary stacking. added
#7 Updated by Xiaoxi Chen over 4 years ago
Some more background on this issue is at
https://tracker.ceph.com/issues/38679#note-9
#8 Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".