Support #12177
open
removing cached objects doesn't quickly remove tiered ec objects
Added by Alexandre Oliva almost 9 years ago.
Updated over 8 years ago.
Description
Attempting to free up space by removing large files from cephfs causes the mds to truncate and then remove the objects that were still in the non-ec tiered pool. Those in the ec pool, however, appear to remain on the osd forever, without even being truncated, even after rados stat reports that the objects no longer exist. I haven't been able to track down how the remove operation is supposed to get to the ec osd, and a debugger attached to ceph-osd waiting for ec remove ops doesn't seem to get any, so I guess something is amiss (0.94.2).
- Tracker changed from Bug to Support
That's just part of how cache tiering works: deleted objects will get deleted when the whiteout entries in the cache pool get flushed out to the EC pool. I think there's some command you can run to force the OSDs to do that, but I'm not sure what it is...
- Subject changed from removing cephfs files tiered to ec pools doesn't remove ec objects to removing cached objects doesn't quickly remove tiered ec objects
A couple of days ago I investigated the same issue. Indeed, whiteout cache entries are removed when the cache is flushed, and flushing only takes place when specific conditions are met. These depend primarily on the 'target_max_bytes' and/or 'target_max_objects' settings and on the total size and/or count of the objects in the cache. The issue is that 'target_max_objects' isn't set by default, so the cache entry count is unlimited. And removed objects are supposed to use 0 bytes, so the 'target_max_bytes' threshold isn't applied either. As a result, the cache won't flush unless there is additional data flow through it. I published a patch on the dev list that takes into account the space used by removed cache entries; this can help a bit when massive removals take place on an empty cache. But it's probably even better to add a forced cache flush when removed objects are present.
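To illustrate the condition described above, here is a deliberately simplified model in Python. It is not the OSD's actual tiering agent code: the function name, its parameters, and the 0.4 dirty ratio are assumptions made for the sketch. It only shows why zero-byte whiteouts never reach a byte-based threshold when no object-count limit is set.

```python
def flush_triggered(dirty_bytes, dirty_objects,
                    target_max_bytes=0, target_max_objects=0,
                    dirty_ratio=0.4):
    """Toy model of the flush decision (hypothetical, not Ceph source).

    A limit of 0 means "not set". The dirty_ratio default of 0.4 is an
    assumption standing in for the pool's dirty-ratio setting.
    """
    ratios = []
    if target_max_bytes > 0:
        # Whiteouts occupy 0 bytes, so they contribute nothing here.
        ratios.append(dirty_bytes / target_max_bytes)
    if target_max_objects > 0:
        # Only an object-count limit "sees" whiteout entries.
        ratios.append(dirty_objects / target_max_objects)
    # With no limit set at all, nothing can ever trigger a flush.
    return bool(ratios) and max(ratios) >= dirty_ratio


if __name__ == "__main__":
    one_gib = 2 ** 30
    # 10,000 whiteouts, 0 bytes, only a byte limit configured.
    print(flush_triggered(0, 10_000, target_max_bytes=one_gib))
    # Same cache contents, but with an object-count limit as well.
    print(flush_triggered(0, 10_000, target_max_bytes=one_gib,
                          target_max_objects=1_000))
```

In this model the first call reports no flush and the second reports a flush, which matches the behaviour described in the comment: the whiteouts only count once an object limit exists.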
As a workaround, one can also try setting the 'target_max_objects' parameter for the cache pool to some small value; this will cause the cache to flush.
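A minimal sketch of that workaround, written as a Python helper that only builds the command lines rather than running them. The pool name "cachepool", the value 1000, and the helper itself are hypothetical. The optional second command, `rados cache-flush-evict-all`, is as far as I know the manual way to flush and evict a cache pool, and may be the forcing command mentioned earlier in this thread; treat that as an assumption and check it against your release before relying on it.

```python
import shlex


def workaround_commands(cache_pool, max_objects=1000, force_flush=False):
    """Build (but do not run) the commands for the workaround.

    cache_pool and max_objects are placeholders chosen for illustration.
    """
    if max_objects <= 0:
        raise ValueError("max_objects must be positive; 0 means 'unset'")
    commands = [
        # Give the cache pool an object-count limit so whiteouts count.
        ["ceph", "osd", "pool", "set", cache_pool,
         "target_max_objects", str(max_objects)],
    ]
    if force_flush:
        # Assumed manual flush: flush and evict everything in the pool.
        commands.append(["rados", "-p", cache_pool, "cache-flush-evict-all"])
    return commands


if __name__ == "__main__":
    for argv in workaround_commands("cachepool", 1000, force_flush=True):
        print(shlex.join(argv))
```

Returning argument lists keeps the sketch safe to run anywhere; on a real cluster the lists could be passed to `subprocess.run` once the pool name and limit have been reviewed.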
- Related to Bug #13848: Removed objects may stay in cache forever added