Support #42449
Flushing cache pool will take months
Description
We have an EC pool for our main radosgw data. In front of this pool is/was a 3x replicated cache tier pool with "writeback" cache mode.
We are running Nautilus 14.2.4 on Ubuntu 18.04. We have 9 nodes and about 260 HDD OSDs with BlueStore (no SSD db/blocks). All nodes have 40Gb networking, so the network should not be the bottleneck.
pool 35 'default.rgw.buckets.data' erasure size 12 min_size 11 crush_rule 5 object_hash rjenkins pg_num 2048 pgp_num 2048 autoscale_mode warn last_change 69376 lfor 9907/63184/69271 flags hashpspool tiers 42 read_tier 42 write_tier 42 stripe_width 40960 application rgw
pool 42 'rgw-cache' replicated size 3 min_size 2 crush_rule 0 object_hash rjenkins pg_num 64 pgp_num 64 autoscale_mode warn last_change 69376 lfor 9907/9907/9907 flags hashpspool,incomplete_clones tier_of 35 cache_mode writeback target_bytes 25288767438848 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x0 decay_rate 0 search_last_n 0 stripe_width 0 application rgw
I strongly suspect that we do not actually gain anything from this cache, so I want to get rid of it. I followed the instructions here: https://docs.ceph.com/docs/nautilus/rados/operations/cache-tiering/#removing-a-writeback-cache
I set the cache-mode to proxy, as suggested in the docs.
"rados -p {cachepool} cache-flush-evict-all" has been running for several days now, and at the current rate it will likely take months or years to complete. It prints every object that has been flushed.
"rados -p rgw-cache ls | wc -l" shows about 6.5 million objects.
The cache I/O flush rate is around 5-10 MiB/s (as displayed by "ceph status").
I don't have hard numbers, but from running the "ls" command every once in a while, I would estimate that it flushes about 10k objects per day.
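As a rough sanity check on the "months or years" estimate, the two figures above can be plugged into a short calculation. This is only a sketch: it assumes the flush rate stays constant at roughly 10k objects per day (an eyeballed number, not a measurement) and that no new objects land in the cache in the meantime. The function name days_to_flush is made up for illustration.

```python
# Rough ETA for draining the cache pool, using the estimates quoted above.
objects_in_cache = 6_500_000   # approx. count from "rados -p rgw-cache ls | wc -l"
flushed_per_day = 10_000       # eyeballed estimate, not a measured value


def days_to_flush(objects: int, per_day: int) -> float:
    """Days needed to flush `objects` at a constant rate of `per_day`."""
    return objects / per_day


days = days_to_flush(objects_in_cache, flushed_per_day)
print(f"{days:.0f} days, about {days / 365:.1f} years")
# prints "650 days, about 1.8 years"
```

So if the rate does not improve, the flush would need on the order of 650 days.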
Do I need to wait for all objects from the cache pool to be flushed back into the main data pool, or can I remove the cache tier early?
Is there a way to flush everything at once and utilize the full bandwidth / network / throughput of the cluster?
Is there a way to auto-flush old objects (e.g., after 24 hours) so I don't have to manually flush them with "cache-flush-evict-all"?