monitor DB grows without bound during rebalance
We have a very large cluster of about 680 OSDs across 18 storage servers. The largest and most active pool is our RGW data pool, a 12+4 EC pool with a host failure domain.
A few months ago, after upgrading from Mimic to Pacific, we got into a badly unbalanced situation and have never recovered. Eventually a large number of our OSDs were well over 95% full, yet they continued to grow even though no new data was being written to the cluster. Additionally, even after setting norebalance and nobackfill and disabling the PG autoscaler, the monitor DBs continue to grow until they consume all of the on-disk space (800GB); they then have to be manually shut down, compacted offline (on a separate host with enough space), and brought back online.
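For reference, the offline compaction step is roughly the following (the mon ID, scratch hostname, and paths are placeholders; this assumes a RocksDB mon store):

```shell
# Stop the monitor before touching its store (unit name is illustrative)
systemctl stop ceph-mon@mon01

# Copy the store to a host/disk with enough free space to compact
rsync -a /var/lib/ceph/mon/ceph-mon01/store.db/ bighost:/scratch/store.db/

# On the scratch host: compact the RocksDB store offline
ceph-kvstore-tool rocksdb /scratch/store.db compact

# Copy the compacted store back and restart the monitor
rsync -a --delete bighost:/scratch/store.db/ /var/lib/ceph/mon/ceph-mon01/store.db/
systemctl start ceph-mon@mon01
```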
We eventually shut down all Ceph services on all nodes and tried to manually move PGs with ceph-objectstore-tool from full OSDs to less-full OSDs on the same host (to avoid crossing CRUSH boundaries). After bringing the cluster up with all of the "no" flags set, and even after setting "osd pause", the monitors again grew without bound and we were unable to get them into quorum, and the OSDs appeared to once again be growing and eating up any leftover space on the device.

Several issues here appear to be bugs, or at least need some clarification:
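The manual PG moves were along these lines (the OSD IDs, pgid, and file paths are illustrative; both OSDs are stopped while the tool runs):

```shell
# Export the PG from the overfull OSD to scratch space
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-57 \
    --op export --pgid 11.3a --file /mnt/scratch/11.3a.export

# Remove the PG from the source OSD once the export has succeeded
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-57 \
    --op remove --pgid 11.3a --force

# Import it into a less-full OSD on the same host
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-102 \
    --op import --file /mnt/scratch/11.3a.export
```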
- why do OSDs continue to consume more and more space when no data is being written and rebalance and backfill are disabled?
- why do the monitor DBs continue to grow in size when all rebalance, backfill, and client I/O have been stopped?
Is there any way to quiesce a system in this state long enough to slowly bring it back online and get it back into balance?
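For context, the flags we have been using to try to quiet the cluster are roughly these (pool name is illustrative; syntax assumes Pacific):

```shell
# Stop all data movement cluster-wide
ceph osd set norebalance
ceph osd set nobackfill
ceph osd set norecover
ceph osd set noout

# Pause all client reads and writes
ceph osd pause

# Disable the PG autoscaler on the big EC pool (pool name is a placeholder)
ceph osd pool set rgw.data pg_autoscale_mode off

# Have the monitors compact their stores on startup
ceph config set mon mon_compact_on_start true
```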