We are seeing this bug on a newly deployed cluster, across different versions (16.2.7 to 17.2). During the test we rebuilt the cluster three times with a clean OS and wiped drives.
Basically, the ceph df output is incorrect (the per-pool USED column equals STORED instead of including replication overhead) unless we add 4-5 pools to the cluster. The bug reproduces consistently as we increase or decrease the pool count; see the reproduction sketch at the end of this report.
=== Incorrect ceph df output ===
# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
ssd    852 TiB  851 TiB  866 GiB   866 GiB       0.10
TOTAL  852 TiB  851 TiB  866 GiB   866 GiB       0.10

--- POOLS ---
POOL                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics   1    1      0 B        0      0 B      0    269 TiB
vm.store                2   32  550 GiB  140.81k  550 GiB   0.07    269 TiB
=== Correct ceph df output after adding more pools ===
# ceph df
--- RAW STORAGE ---
CLASS     SIZE    AVAIL     USED  RAW USED  %RAW USED
ssd    852 TiB  851 TiB  866 GiB   866 GiB       0.10
TOTAL  852 TiB  851 TiB  866 GiB   866 GiB       0.10

--- POOLS ---
POOL                   ID  PGS   STORED  OBJECTS     USED  %USED  MAX AVAIL
device_health_metrics   1    1      0 B        0      0 B      0    269 TiB
vm.store                2   32  550 GiB  140.81k  850 GiB   0.10    269 TiB
dbslave1               33   32      0 B        0      0 B      0    269 TiB
dbslave2               34   32      0 B        0      0 B      0    269 TiB
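=== Reproduction sketch ===
For completeness, this is roughly how we toggle between the two outputs above. It is a sketch of our steps, not an exact transcript: the pool names and PG counts match the listings, and the delete step assumes mon_allow_pool_delete is enabled on the cluster.

# ceph osd pool create dbslave1 32 32 replicated
# ceph osd pool create dbslave2 32 32 replicated
# ceph df        <- vm.store USED now shows 850 GiB (correct)
# ceph osd pool delete dbslave1 dbslave1 --yes-i-really-really-mean-it
# ceph osd pool delete dbslave2 dbslave2 --yes-i-really-really-mean-it
# ceph df        <- vm.store USED drops back to 550 GiB (= STORED, incorrect)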