Bug #40203
ceph df shows incorrect usage
Description
TL;DR
"ceph df" shows pool "itaf" as 1.2TiB used, while "rbd du -p itaf" shows total used as 19TiB
What feedback do I need to provide to help with resolving this?
Longer version:
While trying to resize our placement groups, we noticed a sudden, significant drop in both pool usage and capacity.
We use LibreNMS as a graphing tool, and we first suspected that the monitoring itself was misbehaving.
However, after discussing it further on IRC, we came to the conclusion that Ceph itself reports two different views of the usage.
The "ceph df" command shows the same value that the monitoring observes.
The autoscaler now suggests 16 PGs, whereas it previously suggested 256.
As far as we can tell, we did not perform any actions on the cluster at that time.
We experience no issues with the virtualization that runs on top of Ceph.
This surfaced because our goal was to enable the autoscaler's PG suggestions, but we first needed to resize all pools.
We are now reluctant to take any significant actions (especially as the autoscaler would still throw warnings).
It is hard to guess which information is relevant, so I'll just provide the filtered output of a few commands:
# rbd du -p itaf
NAME     PROVISIONED  USED
...
<TOTAL>  29 TiB       19 TiB
#
# ceph df
...
POOL  ID  STORED   OBJECTS  USED     %USED  MAX AVAIL
itaf  1   413 GiB  4.91M    1.2 TiB  0.50   79 TiB
...
#
# ceph osd pool autoscale-status
POOL  SIZE   TARGET SIZE  RATE  RAW CAPACITY  RATIO   TARGET RATIO  BIAS  PG_NUM  NEW PG_NUM  AUTOSCALE
...
itaf  1223G               3.0   332.9T        0.0108                1.0   2048    16          warn
#
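The figures above can be cross-checked with a little arithmetic. The Python sketch below is only a sanity check, not a diagnosis. It assumes that USED in "ceph df" is roughly STORED multiplied by the replication rate (the RATE column, 3.0), and that the autoscaler's RATIO is SIZE times RATE divided by RAW CAPACITY. Both formulas are assumptions inferred from the output, and the variable and function names are made up for illustration.

```python
# Figures copied from the command output above, converted to GiB.
GIB_PER_TIB = 1024

stored_gib = 413                         # ceph df: STORED
rbd_used_gib = 19 * GIB_PER_TIB          # rbd du: <TOTAL> USED (19 TiB)
autoscale_size_gib = 1223                # autoscale-status: SIZE
rate = 3.0                               # autoscale-status: RATE
raw_capacity_gib = 332.9 * GIB_PER_TIB   # autoscale-status: RAW CAPACITY


def raw_usage(stored, replication_rate):
    """Assumed relation: raw usage = logical data * replication rate."""
    return stored * replication_rate


def capacity_ratio(size, replication_rate, raw_capacity):
    """Assumed relation: share of raw capacity consumed by the pool."""
    return size * replication_rate / raw_capacity


# Does STORED * RATE reproduce the USED column of "ceph df"?
raw_from_stored_gib = raw_usage(stored_gib, rate)

# Does SIZE * RATE / RAW CAPACITY reproduce the RATIO column?
ratio = capacity_ratio(autoscale_size_gib, rate, raw_capacity_gib)

# How far apart are "rbd du" USED and "ceph df" STORED?
discrepancy = rbd_used_gib / stored_gib

print(f"STORED * RATE          = {raw_from_stored_gib / GIB_PER_TIB:.2f} TiB")
print(f"SIZE * RATE / CAPACITY = {ratio:.4f}")
print(f"rbd du USED / STORED   = {discrepancy:.1f}x")
```

Under these assumptions, "ceph df" and the autoscaler agree with themselves: STORED times RATE comes to about 1.2 TiB, and the computed ratio rounds to the reported 0.0108. The disagreement is with "rbd du", which reports roughly 47 times more data than STORED.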
I am also attaching a graph from LibreNMS showing the exact moment the usage and capacity dropped.
Please let me know what information is relevant and I'll make sure to provide it.
Kind regards,
Momo.
Related issues
History
#1 Updated by Stephan Müller over 2 years ago
- Duplicated by Bug #41829: ceph df reports incorrect pool usage added
#2 Updated by Stephan Müller over 2 years ago
- Related to Bug #42982: Monitoring: alert for "pool full" wrong added
#3 Updated by Stephan Müller over 2 years ago
- Related to deleted (Bug #42982: Monitoring: alert for "pool full" wrong)
#4 Updated by Stephan Müller over 2 years ago
- Related to Bug #45185: mgr/dashboard: fix usage calculation to match "ceph df" way added
#5 Updated by Stephan Müller over 2 years ago
- Related to Feature #38697: mgr/dashboard: Enhance info shown in Landing Page cards 'PGs per OSD' & 'Raw Capacity' added