Bug #48385
nautilus: statfs: a cluster with any up but out osd will report bytes_used == stored
Status: Closed
Description
The pool df stats are supposed to present user bytes as "stored" and raw used as "bytes_used".
But if any osd is up with reweight 0.0 (i.e. out), then bytes_used == stored, missing the multiplication factor for replication or EC.
E.g. with 2 osds up but out (as shown from osd df tree, see https://termbin.com/9mx1):
100 hdd 10.91600 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up osd.100
177 hdd 10.91600 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up osd.177
Here are the stats with those osds out:
# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.46
    TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.46

POOLS:
    POOL      ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    public    68     2.9 PiB     143.56M     2.9 PiB     78.48       538 TiB
    test      71      29 MiB       6.56k      29 MiB         0       269 TiB
    foo       72     1.2 GiB         308     1.2 GiB         0       269 TiB
Then I simply stopped the two out osds, and now the stats are correct:
# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.44
    TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.44

POOLS:
    POOL      ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    public    68     2.9 PiB     143.32M     4.3 PiB     84.36       544 TiB
    test      71      29 MiB       6.59k     1.2 GiB         0       272 TiB
    foo      72      1.2 GiB         308     3.6 GiB         0       272 TiB
I fixed another affected cluster the same way -- found an up but out OSD, and stopped it.
Updated by Dan van der Ster over 3 years ago
My best guess is it's related to this in PGMap.h:
bool use_per_pool_stats() const { return osd_sum.num_osds == osd_sum.num_per_pool_osds; }
so if an osd is up but out, that will return false, which then engages "legacy mode" in osd_types.h get_user_bytes.
commit aacfa8f08cb7c916ffa821545615d4a5c2fa5b05 is relevant.
BTW, I think "up but out" is only one of several "normal" conditions in a modern Ceph cluster that can erroneously trigger this legacy mode. We have Grafana plots of bytes_used over time and have wondered why they are glitchy during backfilling; now we know: whenever PG peering causes use_per_pool_stats to temporarily return false, the stats alternate between legacy and correct mode.
Updated by Igor Fedotov over 3 years ago
- Status changed from New to In Progress
- Assignee set to Igor Fedotov
I can reproduce the issue on a vstart cluster with both the latest Nautilus and Octopus, but not on master.
Looks like the following patch fixes the issue: https://github.com/ceph/ceph/pull/36002
Gonna backport it...
Updated by Igor Fedotov over 3 years ago
- Related to Bug #46440 (mgr: don't update osd stat which is already out) added
Updated by Igor Fedotov over 3 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 38354
Updated by Neha Ojha over 3 years ago
- Subject changed from statfs: a cluster with any up but out osd will report bytes_used == stored to nautilus: statfs: a cluster with any up but out osd will report bytes_used == stored
Updated by Igor Fedotov over 3 years ago
- Status changed from Fix Under Review to Resolved