Bug #48385
nautilus: statfs: a cluster with any up but out osd will report bytes_used == stored
Status: Closed
Description
The pool df stats are supposed to present user bytes as "stored" and raw used as "bytes_used".
But if any osd is up with reweight 0.0 (i.e. out), then bytes_used == stored, missing the multiplication factor for replication or EC.
E.g. with 2 osds up but out (as shown from osd df tree, see https://termbin.com/9mx1):
100 hdd 10.91600 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up osd.100
177 hdd 10.91600 0 0 B 0 B 0 B 0 B 0 B 0 B 0 0 0 up osd.177
Here are the stats with those osds out:
# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.46
    TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.46

POOLS:
    POOL      ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    public    68     2.9 PiB     143.56M     2.9 PiB     78.48       538 TiB
    test      71      29 MiB       6.56k      29 MiB         0       269 TiB
    foo       72     1.2 GiB         308     1.2 GiB         0       269 TiB
Then I simply stopped the two out osds, and now the stats are correct:
# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED
    hdd       5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.44
    TOTAL     5.5 PiB     1.2 PiB     4.3 PiB     4.3 PiB          78.44

POOLS:
    POOL      ID     STORED      OBJECTS     USED        %USED     MAX AVAIL
    public    68     2.9 PiB     143.32M     4.3 PiB     84.36       544 TiB
    test      71      29 MiB       6.59k     1.2 GiB         0       272 TiB
    foo      72      1.2 GiB         308     3.6 GiB         0       272 TiB
I fixed another affected cluster the same way -- found an up but out OSD, and stopped it.
Updated by Dan van der Ster over 3 years ago
My best guess is it's related to this in PGMap.h:
bool use_per_pool_stats() const { return osd_sum.num_osds == osd_sum.num_per_pool_osds; }
so if an osd is up but out, that will return false, which then engages "legacy mode" in osd_types.h get_user_bytes.
commit aacfa8f08cb7c916ffa821545615d4a5c2fa5b05 is relevant.
BTW, I think "up but out" is only one of several "normal" conditions in a modern Ceph cluster that can erroneously trigger this legacy mode. We have Grafana plots of bytes_used over time and have wondered why they are glitchy during backfilling; now we know: whenever PG peering causes use_per_pool_stats to temporarily return false, the stats alternate between legacy and correct mode.
Updated by Igor Fedotov over 3 years ago
- Status changed from New to In Progress
- Assignee set to Igor Fedotov
I can reproduce the issue on a vstart cluster with both the latest Nautilus and Octopus, but not on master.
Looks like the following patch fixes the issue: https://github.com/ceph/ceph/pull/36002
Gonna backport it...
Updated by Igor Fedotov over 3 years ago
- Related to Bug #46440 (mgr: don't update osd stat which is already out) added
Updated by Igor Fedotov over 3 years ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 38354
Updated by Neha Ojha over 3 years ago
- Subject changed from statfs: a cluster with any up but out osd will report bytes_used == stored to nautilus: statfs: a cluster with any up but out osd will report bytes_used == stored
Updated by Igor Fedotov over 3 years ago
- Status changed from Fix Under Review to Resolved