Bug #48385

closed

nautilus: statfs: a cluster with any up but out osd will report bytes_used == stored

Added by Dan van der Ster over 3 years ago. Updated almost 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The pool df stats are supposed to present user bytes as "stored" and raw used bytes as "bytes_used".

But if any osd is up with reweight 0.0 (i.e. out), then bytes_used == stored: the factor for replication or EC is missing.
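A minimal sketch of the relationship that should hold, assuming the usual raw-space multipliers (pool size for replicated pools, (k+m)/k for EC pools). The function names and the 5% tolerance are hypothetical, chosen for illustration only; this is not Ceph code.

```python
def raw_multiplier_replicated(size):
    """Raw-space multiplier for a replicated pool with `size` copies."""
    return float(size)


def raw_multiplier_ec(k, m):
    """Raw-space multiplier for an EC pool with k data and m coding chunks."""
    return (k + m) / k


def looks_affected(stored, bytes_used, multiplier, tolerance=0.05):
    """Return True when bytes_used is about equal to stored even though
    the pool's multiplier says it should be larger (the symptom reported here).

    `tolerance` is a hypothetical relative margin, not a Ceph setting.
    """
    if stored <= 0 or multiplier <= 1:
        return False
    return abs(bytes_used - stored) <= tolerance * stored


# A 3x replicated pool reporting bytes_used == stored shows the symptom.
print(looks_affected(stored=1000, bytes_used=1000, multiplier=raw_multiplier_replicated(3)))
# The same pool reporting 3x stored does not.
print(looks_affected(stored=1000, bytes_used=3000, multiplier=raw_multiplier_replicated(3)))
```

Real pools also carry allocation overhead, so bytes_used can exceed stored times the multiplier; the check only flags the case where the multiplier is absent altogether.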

E.g. with 2 osds up but out (as shown from osd df tree, see https://termbin.com/9mx1):

 100   hdd   10.91600        0     0 B     0 B     0 B     0 B     0 B     0 B     0    0   0     up                     osd.100          
 177   hdd   10.91600        0     0 B     0 B     0 B     0 B     0 B     0 B     0    0   0     up                     osd.177          

Here are the stats with those osds out:

# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED 
    hdd       5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.46 
    TOTAL     5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.46 

POOLS:
    POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL 
    public     68     2.9 PiB     143.56M     2.9 PiB     78.48       538 TiB 
    test       71      29 MiB       6.56k      29 MiB         0       269 TiB 
    foo        72     1.2 GiB         308     1.2 GiB         0       269 TiB 

Then I simply stopped the two out osds, and now the stats are correct:

# ceph df
RAW STORAGE:
    CLASS     SIZE        AVAIL       USED        RAW USED     %RAW USED 
    hdd       5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.44 
    TOTAL     5.5 PiB     1.2 PiB     4.3 PiB      4.3 PiB         78.44 

POOLS:
    POOL       ID     STORED      OBJECTS     USED        %USED     MAX AVAIL 
    public     68     2.9 PiB     143.32M     4.3 PiB     84.36       544 TiB 
    test       71      29 MiB       6.59k     1.2 GiB         0       272 TiB 
    foo        72     1.2 GiB         308     3.6 GiB         0       272 TiB 
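To make the difference between the two outputs explicit, here is a small sketch that computes USED / STORED per pool from the values printed above. The helper names are hypothetical, and the inputs are the rounded figures shown by ceph df, so the ratios are only approximate.

```python
# Binary unit suffixes as printed by `ceph df`.
UNITS = {"B": 1, "KiB": 2**10, "MiB": 2**20, "GiB": 2**30, "TiB": 2**40, "PiB": 2**50}


def to_bytes(text):
    """Convert a string such as '2.9 PiB' to a byte count (float)."""
    value, unit = text.split()
    return float(value) * UNITS[unit]


def used_ratio(stored, used):
    """Ratio of USED to STORED for one pool."""
    return to_bytes(used) / to_bytes(stored)


# (STORED, USED) per pool, copied from the two outputs above.
with_out_osds_up = {
    "public": ("2.9 PiB", "2.9 PiB"),
    "test": ("29 MiB", "29 MiB"),
    "foo": ("1.2 GiB", "1.2 GiB"),
}
after_stopping = {
    "public": ("2.9 PiB", "4.3 PiB"),
    "test": ("29 MiB", "1.2 GiB"),
    "foo": ("1.2 GiB", "3.6 GiB"),
}

for pool in with_out_osds_up:
    before = used_ratio(*with_out_osds_up[pool])
    after = used_ratio(*after_stopping[pool])
    print(f"{pool}: ratio {before:.2f} with out osds up, {after:.2f} after stopping them")
```

With the out osds up, every ratio is 1. After stopping them, foo comes out at about 3, which would be consistent with 3x replication, though the report does not state the pool configurations.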

I fixed another affected cluster the same way -- found an up but out OSD, and stopped it.
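For finding such OSDs, a rough sketch follows. It assumes the JSON from `ceph osd dump --format json` contains an "osds" list whose entries carry "osd", "up" and "in" fields with 0/1 values; check this against your release. The sample data is made up, apart from the ids 100 and 177 taken from this report.

```python
import json


def up_but_out(osd_dump):
    """Return ids of OSDs that are up (up == 1) but out (in == 0).

    `osd_dump` is the parsed JSON of an OSD map dump; the field names
    are assumptions about that format, as noted in the text above.
    """
    return [
        osd["osd"]
        for osd in osd_dump.get("osds", [])
        if osd.get("up") == 1 and osd.get("in") == 0
    ]


# Hypothetical sample: osd.99 and osd.178 are invented for illustration.
sample = json.loads("""
{"osds": [
  {"osd": 99,  "up": 1, "in": 1},
  {"osd": 100, "up": 1, "in": 0},
  {"osd": 177, "up": 1, "in": 0},
  {"osd": 178, "up": 0, "in": 0}
]}
""")

print(up_but_out(sample))
```

On a live cluster the input could be read from the command's output instead of the inline sample.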


Related issues 1 (0 open, 1 closed)

Related to mgr - Bug #46440: mgr: don't update osd stat which is already out (Resolved, Zhi Zhang)
