Bug #16933: the %USED of "ceph df" is wrong - Ceph - Ceph

Bug #16933

closed

the %USED of "ceph df" is wrong

Added by Kefu Chai over 7 years ago. Updated over 7 years ago.

Status:

Resolved

Priority:

Normal

Assignee:

Kefu Chai

Source:

Support

Backport:

hammer, jewel

Severity:

3 - minor

Description

$ ceph osd df

ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
 1 0.00999  1.00000 10228M  9734M  494M 95.16 1.03 128
 0 0.00999  1.00000 10228M  9734M  494M 95.16 1.03 128
 2 0.00999  1.00000 10228M  9254M  974M 90.47 0.97 128
 3 0.00999  1.00000 10228M  9254M  974M 90.47 0.97 128
              TOTAL 40915M 37976M 2939M 92.82
$ ceph df

POOLS:
    NAME      ID     USED      %USED     MAX AVAIL     OBJECTS
    pool1     1      9216M     45.05          974M           9
    pool2     2      9696M     47.39          494M          10

per the crush dump

pool1 uses the "default" rule, which in turn selects osd.2 and osd.3.
pool2 uses the "test_profile" rule, which in turn selects osd.0 and osd.1.

both pools' size is 2.

per sam

USED reflects the nominal amount stored by the pool without accounting for overhead or replication (1000 MB of objects is 1000 MB of USED even if the actual on disk usage is much higher due to replication and overhead).

and "MAX AVAIL" reflects the free space available for storing objects on the OSDs assigned to the pool. in other words, if we neither add data to nor remove data from the other pools, we can store another 974M of objects in pool1 on top of the existing ones.

%USED is not close to 90, as we would expect given that the OSDs assigned to pool1 are almost full. take pool1 for example: its %USED is calculated using

(9216*2)/(10228*4) = 0.4505

since pool1's size is 2, there are two copies of the data, so the raw space used by the 9216M of objects is 2 * 9216M.
each OSD offers 10228M of raw space, so the total raw space offered by the cluster is 4 * 10228M.

so this percentage is not a good indicator of how full the pool is. it is just the ratio of "the raw space used by this pool" to "the raw space of the whole cluster", even though not all OSDs in the cluster are assigned to this pool.
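
The arithmetic behind the reported figure can be checked with a small sketch. It is not taken from the Ceph source; the function name and parameters are hypothetical, and the inputs are the rounded megabyte values shown in the output above, so the result may differ from "ceph df" in the last digit.

```python
def cluster_wide_percent_used(used_mb, replicas, osd_size_mb, osd_count):
    """Hypothetical helper: raw space used by one pool as a percentage
    of the raw space of the whole cluster (the ratio described above)."""
    raw_used = used_mb * replicas          # every object is stored `replicas` times
    raw_total = osd_size_mb * osd_count    # all OSDs, whether or not the pool uses them
    return 100.0 * raw_used / raw_total


# Values copied from the "ceph osd df" and "ceph df" output in this report.
pool1 = cluster_wide_percent_used(9216, 2, 10228, 4)
pool2 = cluster_wide_percent_used(9696, 2, 10228, 4)

print(f"pool1: {pool1:.2f}")  # matches the reported 45.05
print(f"pool2: {pool2:.2f}")  # close to the reported 47.39 (inputs are rounded)
```

Both results land near 45 to 47 percent even though each pool's OSDs are over 90 percent full, because the denominator counts all four OSDs.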

so we should instead use

 USED / (USED + AVAIL)

to calculate %USED, where AVAIL is the pool's "MAX AVAIL".
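
As a rough illustration of what this formula yields for the two pools above, here is a minimal sketch. It is not the code from the fix; the function name is hypothetical, and the handling of an empty pool with no available space is an assumption added only to avoid dividing by zero.

```python
def pool_percent_used(used_mb, max_avail_mb):
    """Hypothetical helper: USED / (USED + AVAIL) as a percentage,
    where AVAIL is the pool's MAX AVAIL."""
    total = used_mb + max_avail_mb
    if total == 0:
        # Assumption: report 0 for a pool with nothing stored and no space left.
        return 0.0
    return 100.0 * used_mb / total


# USED and MAX AVAIL copied from the "ceph df" output in this report.
pool1 = pool_percent_used(9216, 974)
pool2 = pool_percent_used(9696, 494)

print(f"pool1: {pool1:.2f}")  # 90.44
print(f"pool2: {pool2:.2f}")  # 95.15
```

These values sit close to the per-OSD %USE figures from "ceph osd df" (90.47 for osd.2 and osd.3, 95.16 for osd.0 and osd.1), which is the behaviour the report expects.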

Related issues 2 (0 open — 2 closed)

Updated by Kefu Chai over 7 years ago

Backport changed from jewel to hammer, jewel

Updated by Kefu Chai over 7 years ago

Subject changed from the %USED "ceph df" is wrong to the %USED of "ceph df" is wrong
Status changed from In Progress to Fix Under Review

https://github.com/ceph/ceph/pull/10584
