Bug #16933

the %USED of "ceph df" is wrong

Added by Kefu Chai over 7 years ago. Updated over 7 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Support
Tags:
Backport:
hammer, jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

$ ceph osd df

ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
 1 0.00999  1.00000 10228M  9734M  494M 95.16 1.03 128
 0 0.00999  1.00000 10228M  9734M  494M 95.16 1.03 128
 2 0.00999  1.00000 10228M  9254M  974M 90.47 0.97 128
 3 0.00999  1.00000 10228M  9254M  974M 90.47 0.97 128
              TOTAL 40915M 37976M 2939M 92.82
$ ceph df

POOLS:
    NAME      ID     USED      %USED     MAX AVAIL     OBJECTS
    pool1     1      9216M     45.05          974M           9
    pool2     2      9696M     47.39          494M          10

Per the crush dump:

pool1 uses the "default" rule, which in turn selects osd.2 and osd.3.
pool2 uses the "test_profile" rule, which in turn selects osd.0 and osd.1.

Both pools have size 2 (two replicas).

Per Sam:

USED reflects the nominal amount stored by the pool without accounting for overhead or replication (1000 MB of objects is 1000 MB of USED even if the actual on disk usage is much higher due to replication and overhead).

and "MAX AVAIL" reflects the free space available for storing objects in the assigned OSDs. in other words, if we do not put more data or remove existing data in other pools, we can store 974MB objects into pool1 in addition to existing objects.

However, %USED is nowhere near 90, as we would expect given that the OSDs assigned to pool1 are almost full. Take pool1 for example; its %USED is calculated as

(9216*2)/(10228*4) = 0.4505

  • since pool1's size is 2, we keep two copies of the data, so the raw space consumed by the 9216M of objects is 2 * 9216M.
  • each OSD offers 10228M of raw space, so the total raw space offered by the cluster is 4 * 10228M (see the sketch below).
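
In code, the old calculation amounts to this minimal sketch (the function name is ours, not Ceph's, and the inputs are the MB figures from the output above):

    # Old %USED formula: raw space used by the pool divided by the raw
    # space of the *whole* cluster, including OSDs the pool never touches.
    def old_percent_used(pool_used_mb, pool_size, osd_size_mb, num_osds):
        raw_used = pool_used_mb * pool_size   # replicas multiply raw usage
        raw_total = osd_size_mb * num_osds
        return 100.0 * raw_used / raw_total

    print(old_percent_used(9216, 2, 10228, 4))  # pool1 -> 45.05
    print(old_percent_used(9696, 2, 10228, 4))  # pool2 -> ~47.4 (shown as 47.39)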

So the percentage is not a good indicator of how full this pool is; it is merely the ratio of "the raw space used by this pool" to "the raw space of the whole cluster", even though not all OSDs in the cluster are assigned to this pool.

So indeed, we should use

 USED / (USED + AVAIL)

to calculate %USED.
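
With the per-pool numbers above, the proposed formula yields values that line up with the %USE of each pool's assigned OSDs. A minimal sketch (again with a function name of our own choosing):

    # Proposed %USED formula: nominal USED over the space the pool can
    # actually see (USED + MAX AVAIL), all values in MB.
    def new_percent_used(used_mb, max_avail_mb):
        return 100.0 * used_mb / (used_mb + max_avail_mb)

    print(new_percent_used(9216, 974))  # pool1 -> ~90.4, close to osd.2/osd.3 at 90.47
    print(new_percent_used(9696, 494))  # pool2 -> ~95.2, close to osd.0/osd.1 at 95.16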


Related issues 2 (0 open, 2 closed)

Copied to Ceph - Backport #17120: hammer: the %USED of "ceph df" is wrong (Resolved, Nathan Cutler)
Copied to Ceph - Backport #17121: jewel: the %USED of "ceph df" is wrong (Resolved, Loïc Dachary)
#1

Updated by Kefu Chai over 7 years ago

  • Backport changed from jewel to hammer, jewel
#2

Updated by Kefu Chai over 7 years ago

  • Subject changed from the %USED "ceph df" is wrong to the %USED of "ceph df" is wrong
  • Status changed from In Progress to Fix Under Review
#3

Updated by Yuri Weinstein over 7 years ago

  • Status changed from Fix Under Review to Pending Backport
#4

Updated by Loïc Dachary over 7 years ago

#5

Updated by Loïc Dachary over 7 years ago

#6

Updated by Nathan Cutler over 7 years ago

  • Status changed from Pending Backport to Resolved