
Bug #16933

the %USED of "ceph df" is wrong

Added by Kefu Chai 12 months ago. Updated 8 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
Start date:
08/05/2016
Due date:
% Done:

0%

Source:
Support
Tags:
Backport:
hammer, jewel
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Release:
Needs Doc:
No

Description

$ ceph osd df

ID WEIGHT  REWEIGHT SIZE   USE    AVAIL %USE  VAR  PGS
 1 0.00999  1.00000 10228M  9734M  494M 95.16 1.03 128
 0 0.00999  1.00000 10228M  9734M  494M 95.16 1.03 128
 2 0.00999  1.00000 10228M  9254M  974M 90.47 0.97 128
 3 0.00999  1.00000 10228M  9254M  974M 90.47 0.97 128
              TOTAL 40915M 37976M 2939M 92.82
$ ceph df

POOLS:
    NAME      ID     USED      %USED     MAX AVAIL     OBJECTS
    pool1     1      9216M     45.05          974M           9
    pool2     2      9696M     47.39          494M          10

Per the crush dump:

pool1 uses the "default" rule, which in turn selects osd.2 and osd.3.
pool2 uses the "test_profile" rule, which in turn selects osd.0 and osd.1.

Both pools have size 2.

Per Sam:

USED reflects the nominal amount stored by the pool without accounting for overhead or replication (1000 MB of objects is 1000 MB of USED even if the actual on disk usage is much higher due to replication and overhead).

and "MAX AVAIL" reflects the free space available for storing objects on the assigned OSDs. In other words, if we do not put more data into, or remove existing data from, the other pool, we can store another 974MB of objects in pool1 in addition to the existing objects.

%USED is not close to 90, as we would expect given that the OSDs assigned to pool1 are almost full. Take pool1 for example: its %USED is calculated as

(9216*2)/(10228*4) = 0.4505

  • Since pool1.size is 2, we have two copies of the data, so the raw space used by the 9216M of objects is 2 * 9216M.
  • The raw space offered by each OSD is 10228M, so the total raw space offered by the cluster is 4 * 10228M.
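The current calculation can be reproduced with a quick sketch (a minimal illustration using the numbers from the report above; variable names are my own):

```python
# Reproduce the %USED that "ceph df" currently reports for pool1.
# All sizes in MB, taken from the "ceph osd df" / "ceph df" output above.
pool1_used = 9216      # USED column for pool1
replica_count = 2      # pool1.size
osd_size = 10228       # raw capacity of each OSD
num_osds = 4           # all OSDs in the cluster, not just pool1's

# Current formula: raw space used by the pool divided by the raw space
# of the whole cluster, even though pool1 maps only to osd.2 and osd.3.
pct_used = (pool1_used * replica_count) / (osd_size * num_osds) * 100
print(round(pct_used, 2))  # 45.05
```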

The percentage is not a good indicator of how full this pool is, but just the ratio of "the raw space used by this pool" to "the raw space of the whole cluster", even though not all OSDs in the cluster are assigned to this pool.

So indeed, we should use

 USED / (USED + AVAIL)

to calculate %USED.
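Plugging the per-pool numbers from the report above into the proposed formula gives percentages that track the fullness of each pool's own OSDs (a sketch; sizes in MB):

```python
# Proposed formula: USED / (USED + MAX AVAIL), per pool.
pools = {
    "pool1": {"used": 9216, "max_avail": 974},  # on osd.2, osd.3
    "pool2": {"used": 9696, "max_avail": 494},  # on osd.0, osd.1
}
for name, p in pools.items():
    pct = p["used"] / (p["used"] + p["max_avail"]) * 100
    print(name, round(pct, 2))
# pool1 -> 90.44, close to the 90.47 %USE of osd.2/osd.3
# pool2 -> 95.15, close to the 95.16 %USE of osd.0/osd.1
```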


Related issues

Copied to Ceph - Backport #17120: hammer: the %USED of "ceph df" is wrong Resolved
Copied to Ceph - Backport #17121: jewel: the %USED of "ceph df" is wrong Resolved

History

#1 Updated by Kefu Chai 12 months ago

  • Backport changed from jewel to hammer, jewel

#2 Updated by Kefu Chai 12 months ago

  • Subject changed from the %USED "ceph df" is wrong to the %USED of "ceph df" is wrong
  • Status changed from In Progress to Need Review

#3 Updated by Yuri Weinstein 11 months ago

  • Status changed from Need Review to Pending Backport

#4 Updated by Loic Dachary 11 months ago

#5 Updated by Loic Dachary 11 months ago

#6 Updated by Nathan Cutler 8 months ago

  • Status changed from Pending Backport to Resolved
  • Needs Doc set to No
