Bug #20870 (closed): OSD compression: incorrect display of the used disk space

Added by François Blondel over 6 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Assignee: Igor Fedotov
Target version: -
% Done: 0%
Source:
Tags:
Backport: mimic, luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID: 19454
Crash signature (v1):
Crash signature (v2):

Description

Hi,
I tested BlueStore OSD compression with:

/etc/ceph/ceph.conf

bluestore_compression_mode = aggressive
bluestore_compression_algorithm = lz4

and

ceph osd pool set rbd compression_algorithm snappy
ceph osd pool set rbd compression_mode aggressive
ceph osd pool set rbd  compression_required_ratio 0.2
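
(For reference, the effective per-pool settings can be read back with the matching "get" keys; a quick check, assuming the pool is named rbd as above:)

ceph osd pool get rbd compression_mode
ceph osd pool get rbd compression_algorithm
ceph osd pool get rbd compression_required_ratio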

All pools use a replication factor of 3.
I got the following output from ceph df:

GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    46560G     44726G        1833G          3.94
POOLS:
    NAME                    ID     USED       %USED     MAX AVAIL     OBJECTS
    rbd                     0        675G      4.65        13859G      173627
    .rgw.root               1        1077         0        13859G           4
    default.rgw.control     2           0         0        13859G           8
    default.rgw.meta        3           0         0        13859G           0
    default.rgw.log         4           0         0        13859G         287
    cephfs_metadata         5      47592k         0        13859G          34
    cephfs_data             6        289M         0        13859G          80

The cluster was healthy (no PG partially synced).
675G * 3 should give a RAW USED size of 2025G, not 1833G.

Same here (another example):

GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    46560G     44058G        2501G          5.37
POOLS:
    NAME                    ID     USED       %USED     MAX AVAIL     OBJECTS
    rbd                     0        675G      4.75        13525G      173627
    .rgw.root               1        1077         0        13525G           4
    default.rgw.control     2           0         0        13525G           8
    default.rgw.meta        3           0         0        13525G           0
    default.rgw.log         4           0         0        13525G         287
    cephfs_metadata         5      44661k         0        13525G          79
    cephfs_data             6        329G      2.38        13525G       87369

Same here: (675G + 329G) * 3 should give 3012G of RAW USED space, not 2501G.
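
The same arithmetic as a shell one-liner (figures taken from the output above):

echo $(( (675 + 329) * 3 ))    # 3012 (G) expected at 3x replication, vs. the 2501G RAW USED reported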

I played with that "feature" to calculate the compression win I got with snappy, but I guess the output of ceph df should reflect the real space used.
Or maybe a new column should be added with the real/compressed space usage?
(And maybe a second one with the compression win percentage :D )

Many thanks!


Related issues (2): 0 open, 2 closed

Copied to bluestore - Backport #37564: mimic: OSD compression: incorrect display of the used disk space (Rejected)
Copied to bluestore - Backport #37565: luminous: OSD compression: incorrect display of the used disk space (Rejected)
#1

Updated by Sage Weil over 6 years ago

The per-pool USED is the logical user data written (before compression). The global RAW USED is the actual on-disk space consumed (after compression). That's why they differ.

There is some other metadata that tracks some of the compression stats, but it isn't currently exposed here (and I'm not sure it's available in per-pool form). We can open a feature ticket to expose it, but AFAICS the output you show is correct (no bug)...?
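
One place some of those stats are visible today is the per-OSD BlueStore perf counters (per OSD, not per pool); a minimal sketch, assuming the admin socket of osd.0 is reachable on the local node and Luminous-era counter names:

ceph daemon osd.0 perf dump | grep bluestore_compressed
# bluestore_compressed           -> compressed data stored on disk
# bluestore_compressed_allocated -> disk space allocated for compressed blobs
# bluestore_compressed_original  -> original (uncompressed) size of that data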

#2

Updated by François Blondel over 6 years ago

Exactly!
These ceph df outputs are correct (so it's not a bug), but they could be more precise.

I could imagine something like:

GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    46560G     44058G        2501G          5.37
POOLS:
    NAME                    ID     USED       %USED   RAW USED  %RAW USED    MAX AVAIL     OBJECTS
    rbd                     0        675G      4.75         xx         xx       13525G      173627
    .rgw.root               1        1077         0          0          0       13525G           4
    default.rgw.control     2           0         0          0          0       13525G           8
    default.rgw.meta        3           0         0          0          0       13525G           0
    default.rgw.log         4           0         0          0          0       13525G         287
    cephfs_metadata         5      44661k         0          0          0       13525G          79
    cephfs_data             6        329G      2.38         xx         xx       13525G       87369

Should I create a new ticket for that feature request, or should I transform that one into a Feature Request?

Many thanks!

#3

Updated by François Blondel over 6 years ago

Hi,
I just discovered "ceph df detail"


GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED     OBJECTS 
    46560G     43993G        2566G          5.51        255k 
POOLS:
    NAME                    ID     QUOTA OBJECTS     QUOTA BYTES     USED       %USED     MAX AVAIL     OBJECTS     DIRTY     READ      WRITE     RAW USED 
    rbd                     0      N/A               N/A               675G      4.77        13486G      173627      169k     9001k      692k        2025G 
    .rgw.root               1      N/A               N/A               1077         0        13486G           4         4      1170         4         3231 
    default.rgw.control     2      N/A               N/A                  0         0        13486G           8         8         0         0            0 
    default.rgw.meta        3      N/A               N/A                  0         0        13486G           0         0         0         0            0 
    default.rgw.log         4      N/A               N/A                  0         0        13486G         287       287     8488k     5658k            0 
    cephfs_metadata         5      N/A               N/A             44696k         0        13486G          79        79       266      397k         130M 
    cephfs_data             6      N/A               N/A               329G      2.39        13486G       87369     87369      108k      545k         988G 
    default.rgw.reshard     7      N/A               N/A                  0         0        13486G          16        16      260k      173k            0 

The sum of all the RAW USED values does not match the GLOBAL RAW USED value, so I guess we still have a bug there :(
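
A back-of-the-envelope sum of the dominant per-pool RAW USED values from that output (small pools and rounding ignored):

echo $((2025 + 988))    # 3013 (G) summed per-pool RAW USED, vs. the 2566G GLOBAL RAW USED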

#4

Updated by Greg Farnum over 6 years ago

  • Project changed from RADOS to bluestore

#5

Updated by Sage Weil about 6 years ago

  • Status changed from New to In Progress
  • Assignee set to Igor Fedotov

The problem is that currently the per-pool RAW USED stat is just USED * (replication or EC factor).

Igor is working on a set of changes that adds accounting for the actual disk space consumed on a per-pool basis: https://github.com/ceph/ceph/pull/19454
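
An illustration of that derivation (example size and EC-profile values, not the actual Ceph code path):

echo $((675 * 3))                        # replicated pool, size=3: RAW USED = USED * size = 2025 (G)
awk 'BEGIN { print 100 * (4 + 2) / 4 }'  # hypothetical EC pool, k=4 m=2: RAW USED = USED * (k+m)/k = 150 (G)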

#6

Updated by Lei Liu over 5 years ago

Sage Weil wrote:

The problem is that currently the per-pool RAW USED stat is just USED * (replication or EC factor).

Igor is working on a set of changes that adds accounting for the actual disk space consumed on a per-pool basis: https://github.com/ceph/ceph/pull/19454

Hi Sage,

Can we backport this PR to the luminous branch?

#7

Updated by Kefu Chai over 5 years ago

  • Status changed from In Progress to Pending Backport

https://github.com/ceph/ceph/pull/19454

I am not sure we are able to backport PR 19454 to luminous, as it looks like a massive one and is not strictly backward compatible; see its PendingReleaseNotes change. It would be great if someone has the bandwidth to do a minimal backport that addresses the

The problem is that currently the per-pool RAW USED stat is just USED * (replication or EC factor).

issue. I am marking this issue "Pending Backport", but please feel free to change it to "Resolved" if we believe it's too risky to backport or it isn't worth the time and effort to do the backport.

#8

Updated by Kefu Chai over 5 years ago

  • Backport set to mimic, luminous

#9

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #37564: mimic: OSD compression: incorrect display of the used disk space added

#10

Updated by Nathan Cutler over 5 years ago

  • Copied to Backport #37565: luminous: OSD compression: incorrect display of the used disk space added

#11

Updated by Nathan Cutler over 4 years ago

  • Pull request ID set to 19454

#12

Updated by Nathan Cutler about 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
