Bug #20870
OSD compression: incorrect display of the used disk space
Status: Closed
Description
Hi,
I tested bluestore OSD compression with:
In /etc/ceph/ceph.conf:

    bluestore_compression_mode = aggressive
    bluestore_compression_algorithm = lz4
and
    ceph osd pool set rbd compression_algorithm snappy
    ceph osd pool set rbd compression_mode aggressive
    ceph osd pool set rbd compression_required_ratio 0.2
All pools are using a replication of 3.
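(As far as I understand, the per-pool settings override the values from ceph.conf, so the rbd pool should effectively be compressing with snappy. The effective settings can be double-checked per pool with ceph osd pool get rbd compression_algorithm and ceph osd pool get rbd compression_mode.)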
I got the following output from ceph df:
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    46560G     44726G        1833G          3.94
POOLS:
    NAME                    ID     USED       %USED     MAX AVAIL     OBJECTS
    rbd                     0        675G      4.65        13859G      173627
    .rgw.root               1        1077         0        13859G           4
    default.rgw.control     2           0         0        13859G           8
    default.rgw.meta        3           0         0        13859G           0
    default.rgw.log         4           0         0        13859G         287
    cephfs_metadata         5      47592k         0        13859G          34
    cephfs_data             6        289M         0        13859G          80
The cluster was healthy (no PG partially synced).
675G * 3 should give a RAW USED of 2025G, not the 1833G shown.
Same here (another example):
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    46560G     44058G        2501G          5.37
POOLS:
    NAME                    ID     USED       %USED     MAX AVAIL     OBJECTS
    rbd                     0        675G      4.75        13525G      173627
    .rgw.root               1        1077         0        13525G           4
    default.rgw.control     2           0         0        13525G           8
    default.rgw.meta        3           0         0        13525G           0
    default.rgw.log         4           0         0        13525G         287
    cephfs_metadata         5      44661k         0        13525G          79
    cephfs_data             6        329G      2.38        13525G       87369
Same here: (675G + 329G) * 3 should give 3012G of RAW used space, not 2501G.
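Just to make the arithmetic explicit, here is a quick back-of-the-envelope sketch (plain Python; the numbers are copied from the two outputs above, and it assumes replication 3 everywhere and ignores the small pools):

    # Hypothetical check of the arithmetic above (sizes in GiB,
    # replication factor 3, small pools ignored):
    REPLICATION = 3

    for name, logical_g, raw_reported_g in [
        ("first output",  675,       1833),
        ("second output", 675 + 329, 2501),
    ]:
        expected_raw_g = logical_g * REPLICATION
        savings = 1 - raw_reported_g / expected_raw_g
        print(f"{name}: expected {expected_raw_g}G raw, reported {raw_reported_g}G"
              f" (~{savings:.0%} saved, presumably by compression)")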
I used that "feature" to calculate the compression win I got with snappy, but I think the output of ceph df should reflect the real space used.
Or maybe one new column should be added with the real/compressed space usage?
(And maybe a second one with the compression win percentage :D )
Many thanks!
Updated by Sage Weil over 6 years ago
The per-pool USED is the logical user data written (before compression). The RAW USED space is the actual on-disk space consumed (after compression). That's why they differ.
There is some other metadata that tracks the compression stats, but it isn't exposed currently (and I'm not sure it's available in per-pool form). We can open a feature ticket to expose it, but AFAICS the output you show is correct (no bug)...?
Updated by François Blondel over 6 years ago
Exactly!
These ceph df outputs are correct (so it's not a bug), but they could be more precise.
I could imagine something like:
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED
    46560G     44058G        2501G          5.37
POOLS:
    NAME                    ID     USED       %USED     RAW USED     %RAW USED     MAX AVAIL     OBJECTS
    rbd                     0        675G      4.75           xx            xx        13525G      173627
    .rgw.root               1        1077         0            0             0        13525G           4
    default.rgw.control     2           0         0            0             0        13525G           8
    default.rgw.meta        3           0         0            0             0        13525G           0
    default.rgw.log         4           0         0            0             0        13525G         287
    cephfs_metadata         5      44661k         0            0             0        13525G          79
    cephfs_data             6        329G      2.38           xx            xx        13525G       87369
Should I create a new ticket for that feature request, or should I transform that one into a Feature Request?
Many thanks!
Updated by François Blondel over 6 years ago
Hi,
I just discovered "ceph df detail":
GLOBAL:
    SIZE       AVAIL      RAW USED     %RAW USED     OBJECTS
    46560G     43993G        2566G          5.51        255k
POOLS:
    NAME                    ID     QUOTA OBJECTS     QUOTA BYTES     USED       %USED     MAX AVAIL     OBJECTS     DIRTY     READ      WRITE     RAW USED
    rbd                     0      N/A               N/A               675G      4.77        13486G      173627      169k     9001k      692k        2025G
    .rgw.root               1      N/A               N/A               1077         0        13486G           4         4      1170         4         3231
    default.rgw.control     2      N/A               N/A                  0         0        13486G           8         8         0         0            0
    default.rgw.meta        3      N/A               N/A                  0         0        13486G           0         0         0         0            0
    default.rgw.log         4      N/A               N/A                  0         0        13486G         287       287     8488k     5658k            0
    cephfs_metadata         5      N/A               N/A             44696k         0        13486G          79        79       266      397k         130M
    cephfs_data             6      N/A               N/A               329G      2.39        13486G       87369     87369      108k      545k         988G
    default.rgw.reshard     7      N/A               N/A                  0         0        13486G          16        16      260k      173k            0
The sum of all the RAW USED values does not match the GLOBAL RAW USED value, so I guess we still have a bug there :(
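To make the mismatch concrete, here is a small hypothetical script summing that column (the values are copied from the output above; treating the k/M/G suffixes as binary units is my assumption about how ceph df rounds):

    # Sanity check: sum the per-pool RAW USED column from the
    # "ceph df detail" output above and compare with GLOBAL RAW USED.
    UNITS = {"k": 2**10, "M": 2**20, "G": 2**30}

    def to_bytes(value: str) -> float:
        suffix = value[-1]
        if suffix in UNITS:
            return float(value[:-1]) * UNITS[suffix]
        return float(value)  # plain byte count

    per_pool_raw_used = ["2025G", "3231", "0", "0", "0", "130M", "988G", "0"]
    total_g = sum(to_bytes(v) for v in per_pool_raw_used) / 2**30
    print(f"sum of per-pool RAW USED: {total_g:.0f}G, GLOBAL RAW USED: 2566G")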
Updated by Sage Weil about 6 years ago
- Status changed from New to In Progress
- Assignee set to Igor Fedotov
The problem is that currently the per-pool RAW USED stat is just USED * (replication or EC factor).
Igor is working on a set of changes that add accounting for the actual disk space consumed on a per-pool basis: https://github.com/ceph/ceph/pull/19454
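In other words, a simplified sketch of the current behaviour (not the actual Ceph code) looks like this:

    # Simplified sketch of the behaviour described above -- not the real
    # Ceph code. The per-pool RAW USED shown by "ceph df detail" is
    # derived from the logical USED, so it cannot reflect compression:
    def pool_raw_used(logical_used: int, replication_factor: int) -> int:
        return logical_used * replication_factor

    # The GLOBAL RAW USED, by contrast, comes from what the OSDs actually
    # consume on disk (after compression), which is why the per-pool
    # values sum to more than the global one when compression is effective.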
Updated by Lei Liu over 5 years ago
Sage Weil wrote:
The problem is that currently the per-pool RAW USED stat is just USED * (replication or EC factor).
Igor is working on a set of changes that add accounting for the actual disk space consumed on a per-pool basis: https://github.com/ceph/ceph/pull/19454
Hi Sage,
Can we backport this PR to the luminous branch?
Updated by Kefu Chai over 5 years ago
- Status changed from In Progress to Pending Backport
https://github.com/ceph/ceph/pull/19454
I am not sure we are able to backport PR 19454 to luminous, as it looks like a massive one and is not strictly backward compatible; see its PendingReleaseNotes change. If some of us have the bandwidth to do a minimal backport to address the

    The problem is that currently the per-pool RAW USED stat is just USED * (replication or EC factor).

issue, that'd be great. I am marking this issue "Pending Backport", but please feel free to change it to "Resolved" if we believe it's too risky to backport or it isn't worth the time and effort to do the backport.
Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37564: mimic: OSD compression: incorrect display of the used disk space added
Updated by Nathan Cutler over 5 years ago
- Copied to Backport #37565: luminous: OSD compression: incorrect display of the used disk space added
Updated by Nathan Cutler about 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".