Bug #24512 (open)
Raw used space leak
Description
Hello
I'm testing a setup of CephFS over an EC pool with 21 data + 3 coding chunks ([EC_]stripe_unit of 16k).
All OSDs are bluestore, each on one HDD together with its WAL & DB.
I'm experiencing unexpected raw space usage (more than 200% instead of the theoretical 24/21 = 114%).
I've run a lot of copy tests trying to identify the issue.
First, the initial situation (see file "global_stats.txt"): the global stats show 15T of raw space used, but the two pools hold only 7.8T and 1G respectively.
From what I dug up in the docs and the mailing list, space can be lost to:
- WAL+DB: expecting at most 1.5GB/OSD on 80 OSDs: 120GB (observed after deletion of all cephfs data: 91GB)
- unfilled "[pool_]obj_stripe" stripes: max possible loss of 240k files * 1344kB = 315GB
=> we can expect up to 435GB of extra usage, but 7207GB are observed! (arithmetic sketched just below)
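To make that budget explicit, here is the arithmetic as a small Python sketch (all numbers are the ones above; the 1.5GB/OSD WAL+DB footprint is my own estimate):

    # Worst-case "lost space" budget, using the numbers above.
    osds, wal_db_per_osd_gb = 80, 1.5
    files, obj_stripe_kb = 240_000, 1344

    wal_db_gb = osds * wal_db_per_osd_gb          # 120 GB (91 GB observed after
                                                  # deleting all cephfs data)
    padding_gb = files * obj_stripe_kb / 1024**2  # ~308 GB (the ~315 GB above,
                                                  # up to unit rounding)
    budget_gb = wal_db_gb + padding_gb

    print(budget_gb)          # ~428 GB expected at most
    print(7207 / budget_gb)   # observed raw excess is ~17x that budget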
What I'm expecting from EC:
- unfilled EC stripes, with a granularity of 21*[EC_]stripe_unit = 21*16kB = 336kB
=> no loss possible with a [pool_]obj_stripe of 1344kB = 4*336kB, as each full pool stripe is stored as 4 full EC stripes
=> remark: with the default 4MB object size, we would have 4096/336 = 12.19 => 13/12.19 * 24/21 = 121.9% (instead of 114.3%, an increase of ~7%); see the sketch below
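To make that remark concrete, here is a small model of the padding overhead, assuming the only loss is rounding each object up to a whole number of EC stripes:

    import math

    k, m = 21, 3
    stripe_unit_kb = 16
    stripe_width_kb = k * stripe_unit_kb   # 336 kB of data per EC stripe

    def raw_ratio(object_kb):
        """Raw/logical ratio if each object is padded to whole EC stripes."""
        stripes = math.ceil(object_kb / stripe_width_kb)
        return stripes * stripe_width_kb / object_kb * (k + m) / k

    print(raw_ratio(1344))   # 1.143 -> no padding loss, since 1344 = 4 * 336
    print(raw_ratio(4096))   # 1.219 -> the 121.9% figure for default 4MB objects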
What I'm expecting from bluestore:
- some additional DB overhead (keys, indexes, checksums)... but mostly negligible for large data?
- some block alignment optimizations causing fragmentation?
Second, I ran a bunch of tests with cp, rsync and dd (see copy_tests.txt).
It can be seen that when copying in packets of 1M (which is what cp does), the usage is more than 200% of the original files' size. That drops to 130% when the packet size equals the object size, and climbs to 450% when it is small (128k).
I don't know the origin (fragmentation?), but EC loses a large part of its purpose here: saving space.
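One guess, which is purely an assumption on my part (I have not checked the bluestore code): each packet reaches the OSDs as a separate append, and every append allocates at least one allocation unit on each of the 24 shards; bluestore_min_alloc_size_hdd defaults to 64kB. A crude model under those assumptions:

    import math

    k, m = 21, 3
    min_alloc_kb = 64   # bluestore_min_alloc_size_hdd default

    def raw_ratio(packet_kb, object_kb=1344):
        """Raw/logical ratio if each packet becomes one append, and each append
        costs one min_alloc extent per shard (valid for packets <= k*min_alloc)."""
        appends = math.ceil(object_kb / packet_kb)
        return appends * min_alloc_kb * (k + m) / object_kb

    print(raw_ratio(1024))   # ~2.29 -> in the ballpark of the >200% seen with cp
    print(raw_ratio(1344))   # ~1.14 -> close to the ~130% seen at object size

For 128k packets this model predicts much more than the observed 450%, so something (the client writeback cache coalescing small writes?) must be softening the effect, but the trend matches.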
1) Is this the expected behavior? (apart from the fact that you don't recommend such a design for EC) What did I miss?
2) If it is due to bluestore fragmentation, would it be possible to design a "defragmenter" in the future to reclaim the unused space?
Thanks