Feature #41417

closed

rgw: store small object's data part into xattr to avoid disk space wasting

Added by Honggang Yang over 4 years ago. Updated almost 4 years ago.

Status: Rejected
Priority: Normal
Assignee: -
% Done: 0%

Description

In the following test, I stored 100,000 objects of each size (1K, 2K, 4K, 8K, 16K, 32K, and 64K) into a newly created Ceph cluster
and recorded the space usage after each run. As the output shows, a lot of disk space is wasted: in the 1K case,
about 100 MB of small objects consumes at least 6.1 GiB of disk space.

>>>>> 1K:
[root@um14 scripts]# rados df
POOL_NAME                     USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                  256 KiB       4      0      4                  0       0        0     36  36 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data   6.1 GiB  100317      0 100317                  0       0        0   1740 1.3 MiB 907203  98 MiB        0 B         0 B
default.rgw.buckets.index      0 B       1      0      1                  0       0        0 604577 591 MiB 302257 197 MiB        0 B         0 B
default.rgw.control            0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log                0 B     175      0    175                  0       0        0   4770 4.5 MiB   3814 634 KiB        0 B         0 B
default.rgw.meta           256 KiB       5      0      5                  0       0        0     94  77 KiB     29  14 KiB        0 B         0 B
defaults.rgw.buckets.data      0 B       0      0      0                  0       0        0      0     0 B      0     0 B        0 B         0 B
defaults.rgw.buckets.index     0 B       0      0      0                  0       0        0      0     0 B      0     0 B        0 B         0 B

total_objects    100510
total_used       7.1 GiB
total_avail      551 GiB
total_space      558 GiB

[root@um14 scripts]# ceph osd df
ID CLASS WEIGHT  REWEIGHT SIZE    RAW USE DATA    OMAP   META     AVAIL   %USE VAR  PGS STATUS
 0   hdd 0.54489  1.00000 558 GiB 7.1 GiB 6.1 GiB 21 MiB 1003 MiB 551 GiB 1.28 1.00 432     up
                    TOTAL 558 GiB 7.1 GiB 6.1 GiB 21 MiB 1003 MiB 551 GiB 1.28                 
MIN/MAX VAR: 1.00/1.00  STDDEV: 0

...

>>>>> 4K:

[root@um14 scripts]# rados df
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 384 KiB       6      0      6                  0       0        0     30  30 KiB      6   6 KiB        0 B         0 B
default.rgw.buckets.data  6.1 GiB  100001      0 100001                  0       0        0      2   1 KiB 900012 391 MiB        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 600049 587 MiB 300006 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     207      0    207                  0       0        0   2388 2.1 MiB   1530   2 KiB        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     44  38 KiB     26  13 KiB        0 B         0 B

total_objects    100228
total_used       7.1 GiB
total_avail      551 GiB
total_space      558 GiB

>>>>> 8K:

[root@um14 scripts]# rados df
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data  6.1 GiB  100000      0 100000                  0       0        0      0     0 B 900000 781 MiB        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 600030 586 MiB 300001 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     207      0    207                  0       0        0   2369 2.1 MiB   1558     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     46  40 KiB     28  14 KiB        0 B         0 B

total_objects    100225
total_used       7.1 GiB
total_avail      551 GiB
total_space      558 GiB

>>>>> 16K:

+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data  6.1 GiB   99803      0  99803                  0       0        0      0     0 B 898227 1.5 GiB        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 598853 585 MiB 299411 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB    922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     46  40 KiB     28  14 KiB        0 B         0 B

total_objects    99996                                                                              
total_used       7.1 GiB                                                                            
total_avail      551 GiB                                                                            
total_space      558 GiB

>>>>> 32K:

+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data  6.1 GiB   99700      0  99700                  0       0        0      0     0 B 897300 3.0 GiB        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 598229 584 MiB 299099 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB    922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     46  40 KiB     28  14 KiB        0 B         0 B

total_objects    99893                                                                              
total_used       7.1 GiB                                                                            
total_avail      551 GiB                                                                            
total_space      558 GiB

>>>>> 64K:

+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data  6.1 GiB   99845      0  99845                  0       0        0      0     0 B 898605 6.1 GiB        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 599120 585 MiB 299536 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   2241 2.0 MiB   1494     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     87  74 KiB     40  20 KiB        0 B         0 B

total_objects    100038                                                                             
total_used       7.1 GiB                                                                            
total_avail      551 GiB                                                                            
total_space      558 GiB      
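
The rados df figures above can be turned into a per-object footprint. The short Python sketch below divides the USED value of default.rgw.buckets.data by its object count for each run. The inputs are copied from the output above (the 2K run is elided there, so it is omitted here too), and because USED is printed with only two significant digits, the results are approximate. A footprint of roughly 64 KiB per object regardless of payload size would be consistent with a fixed allocation unit of about that size, but that is an inference from the numbers, not something stated in this report.

```python
KIB = 1024
GIB = 1024 ** 3

# (label, payload bytes per object, OBJECTS, USED in GiB) for
# default.rgw.buckets.data, copied from the rados df output above.
RUNS = [
    ("1K", 1 * KIB, 100317, 6.1),
    ("4K", 4 * KIB, 100001, 6.1),
    ("8K", 8 * KIB, 100000, 6.1),
    ("16K", 16 * KIB, 99803, 6.1),
    ("32K", 32 * KIB, 99700, 6.1),
    ("64K", 64 * KIB, 99845, 6.1),
]


def footprint(payload_bytes, objects, used_gib):
    """Return (KiB consumed per object, space amplification factor)."""
    used_bytes = used_gib * GIB
    per_object_kib = used_bytes / objects / KIB
    amplification = used_bytes / (payload_bytes * objects)
    return per_object_kib, amplification


if __name__ == "__main__":
    for label, payload, objects, used in RUNS:
        per_obj, amp = footprint(payload, objects, used)
        print(f"{label:>4}: {per_obj:5.1f} KiB per object, {amp:5.1f}x the payload size")
```

The amplification factor is the pool's USED bytes divided by the payload bytes actually written, so it shrinks toward 1 as the object size approaches the per-object footprint.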

To resolve this problem, we can store a small object's data in the xattr of its RADOS object.
The OSD's DB then packs these small key/value pairs into large contiguous blocks, which reduces the
wasted disk space.
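
As a rough illustration of the idea, the following Python sketch models the placement decision with a toy in-memory store: payloads up to a threshold go into an xattr, larger ones go into the object's data part. It does not reflect the actual RGW patch. ToyRadosObject, INLINE_LIMIT_BYTES, the xattr name user.rgw.inline_data, and the 64 KiB allocation unit are all assumptions made for illustration.

```python
import math
from dataclasses import dataclass, field

INLINE_LIMIT_BYTES = 64 * 1024          # hypothetical inlining threshold
ALLOC_UNIT_BYTES = 64 * 1024            # assumed allocation unit of the data part
INLINE_XATTR = "user.rgw.inline_data"   # hypothetical xattr name


@dataclass
class ToyRadosObject:
    """Toy stand-in for a RADOS object: a data part plus xattrs."""
    data: bytes = b""
    xattrs: dict = field(default_factory=dict)


def put(payload: bytes) -> ToyRadosObject:
    """Store small payloads in an xattr, larger ones in the data part."""
    obj = ToyRadosObject()
    if len(payload) <= INLINE_LIMIT_BYTES:
        obj.xattrs[INLINE_XATTR] = payload
    else:
        obj.data = payload
    return obj


def get(obj: ToyRadosObject) -> bytes:
    """Read the payload back from wherever put() placed it."""
    return obj.xattrs.get(INLINE_XATTR, obj.data)


def data_part_allocated(obj: ToyRadosObject) -> int:
    """Bytes the data part would occupy if rounded up to whole allocation units."""
    return math.ceil(len(obj.data) / ALLOC_UNIT_BYTES) * ALLOC_UNIT_BYTES


if __name__ == "__main__":
    for size in (1024, 64 * 1024, 64 * 1024 + 1):
        obj = put(b"x" * size)
        where = "xattr" if INLINE_XATTR in obj.xattrs else "data"
        print(f"{size:>6} B payload -> {where}, data part allocates {data_part_allocated(obj)} B")
```

In this model an inlined object allocates nothing in the data part, while a payload one byte over the threshold is rounded up to two allocation units. The space taken by the xattr itself inside the DB is deliberately not modeled.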


Related issues 2 (1 open, 1 closed)

Related to RADOS - Bug #41577: Erasure-Coded storage in bluestore has larger disk usage than expected (New)
Related to bluestore - Bug #44213: Erasure coded pool might need much more disk space than expected (Resolved)

Actions #2

Updated by Honggang Yang over 4 years ago

The following are the test results after setting rgw_inline_limit_bytes to 64K. rados df evidently does not account for data stored in xattrs (default.rgw.buckets.data reports 0 B USED), but the total_used figures show that this patch greatly reduces the wasted disk space.

1K:

+ rados df                                                                                         
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD  WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB       4   4 KiB        0 B         0 B
default.rgw.buckets.data      0 B  100000      0 100000                  0       0        0      0     0 B 1000000     0 B        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 600026 586 MiB  300001 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B       0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB     922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     42  36 KiB      24  12 KiB        0 B         0 B

total_objects    100193                                                                             
total_used       1.0 GiB                                                                            
total_avail      557 GiB                                                                            
total_space      558 GiB  

2K:

+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data      0 B   99945      0  99945                  0       0        0      0     0 B 999450     0 B        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 599693 586 MiB 299834 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB    922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     42  36 KiB     24  12 KiB        0 B         0 B

total_objects    100138                                                                             
total_used       1.0 GiB                                                                            
total_avail      557 GiB                                                                            
total_space      558 GiB

4K:

+ rados df
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD  WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB       4   4 KiB        0 B         0 B
default.rgw.buckets.data      0 B  100000      0 100000                  0       0        0      0     0 B 1000000     0 B        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 600026 586 MiB  300001 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B       0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB     922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     42  36 KiB      24  12 KiB        0 B         0 B

total_objects    100193                                                                             
total_used       1.0 GiB                                                                            
total_avail      557 GiB                                                                            
total_space      558 GiB

8K:

+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data      0 B   99813      0  99813                  0       0        0      0     0 B 998130     0 B        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 598907 585 MiB 299441 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB    922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     42  36 KiB     24  12 KiB        0 B         0 B

total_objects    100006                                                                             
total_used       1.4 GiB                                                                            
total_avail      557 GiB                                                                            
total_space      558 GiB 

16K:

+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD  WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB       4   4 KiB        0 B         0 B
default.rgw.buckets.data      0 B  100000      0 100000                  0       0        0      0     0 B 1000000     0 B        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 600030 586 MiB  300001 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B       0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB     922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     44  38 KiB      26  13 KiB        0 B         0 B

total_objects    100193                                                                             
total_used       2.3 GiB                                                                            
total_avail      556 GiB                                                                            
total_space      558 GiB 

32K:
+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB      4   4 KiB        0 B         0 B
default.rgw.buckets.data      0 B   99873      0  99873                  0       0        0      5  66 KiB 998730     0 B        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 599274 586 MiB 299620 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B      0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB    922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     46  40 KiB     28  14 KiB        0 B         0 B

total_objects    100066                                                                             
total_used       3.6 GiB                                                                            
total_avail      554 GiB                                                                            
total_space      558 GiB

64K:

+ rados df                                                                                          
POOL_NAME                    USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD  WR_OPS      WR USED COMPR UNDER COMPR
.rgw.root                 256 KiB       4      0      4                  0       0        0     13  13 KiB       4   4 KiB        0 B         0 B
default.rgw.buckets.data      0 B  100000      0 100000                  0       0        0      0     0 B 1000000     0 B        0 B         0 B
default.rgw.buckets.index     0 B       1      0      1                  0       0        0 600032 586 MiB  300001 195 MiB        0 B         0 B
default.rgw.control           0 B       8      0      8                  0       0        0      0     0 B       0     0 B        0 B         0 B
default.rgw.log               0 B     175      0    175                  0       0        0   1383 1.2 MiB     922     0 B        0 B         0 B
default.rgw.meta          256 KiB       5      0      5                  0       0        0     46  40 KiB      28  14 KiB        0 B         0 B

total_objects    100193                                                                             
total_used       6.8 GiB                                                                            
total_avail      551 GiB                                                                            
total_space      558 GiB 
Actions #3

Updated by Honggang Yang over 4 years ago

The disk space statistics of a newly created cluster when no user data is written are as follows:

[root@um14 ceph]# rados df
POOL_NAME                     USED OBJECTS CLONES COPIES MISSING_ON_PRIMARY UNFOUND DEGRADED RD_OPS      RD WR_OPS    WR USED COMPR UNDER COMPR
.rgw.root                  256 KiB       4      0      4                  0       0        0      0     0 B      4 4 KiB        0 B         0 B
default.rgw.control            0 B       8      0      8                  0       0        0      0     0 B      0   0 B        0 B         0 B
default.rgw.log                0 B     175      0    175                  0       0        0   1575 1.4 MiB   1050   0 B        0 B         0 B
default.rgw.meta               0 B       0      0      0                  0       0        0      0     0 B      0   0 B        0 B         0 B
defaults.rgw.buckets.data      0 B       0      0      0                  0       0        0      0     0 B      0   0 B        0 B         0 B
defaults.rgw.buckets.index     0 B       0      0      0                  0       0        0      0     0 B      0   0 B        0 B         0 B

total_objects    187
total_used       1.0 GiB
total_avail      557 GiB
total_space      558 GiB

total_used is 1.0 GiB even though no user data has been written. This space may be used by db.slow.
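
Subtracting this 1.0 GiB baseline from the total_used values reported earlier in this issue gives a rough net figure for each run. The Python sketch below does only that arithmetic. The inputs are copied from the rados df outputs above (the unpatched 2K run is elided there, so it is shown as n/a), and since every value is rounded to one decimal place, the results are approximate.

```python
BASELINE_GIB = 1.0  # total_used of the empty cluster, from the output above

# total_used in GiB per object size, copied from the rados df outputs in this issue.
BEFORE = {"1K": 7.1, "4K": 7.1, "8K": 7.1, "16K": 7.1, "32K": 7.1, "64K": 7.1}
AFTER = {"1K": 1.0, "2K": 1.0, "4K": 1.0, "8K": 1.4, "16K": 2.3, "32K": 3.6, "64K": 6.8}


def net_used(total_used_gib, baseline_gib=BASELINE_GIB):
    """Return total_used minus the empty-cluster baseline, in GiB."""
    return round(total_used_gib - baseline_gib, 1)


if __name__ == "__main__":
    for size, after in AFTER.items():
        before = BEFORE.get(size)
        before_txt = "n/a" if before is None else f"{net_used(before):.1f}"
        print(f"{size:>4}: before {before_txt:>4} GiB, after {net_used(after):.1f} GiB")
```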

The label information for my bluestore is as follows:

# ceph-bluestore-tool --command show-label --path /var/lib/ceph/osd/ceph-0/
inferring bluefs devices from bluestore path
{
    "/var/lib/ceph/osd/ceph-0/block": {
        "osd_uuid": "16d463da-7a05-4db2-9e14-327b0e6ff81c",
        "size": 599147937792,
        "btime": "2019-08-24T17:13:20.873519+0800",
        "description": "main",
        "bluefs": "1",
        "ceph_fsid": "6a9f4f54-32fd-4321-ba99-9ce84fa5a3af",
        "kv_backend": "rocksdb",
        "magic": "ceph osd volume v026",
        "mkfs_done": "yes",
        "osd_key": "AQCt/2Bdc7q0LBAAvQwDC/CNw5+H9pFZqPzfYA==",
        "ready": "ready",
        "require_osd_release": "15",
        "whoami": "0" 
    }
}
Actions #4

Updated by Patrick Donnelly over 4 years ago

  • Project changed from Ceph to rgw
  • Subject changed from store small object's data part into xattr to avoid disk space wasting to rgw: store small object's data part into xattr to avoid disk space wasting
  • Status changed from New to Fix Under Review
  • Start date deleted (08/24/2019)
  • Pull request ID set to 29863
Actions #5

Updated by Honggang Yang over 4 years ago

An attempt to fix this problem in the BlueStore layer instead:
https://github.com/ceph/ceph/pull/30056

Actions #6

Updated by Igor Fedotov almost 4 years ago

  • Related to Bug #41577: Erasure-Coded storage in bluestore has larger disk usage than expected added
Actions #7

Updated by Igor Fedotov almost 4 years ago

  • Related to Bug #44213: Erasure coded pool might need much more disk space than expected added
Actions #8

Updated by Igor Fedotov almost 4 years ago

  • Status changed from Fix Under Review to Rejected

Closing in favor of the fix for https://tracker.ceph.com/issues/44213
