Bug #23547

compression ratio depends on block size, which is much smaller (16K vs 4M) in multisite sync

Added by Casey Bodley almost 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 0%
Source:
Tags: compression multisite
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Compressors add a block header to each buffer that we pass to compress(), so the overall compression ratio depends on the block size of the input. RGW also stores a user.rgw.compression attribute with the object, which contains an array of these blocks for mapping virtual offsets to compressed offsets.

In RGWPutObj, these buffers are rgw_obj_stripe_size=4M by default, which results in good compression. In multisite sync, we compress buffers as they come in from libcurl, which defaults to 16K blocks. This results in significant size overhead, both in the actual compression ratio and in the size of the compression_block array stored in the user.rgw.compression attribute.
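
A minimal sketch of the effect, using Python's zlib as a stand-in for the RGW compressor plugins (an assumption; the helper compress_in_blocks and the sample data are hypothetical and not part of Ceph). It compresses the same 4 MiB payload once as independent 16K blocks and once as a single 4M block, and counts how many block entries each approach would need to record:

```python
import zlib

def compress_in_blocks(data: bytes, block_size: int):
    """Compress each block independently; return (total_compressed_bytes, block_count)."""
    total = 0
    blocks = 0
    for off in range(0, len(data), block_size):
        # Each call starts a fresh stream, so header and dictionary costs repeat per block.
        total += len(zlib.compress(data[off:off + block_size]))
        blocks += 1
    return total, blocks

# 4 MiB of compressible sample data (hypothetical, for illustration only).
line = b"GET /bucket/object-%06d HTTP/1.1\r\n"
data = b"".join(line % i for i in range(200000))[:4 * 1024 * 1024]

small_total, small_blocks = compress_in_blocks(data, 16 * 1024)        # libcurl-sized
large_total, large_blocks = compress_in_blocks(data, 4 * 1024 * 1024)  # stripe-sized

print("16K blocks:", small_blocks, "entries,", small_total, "bytes")
print("4M blocks: ", large_blocks, "entries,", large_total, "bytes")
```

The 16K case produces 256 block entries for this payload where the 4M case produces one, and its total compressed size is larger because every block pays the per-stream overhead again. The exact byte counts depend on the data and the compressor.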


Related issues

Copied to rgw - Backport #23864: luminous: compression ratio depends on block size, which is much smaller (16K vs 4M) in multisite sync Resolved

History

#1 Updated by Casey Bodley almost 6 years ago

I was unable to increase libcurl's buffer size above 16k. This issue was raised at https://github.com/curl/curl/issues/2372.

It sounds like we'll need some internal buffering in the compression filter to batch these into 4M blocks. Given a bufferlist with lots of 16K buffers, the compressors will still add a header for each 16K buffer, so we may need to make the compressors smarter about this as well.
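
The batching idea can be sketched roughly as follows. This is an illustration in Python with zlib, not the actual RGW filter: the class BatchingCompressor, its methods, and the (original offset, compressed offset, compressed length) tuple layout are all hypothetical names chosen for the example. Incoming chunks are appended to one contiguous buffer, and compression only runs once a full block has accumulated, so the compressor sees a single large buffer rather than many 16K ones:

```python
import zlib

class BatchingCompressor:
    """Hypothetical filter: buffers small chunks and compresses them in large blocks."""

    def __init__(self, block_size=4 * 1024 * 1024):
        self.block_size = block_size
        self.pending = bytearray()  # contiguous staging buffer
        self.blocks = []            # (orig_offset, compressed_offset, compressed_len)
        self.out = []               # compressed blocks, in order
        self.orig_ofs = 0
        self.comp_ofs = 0

    def process(self, chunk: bytes):
        """Accept one incoming chunk; emit a block whenever enough data is buffered."""
        self.pending += chunk
        while len(self.pending) >= self.block_size:
            self._emit(bytes(self.pending[:self.block_size]))
            del self.pending[:self.block_size]

    def flush(self):
        """Compress whatever is left at end of stream as a final short block."""
        if self.pending:
            self._emit(bytes(self.pending))
            self.pending.clear()

    def _emit(self, block: bytes):
        compressed = zlib.compress(block)
        self.blocks.append((self.orig_ofs, self.comp_ofs, len(compressed)))
        self.out.append(compressed)
        self.orig_ofs += len(block)
        self.comp_ofs += len(compressed)

# Usage: 640 chunks of 16 KiB (10 MiB total) arrive one at a time.
f = BatchingCompressor(block_size=4 * 1024 * 1024)
for _ in range(640):
    f.process(b"x" * 16384)
f.flush()
print(len(f.blocks), "blocks recorded")
```

Copying into a contiguous staging buffer costs extra memory and a memcpy per chunk, but it sidesteps the second problem noted above, since the compressor is never handed a list of small buffers.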

#2 Updated by Casey Bodley almost 6 years ago

  • Status changed from In Progress to Fix Under Review

#3 Updated by Casey Bodley almost 6 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport changed from jewel luminous to luminous

#4 Updated by Nathan Cutler almost 6 years ago

  • Copied to Backport #23864: luminous: compression ratio depends on block size, which is much smaller (16K vs 4M) in multisite sync added

#5 Updated by Nathan Cutler almost 6 years ago

  • Status changed from Pending Backport to Resolved
