Feature #8473

rgw: Shard bucket index objects to improve single bucket PUT throughput

Added by Guang Yang over 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
Start date: 05/29/2014
Due date:
% Done: 20%
Source: Community (dev)
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

There is a blueprint discussing the bucket index scalability issue - https://wiki.ceph.com/Planning/Sideboard/rgw:_bucket_index_scalability

To improve scalability, the blueprint mentions a couple of options:
1. Use blind buckets, that is, disable bucket indexing. This removes the bottleneck for single-bucket PUTs, but bucket listing is lost.
2. Shard the bucket index. By spreading a bucket's index across X objects, PUT throughput is expected to improve by up to a factor of X.
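
To make option 2 concrete, here is a minimal sketch of how an object key could be mapped to one of X index shards. It is only an illustration, not the radosgw implementation: the function names, the use of CRC32 as the hash, and the shard object naming are all assumptions.

```python
import zlib


def shard_for_key(object_key: str, num_shards: int) -> int:
    """Return the index shard (0..num_shards-1) for an object key.

    Hypothetical helper: CRC32 stands in for whatever stable hash a
    real implementation would use.
    """
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return zlib.crc32(object_key.encode("utf-8")) % num_shards


def shard_object_name(bucket_id: str, shard: int) -> str:
    """Name of the index object holding one shard (naming is assumed)."""
    return f"{bucket_id}.index.{shard}"


if __name__ == "__main__":
    # Writes to different keys land on different index objects, so they
    # no longer all serialize on a single index object.
    for key in ("photos/a.jpg", "photos/b.jpg", "logs/2014-05-29.txt"):
        shard = shard_for_key(key, num_shards=8)
        print(key, "->", shard_object_name("mybucket", shard))
```

Because the mapping depends only on the key and the shard count, any gateway can find the right shard without coordination. The trade-off is that changing the shard count later moves keys between shards.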

There was also a recent conversation about this - http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/19609

I am prototyping the change. To begin with, I would like to list two basic approaches:
1. Use a radosgw configuration option as a static setting, so that every bucket is sharded in the same way.
   Pros:
   1. Transparent to users: how the index is sharded is an implementation detail and is not exposed to them (in contrast to option 2).
   Cons:
   1. The load on each bucket may differ, so a single value may not suit every use case.
2. Configure sharding per bucket (e.g. disabling the bucket index, the number of shards, etc.). This requires modifying the create-bucket API to accept a new parameter.
   Pros:
   1. Fine-grained control over bucket sharding.
   Cons:
   1. Users must be aware of the implementation detail and make the choice up front (if we do not have a good scale-out strategy).
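
The two approaches are not mutually exclusive. As a rough sketch (the names and the default value below are hypothetical, not existing radosgw options), a gateway-wide default could apply unless a bucket carries its own value set at creation time:

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical gateway-wide setting (approach 1). The value is made up.
DEFAULT_NUM_SHARDS = 8


@dataclass
class BucketInfo:
    name: str
    # Per-bucket value chosen at creation time (approach 2); None if unset.
    num_shards: Optional[int] = None


def effective_num_shards(bucket: BucketInfo,
                         default: int = DEFAULT_NUM_SHARDS) -> int:
    """Resolve the shard count: per-bucket value first, then the default."""
    if bucket.num_shards is None:
        return default
    if bucket.num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    return bucket.num_shards


if __name__ == "__main__":
    print(effective_num_shards(BucketInfo("small")))                # default
    print(effective_num_shards(BucketInfo("busy", num_shards=32)))  # override
```

With this shape, most users never see the setting, while a user who knows a bucket will be write-heavy can still ask for more shards.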

Please review and comment.
Thanks

History

#1 Updated by Neil Levine about 5 years ago

  • Subject changed from [rgw] Shard bucket index objects to improve single bucket PUT throughput to rgw: Shard bucket index objects to improve single bucket PUT throughput

#2 Updated by Ian Colle about 5 years ago

  • Target version set to v0.85

#3 Updated by Ian Colle about 5 years ago

  • Status changed from New to In Progress

#4 Updated by Guang Yang about 5 years ago

  • % Done changed from 0 to 20

Here is the first patch - https://github.com/ceph/ceph/pull/2187

#5 Updated by Ian Colle about 5 years ago

  • Target version changed from v0.85 to v0.86

#6 Updated by Ian Colle almost 5 years ago

  • Target version changed from v0.86 to v0.88

#7 Updated by Ian Colle almost 5 years ago

  • Target version changed from v0.88 to v0.89

#8 Updated by Sage Weil almost 5 years ago

  • Target version changed from v0.89 to v0.90

#9 Updated by Neil Levine almost 5 years ago

  • Target version changed from v0.90 to v0.91

#10 Updated by Sage Weil almost 5 years ago

  • Target version changed from v0.91 to v0.92

#11 Updated by Neil Levine over 4 years ago

  • Target version changed from v0.92 to v0.93 - Last Hammer Sprint

#12 Updated by Neil Levine over 4 years ago

  • Target version changed from v0.93 - Last Hammer Sprint to v0.92

#13 Updated by Yehuda Sadeh over 4 years ago

  • Status changed from In Progress to Resolved

merged at commit:0dca07379b0fd0cda5e84a861257cf9a741f4900
