Feature #8473

closed

rgw: Shard bucket index objects to improve single bucket PUT throughput

Added by Guang Yang almost 10 years ago. Updated about 9 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
% Done: 20%
Source: Community (dev)
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

A blueprint discusses the bucket index scalability issue: https://wiki.ceph.com/Planning/Sideboard/rgw:_bucket_index_scalability

To improve scalability, two options were mentioned:
1. Use a blind bucket, that is, disable bucket indexing. This removes the bottleneck for single-bucket PUTs, but bucket listing is lost.
2. Shard the bucket index. By spreading the index across X objects, PUT throughput is expected to improve by up to X times.
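
As a rough illustration of option 2, the sketch below maps each object key to one of X index shards with a stable hash, so that concurrent PUTs to one bucket update different index objects. The function names, the use of CRC32, and the shard object naming scheme are assumptions made for this example; they are not taken from the RGW code or from the patch.

```python
import zlib


def shard_for_key(object_key: str, num_shards: int) -> int:
    """Pick an index shard for an object key (illustrative only)."""
    if num_shards < 1:
        raise ValueError("num_shards must be >= 1")
    # CRC32 is stable across processes, unlike Python's built-in hash().
    # The hash choice is an assumption for this sketch.
    return zlib.crc32(object_key.encode("utf-8")) % num_shards


def shard_object_name(bucket_id: str, shard: int) -> str:
    """Build a name for the shard's index object (hypothetical scheme)."""
    return f".dir.{bucket_id}.{shard}"


if __name__ == "__main__":
    for key in ("photos/cat.jpg", "photos/dog.jpg", "logs/2014-05-29"):
        shard = shard_for_key(key, 8)
        print(key, "->", shard_object_name("bucket1", shard))
```

With a scheme like this, a PUT touches only one shard, while a bucket listing has to read and merge all X shards.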

There was also a recent conversation about this: http://thread.gmane.org/gmane.comp.file-systems.ceph.devel/19609

I am prototyping the change. To begin with, here are two basic approaches we could take:
1. Use a radosgw configuration option as a static setting, so every bucket is sharded the same way.
   Pros:
   1. Transparent to users: how the index is sharded is an implementation detail and is not exposed to them (unlike option 2).
   Cons:
   1. Load may differ from bucket to bucket, so a single value may not suit every use case.
2. Configure sharding per bucket (e.g. disable bucket indexing, number of shards, etc.). This requires modifying the create-bucket API to accept a new parameter.
   Pros:
   1. Fine-grained control over bucket sharding.
   Cons:
   1. Users must be aware of the implementation detail and make the choice up front (if we don't have a good scale-out strategy).
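
The two approaches are not mutually exclusive. A minimal sketch of how they might combine is shown below: a gateway-wide default (approach 1) that an optional per-bucket value overrides (approach 2). The function name, the parameters, and the convention that 0 means a single unsharded index object are all assumptions for illustration, not part of the proposal or of radosgw.

```python
from typing import Optional


def effective_num_shards(per_bucket: Optional[int], gateway_default: int) -> int:
    """Resolve the shard count for a new bucket (illustrative only).

    per_bucket: value passed at bucket creation, or None if not given.
    gateway_default: static, gateway-wide setting.
    A result of 0 is taken to mean one unsharded index object
    (an assumption of this sketch).
    """
    value = gateway_default if per_bucket is None else per_bucket
    if value < 0:
        raise ValueError("shard count must not be negative")
    return value


if __name__ == "__main__":
    print(effective_num_shards(None, 8))  # falls back to the gateway default
    print(effective_num_shards(32, 8))    # per-bucket value wins
```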

Please help to review and comment.
Thanks

Actions #1

Updated by Neil Levine over 9 years ago

  • Subject changed from [rgw] Shard bucket index objects to improve single bucket PUT throughput to rgw: Shard bucket index objects to improve single bucket PUT throughput
Actions #2

Updated by Ian Colle over 9 years ago

  • Target version set to v0.85
Actions #3

Updated by Ian Colle over 9 years ago

  • Status changed from New to In Progress
Actions #4

Updated by Guang Yang over 9 years ago

  • % Done changed from 0 to 20

Here is the first patch - https://github.com/ceph/ceph/pull/2187

Actions #5

Updated by Ian Colle over 9 years ago

  • Target version changed from v0.85 to v0.86
Actions #6

Updated by Ian Colle over 9 years ago

  • Target version changed from v0.86 to v0.88
Actions #7

Updated by Ian Colle over 9 years ago

  • Target version changed from v0.88 to v0.89
Actions #8

Updated by Sage Weil over 9 years ago

  • Target version changed from v0.89 to v0.90
Actions #9

Updated by Neil Levine over 9 years ago

  • Target version changed from v0.90 to v0.91
Actions #10

Updated by Sage Weil over 9 years ago

  • Target version changed from v0.91 to v0.92
Actions #11

Updated by Neil Levine over 9 years ago

  • Target version changed from v0.92 to v0.93 - Last Hammer Sprint
Actions #12

Updated by Neil Levine over 9 years ago

  • Target version changed from v0.93 - Last Hammer Sprint to v0.92
Actions #13

Updated by Yehuda Sadeh about 9 years ago

  • Status changed from In Progress to Resolved

merged at commit:0dca07379b0fd0cda5e84a861257cf9a741f4900
