Object striping in librados¶
This is a proposal to add a "striped object" API to librados that would stripe big objects across several smaller objects, in the same manner as CephFS does. This would make it possible to store objects of any size efficiently and would bring parallel I/O capabilities to the object store.
- Sebastien Ponce (CERN)
- Giuseppe Lo Presti (CERN)
- Andreas Peters (CERN)
- Sebastien Ponce
The striping would be identical to the one already used in CephFS, with the same definitions of stripe unit and object set.
Detailed Description¶
The current librados can in principle store objects of any size in Rados, but it has a few annoying limitations:
- objects have a maximum size (on the order of a few MB by default)
- big objects (think tens of GB) would hurt cluster efficiency in any case
- rados objects cannot benefit from parallel I/O. For this, you need to use the Ceph Object Storage API
The proposal is to implement these missing features by introducing a striped object concept to the Rados API.
Striped objects have the same API as regular objects, but they are striped and actually stored as a number of regular objects. Each striped object is defined by three striping parameters:
- an object size: the maximum size of each stored object
- a stripe width: the size of a stripe unit
- a stripe count: the number of stripe units read or written concurrently, i.e. the number of stripe units in a stripe, which is also the number of objects in an object set
- the underlying regular objects would have a name derived from that of the striped object, by appending '###<object number>'
- the first object (suffix '###0') would store the striping parameters (object size, stripe width and stripe count) as 3 entries in its metadata
As the underlying objects are regular objects, they could also be read independently.
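To make the layout concrete, here is a minimal Python sketch of how a byte offset in a striped object could map to an underlying object, assuming the usual CephFS-style arithmetic (stripe units are laid out round-robin across the objects of an object set, and a new object set starts once every object in the set is full). The names `StripeLayout`, `locate` and `object_name` are hypothetical and not part of librados, the requirement that the object size be a multiple of the stripe width is an assumption carried over from CephFS layouts, and the metadata entries on the first object are not modelled.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StripeLayout:
    """Hypothetical container for the three striping parameters."""
    object_size: int   # maximum size of each underlying object, in bytes
    stripe_width: int  # size of one stripe unit, in bytes
    stripe_count: int  # number of objects in an object set

    def __post_init__(self):
        if min(self.object_size, self.stripe_width, self.stripe_count) <= 0:
            raise ValueError("striping parameters must be positive")
        # Assumption borrowed from CephFS layouts: an object holds
        # a whole number of stripe units.
        if self.object_size % self.stripe_width != 0:
            raise ValueError("object_size must be a multiple of stripe_width")


def locate(layout, offset):
    """Map a byte offset in the striped object to
    (object number, offset inside that object)."""
    unit = layout.stripe_width
    units_per_object = layout.object_size // unit

    unit_index = offset // unit                  # which stripe unit overall
    stripe = unit_index // layout.stripe_count   # which stripe (row)
    position = unit_index % layout.stripe_count  # which object in the set
    object_set = stripe // units_per_object      # which object set

    object_number = object_set * layout.stripe_count + position
    offset_in_object = (stripe % units_per_object) * unit + offset % unit
    return object_number, offset_in_object


def object_name(striped_name, object_number):
    """Name of an underlying object: '<striped name>###<object number>'."""
    return f"{striped_name}###{object_number}"


# Tiny layout for illustration: 8-byte objects, 4-byte stripe units,
# 2 objects per object set (so one object set holds 16 bytes).
layout = StripeLayout(object_size=8, stripe_width=4, stripe_count=2)
number, inner = locate(layout, 22)
print(object_name("video", number), inner)
```

With this toy layout, bytes 0-3 land in object 0, bytes 4-7 in object 1, bytes 8-11 back in object 0, and byte 16 opens a new object set starting with object 2. Consecutive stripe units fall in different objects, which is what would let a client read or write them in parallel.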
- Implement the striped object API in librados
Build / release tasks¶
- Merge new code into librados
- Add the new API to the documentation