Object striping in librados

Summary

This is a proposal to add a "striped object" API to librados that would stripe large objects across several smaller objects, in the same manner as CephFS. This would make it possible to store objects of any size efficiently and would bring parallel I/O capabilities to the object store.

Owners

  • Sebastien Ponce (CERN)
  • Giuseppe Lo Presti (CERN)
  • Andreas Peters (CERN)

Interested Parties

  • Sebastien Ponce

Current Status

The striping would be identical to the one already used in CephFS, with the same definition of stripe units and object sets.

Detailed Description

The current librados lets you store objects of any size in RADOS, but it has a few annoying limitations:
  • objects have a maximum size (on the order of a few MB by default)
  • big objects (think tens of GB) would in any case hurt cluster efficiency
  • RADOS objects cannot benefit from parallel I/O; for this, you need to use the Ceph Object Storage API

The proposal is to implement these missing features by introducing a striped object concept in the RADOS API.
Striped objects have the same API as regular objects, but they are striped and actually stored as a number of regular objects.

The striping algorithm is the one already used in Ceph and described in detail in the Ceph architecture documentation (http://ceph.com/docs/master/architecture/, part on data striping). It defines three parameters for striping an object:
  • an object size: the maximum size of the stored objects
  • a stripe width: the size of a stripe unit
  • a stripe count: the number of stripe units read or written concurrently, i.e. the number of stripe units in a stripe, which is also the number of objects in an object set
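
How these three parameters map a byte offset of the striped object onto a backing object can be sketched as follows. This is a minimal Python sketch of the striping arithmetic described in the Ceph architecture documentation, not the proposed librados API. The function name `locate` is hypothetical, the parameter names follow this proposal (`stripe_width` is the size of one stripe unit), and the sketch assumes that the object size is a multiple of the stripe width.

```python
def locate(offset, object_size, stripe_width, stripe_count):
    """Map a byte offset in the striped object to (object number, offset in object).

    Hypothetical helper for illustration only. Assumes object_size is a
    multiple of stripe_width.
    """
    if object_size % stripe_width != 0:
        raise ValueError("object_size must be a multiple of stripe_width")

    units_per_object = object_size // stripe_width

    unit_no = offset // stripe_width          # index of the stripe unit overall
    stripe_no = unit_no // stripe_count       # which stripe holds that unit
    stripe_pos = unit_no % stripe_count       # position of the unit in its stripe

    # An object set is a group of stripe_count objects filled round-robin.
    object_set_no = stripe_no // units_per_object
    object_no = object_set_no * stripe_count + stripe_pos

    # Offset of the unit inside its object, plus the offset inside the unit.
    unit_start = (stripe_no % units_per_object) * stripe_width
    return object_no, unit_start + offset % stripe_width
```

For example, with an object size of 8 bytes, a stripe width of 4 bytes and a stripe count of 2, byte 13 of the striped object lands in object 1 at offset 5, and byte 16 opens a new object set in object 2 at offset 0.
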
In practice, the regular objects produced by striping a striped object would have a few special characteristics that make it easy to read the striped object back:
  • they would have a name derived from the name of the striped object, by appending '###<object number>'
  • the first object (suffix '###0') would have 3 entries in its metadata storing the striping parameters (object size, stripe width and stripe count)
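
The naming rule and the three metadata entries could look like the sketch below. It is an illustration under stated assumptions, not the proposed implementation: only the '###<object number>' suffix comes from this proposal, while the helper names, the metadata key names and the choice to store the values as decimal byte strings are hypothetical.

```python
SUFFIX = "###"

# Hypothetical metadata key names; the proposal only says that three entries
# hold the object size, the stripe width and the stripe count.
LAYOUT_KEYS = ("object_size", "stripe_width", "stripe_count")


def piece_name(striped_name, object_number):
    """Name of the regular object holding piece `object_number`."""
    return f"{striped_name}{SUFFIX}{object_number}"


def parse_piece_name(name):
    """Split a piece name back into (striped object name, object number)."""
    base, sep, number = name.rpartition(SUFFIX)
    if not sep:
        raise ValueError(f"not a piece of a striped object: {name!r}")
    return base, int(number)


def layout_metadata(object_size, stripe_width, stripe_count):
    """Metadata entries to attach to the first piece (suffix '###0')."""
    values = (object_size, stripe_width, stripe_count)
    return {key: str(value).encode() for key, value in zip(LAYOUT_KEYS, values)}


def read_layout(metadata):
    """Recover (object_size, stripe_width, stripe_count) from the metadata."""
    return tuple(int(metadata[key]) for key in LAYOUT_KEYS)
```

A reader would fetch the metadata of `piece_name(name, 0)`, decode the layout with `read_layout`, and then compute which pieces to read.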

As they are regular objects, they could also be read independently.

Work items

Coding tasks

  1. Implement the striped object API in librados

Build / release tasks

  1. Merge new code into librados

Documentation tasks

  1. Add the new API to the documentation

Deprecation tasks