Rados redirects


A redirect is like a symlink: it transparently redirects librados clients from one object to another. The main anticipated use case is to redirect to the same object name in a different pool on a different tier of hardware (e.g., demoting a cold object from SSD to HDD).
This will be useful for existing clusters with multiple hardware types, and in the future for things like erasure-coded PGs.


  • Sage Weil (Inktank)

Current Status

Nothing like this currently exists in rados (server or client).

Detailed Description

On the OSD, the object has special metadata that marks it as a redirect (much like symlink != file), and points to another pool/object.
On the client (Objecter/librados), an operation (read or write) that encounters a redirect will resend the request to the new location.
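
The client-side behavior can be illustrated with a toy in-memory model. This is only a sketch, not librados or Objecter code: the `Redirect` type, the dictionary standing in for the cluster, the `read_following_redirects` helper, and the `max_hops` limit are all hypothetical names and assumptions introduced for illustration. The hop limit in particular is not part of this proposal; it is included so that a redirect loop cannot make the example spin forever.

```python
from dataclasses import dataclass
from typing import Dict, Tuple, Union

# (pool name, object name); a hypothetical way to address an object.
Location = Tuple[str, str]


@dataclass
class Redirect:
    """Hypothetical marker stored in place of object data."""
    target: Location


class TooManyRedirects(Exception):
    pass


# Toy stand-in for the cluster: each location holds data or a redirect.
Store = Dict[Location, Union[bytes, Redirect]]


def read_following_redirects(store: Store, loc: Location,
                             max_hops: int = 8) -> Tuple[Location, bytes]:
    """Read an object, resending to the new location on each redirect.

    Returns the location that finally served the read and its data.
    max_hops is an assumed safety limit, not something the design specifies.
    """
    for _ in range(max_hops + 1):
        entry = store[loc]
        if isinstance(entry, Redirect):
            loc = entry.target  # resend the same request to the target
            continue
        return loc, entry
    raise TooManyRedirects(loc)
```

In this model a read of `("ssd", "foo")` that finds a redirect to `("hdd", "foo")` returns the data stored in the HDD pool without the caller having to know the object moved.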
A new rados operation will atomically create a redirect. Normally it would be a part of a transaction that verifies the existing object is the version we expect (the one we copied to cold storage), removes the old object, and creates the redirect. The process of demoting cold objects would be racy but safe and could be driven by some external agent.
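
The demotion sequence described above might look roughly like the following toy model. It is a sketch under stated assumptions rather than OSD code: `ToyOSD`, `Obj`, `create_redirect`, and `demote` are hypothetical names, a single lock stands in for the atomicity of an OSD transaction, and object versions are plain integers bumped on every write.

```python
import threading
from dataclasses import dataclass
from typing import Dict, Optional, Tuple

Location = Tuple[str, str]  # (pool name, object name), hypothetical


@dataclass
class Obj:
    version: int
    data: Optional[bytes] = None
    redirect: Optional[Location] = None


class ToyOSD:
    """In-memory stand-in for one storage tier (assumption, not Ceph code)."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self.objects: Dict[Location, Obj] = {}

    def write(self, loc: Location, data: bytes) -> int:
        with self._lock:
            old = self.objects.get(loc)
            version = old.version + 1 if old else 1
            self.objects[loc] = Obj(version=version, data=data)
            return version

    def read(self, loc: Location) -> Obj:
        with self._lock:
            return self.objects[loc]

    def create_redirect(self, loc: Location, expected_version: int,
                        target: Location) -> bool:
        """Atomically verify the version, drop the data, store the redirect."""
        with self._lock:
            obj = self.objects.get(loc)
            if obj is None or obj.redirect is not None:
                return False
            if obj.version != expected_version:
                return False  # object changed since it was copied
            self.objects[loc] = Obj(version=obj.version + 1, redirect=target)
            return True


def demote(hot: ToyOSD, cold: ToyOSD, loc: Location, cold_pool: str) -> bool:
    """External-agent demotion: copy to cold storage, then swap in a redirect."""
    obj = hot.read(loc)
    if obj.redirect is not None:
        return False  # already demoted
    target = (cold_pool, loc[1])
    cold.write(target, obj.data)
    return hot.create_redirect(loc, obj.version, target)
```

If a write lands between the copy and the `create_redirect` call, the version check fails and the hot object is left untouched. The only cost is a stale copy in the cold pool, which is the sense in which the process is racy but safe.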
Promoting an object is trickier. It may require the OSD to do it (blocking IO to the object while the promotion is in progress). Promotion could be triggered on first write or on other events, or explicitly via some rados operation (if some external agent is handling the tiering).

Work items

Coding tasks

  1. create redirect types on OSD
  2. extend protocol to allow OSD to send objecter to new location
  3. add rados operations to create a redirect
  4. [optional] add OSD support to safely demote an object, either by blocking IO or by failing with EAGAIN if a conflicting update comes in
  5. [optional] add OSD support to safely promote an object
  6. build stress test tool, or extend RadosModel accordingly