1E - RADOS Gateway refactor into library internal APIs

Live Pad

The live pad can be found here: [pad]

Summit Snapshot

Coding tasks
  1. [Wido] Draft librgw API (informed by current rgw and by the APIs of the initial users that will utilize it, e.g. Apache module interface, nginx plugin API, etc.)
  2. > Highest level API, where you just do GET and PUT
  3. [Yehuda?] Refactor rgw as needed to match that interface
  4. [Gary] Repackage rgw bits into librgw (C++, C)
  5. > Lower level API for listing buckets, creating objects, etc.
  6. [Wido?] Python bindings for librgw
  7. Write rgwhttp (Python? Twisted Web server?)
  8. Write a MQ reader for Varnish (or any other caching reverse proxy)
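
A rough illustration of the two API tiers noted in items 2 and 5 above, as a Python sketch. Nothing here is a real librgw API (none exists yet): `LowLevelStore`, `Gateway` and all method names are hypothetical, and an in-memory dict stands in for RADOS.

```python
class LowLevelStore:
    """Hypothetical lower-level tier: bucket and object primitives."""

    def __init__(self):
        self._buckets = {}  # bucket name -> {object key: bytes}

    def create_bucket(self, bucket):
        # Idempotent: an existing bucket is left untouched.
        self._buckets.setdefault(bucket, {})

    def list_buckets(self):
        return sorted(self._buckets)

    def write_object(self, bucket, key, data):
        self._buckets[bucket][key] = bytes(data)

    def read_object(self, bucket, key):
        return self._buckets[bucket][key]

    def list_objects(self, bucket):
        return sorted(self._buckets[bucket])


class Gateway:
    """Hypothetical highest-level tier: plain GET and PUT on /bucket/key."""

    def __init__(self, store):
        self._store = store

    @staticmethod
    def _split(path):
        bucket, _, key = path.strip("/").partition("/")
        if not bucket or not key:
            raise ValueError("expected /bucket/key, got %r" % path)
        return bucket, key

    def put(self, path, data):
        bucket, key = self._split(path)
        self._store.create_bucket(bucket)
        self._store.write_object(bucket, key, data)

    def get(self, path):
        bucket, key = self._split(path)
        return self._store.read_object(bucket, key)
```

An HTTP front end (Apache module, nginx plugin, rgwhttp) would only need the `Gateway` tier; agents and bindings could use the lower tier directly.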
Build / release tasks
  1. Package librgw
  2. Package rgwhttp (Own Git repository?)
  3. Package the MQ reader
Documentation tasks
  1. Document the API of librgw
  2. Document rgwhttp
  3. Write new documentation on how to set everything up
Deprecation tasks
  1. Remove RGW from the main repository
  2. Remove any old documentation
Other use cases:
  1. storing large objects with atomic replacement, collection/bucket indices
  2. object expiration agents? (instead of embedding in radosgw daemon)
  3. librgw could provide a migration path from Swift to native RADOS access without proxy server/gateway bottlenecks
OT: other S3/Swift API features we are missing:
  1. versioning
  2. object expiration (delete after, delete at)
  3. Swift temp URLs, #3454
  4. temporary security credentials

1. Why integrate with Varnish specifically? This doesn't seem very Ceph-related. It's basically an entirely different layer.
For example, we have multiple layers of Varnish in our infrastructure in front of Apache/radosgw and we handle purges already via pre-existing mechanisms (multicast HTCP).
I'm guessing most people have their own way of handling this (CDNs etc.). For some people this might just mean TTLs; others might care about instant purges.

-> Having a "hooks" system where radosgw invokes something e.g. on PUT/POST/DELETE operations could potentially solve this use case.

  • -> We probably don't want this in librgw in the beginning, maybe in a later version. Could be handled by the HTTP server as well.
  • perhaps a unix socket or fork/exec
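
A minimal sketch of what such a hooks mechanism could look like, assuming it lives in the HTTP server rather than in librgw. `HookRegistry`, `fire` and `exec_hook` are hypothetical names, not existing radosgw interfaces; only the fork/exec variant is shown, via the standard `subprocess` module.

```python
import subprocess
import sys

# Operations that change data and may need, for example, a cache purge.
MUTATING_METHODS = {"PUT", "POST", "DELETE"}


class HookRegistry:
    """Hypothetical registry of callables invoked after mutating requests."""

    def __init__(self):
        self._hooks = []

    def register(self, hook):
        self._hooks.append(hook)

    def fire(self, method, bucket, key):
        """Call every hook for a mutating operation.

        Returns the number of hooks invoked (0 for read-only methods).
        """
        method = method.upper()
        if method not in MUTATING_METHODS:
            return 0
        for hook in self._hooks:
            try:
                hook(method, bucket, key)
            except Exception as exc:
                # A failing hook is logged but must not fail the request.
                print("hook failed: %s" % exc, file=sys.stderr)
        return len(self._hooks)


def exec_hook(argv):
    """Build a hook that runs an external command (the fork/exec option).

    The method, bucket and key are appended as command-line arguments.
    """
    def hook(method, bucket, key):
        subprocess.run(list(argv) + [method, bucket, key],
                       check=True, timeout=5)
    return hook
```

A Varnish purge script, an HTCP sender or an MQ publisher could each be plugged in as one hook, which keeps cache invalidation out of librgw itself.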

2. Python isn't very good for performance and it's one of Swift's bottlenecks. This would seem a step backwards.
-> It could be any language once librgw exists

3. Would it make sense to support WSGI middlewares in the Python webserver to provide an easier migration path from Swift?
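
-> For reference, a sketch of how a WSGI middleware would wrap a Python rgwhttp, assuming rgwhttp exposes a standard WSGI application. `rgw_app` is a stand-in (no such application exists yet) and `HeaderMiddleware` is a hypothetical example; `filter_factory` follows the Paste Deploy convention that Swift middlewares use.

```python
def rgw_app(environ, start_response):
    """Stand-in for a hypothetical rgwhttp WSGI application."""
    body = ("%s %s" % (environ["REQUEST_METHOD"],
                       environ["PATH_INFO"])).encode("utf-8")
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


class HeaderMiddleware:
    """Example middleware: adds one response header, passes the rest through."""

    def __init__(self, app, name, value):
        self.app = app
        self.name = name
        self.value = value

    def __call__(self, environ, start_response):
        def wrapped_start(status, headers, exc_info=None):
            # Append our header before handing off to the real start_response.
            headers = list(headers) + [(self.name, self.value)]
            return start_response(status, headers, exc_info)
        return self.app(environ, wrapped_start)


def filter_factory(global_conf, **local_conf):
    """Paste Deploy style entry point, as used by Swift middlewares."""
    name = local_conf.get("name", "X-Example-Middleware")
    value = local_conf.get("value", "enabled")

    def factory(app):
        return HeaderMiddleware(app, name, value)
    return factory
```

Because the middleware only depends on the WSGI calling convention, an existing Swift middleware that does not reach into Swift internals could in principle be stacked on rgwhttp the same way.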