RADOS Gateway refactor into library internal APIs


The RADOS Gateway (RGW) currently runs as a FastCGI process behind Apache (preferred), nginx or lighttpd. This setup has various issues, and it would be better to make the RGW a stand-alone HTTP server that can be placed behind a reverse proxy.


  • Wido den Hollander (42on)

Interested Parties

Current Status

It's currently just an idea and no code has been written yet.

Detailed Description

The idea is to separate the RADOS Gateway into two parts:
  • librgw
  • rgwhttp
The first step is to move most of the logic into librgw, such as:
  • Bucket creation
  • User creation
  • Object read and write operations
  • Authorization

librgw would be C/C++ code that communicates with librados. It would be part of the main Ceph repository.
Python bindings would be written on top of librgw, so that the actual HTTP implementation could be done in Python.
Note: It should all be backwards compatible! No changes should be made to the actual bucket and object storage! This new design should be a drop-in replacement for the current RADOS Gateway. Users should NOT be required to re-upload their data.
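
To make the intended split more concrete, here is a minimal sketch of the kind of interface the Python bindings might expose. No such bindings exist yet, so every name here (InMemoryRGW, AccessDenied, create_user, create_bucket, put_object, get_object) is hypothetical, and the in-memory dictionaries only stand in for the calls librgw would make to librados.

```python
from dataclasses import dataclass, field


class AccessDenied(Exception):
    """Raised when a user is not allowed to touch a bucket."""


@dataclass
class InMemoryRGW:
    """Hypothetical stand-in for librgw Python bindings (not a real API)."""

    users: set = field(default_factory=set)
    buckets: dict = field(default_factory=dict)  # bucket name -> owner uid
    objects: dict = field(default_factory=dict)  # (bucket, key) -> bytes

    def create_user(self, uid):
        self.users.add(uid)

    def create_bucket(self, uid, bucket):
        # Only known users may create buckets.
        if uid not in self.users:
            raise AccessDenied(uid)
        self.buckets[bucket] = uid

    def _authorize(self, uid, bucket):
        # Simplified authorization: only the bucket owner has access.
        if self.buckets.get(bucket) != uid:
            raise AccessDenied(f"{uid} cannot access {bucket}")

    def put_object(self, uid, bucket, key, data):
        self._authorize(uid, bucket)
        self.objects[(bucket, key)] = bytes(data)

    def get_object(self, uid, bucket, key):
        self._authorize(uid, bucket)
        return self.objects[(bucket, key)]
```

An HTTP front end would then only translate requests into calls such as rgw.put_object("alice", "photos", "a.txt", b"hi") and map AccessDenied to an HTTP 403 response.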

Stand-alone webserver (rgwhttp)

A stand-alone webserver could be written in Python, and it could do more than just talk to librgw.
Note: Using Python Twisted Web is also a viable option, since its event-driven design handles many concurrent connections well.
The benefit here is that compiling mod_fastcgi manually is no longer required, and neither is running Apache or any other HTTP server.
Separating the logic (librgw) from the HTTP and XML handling would also make the code much cleaner and more accessible.
The rgwhttp server could live in a stand-alone Git repository, so changes there would not affect any of Ceph's core code. This would also make it much easier for others to contribute.
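
As a rough illustration of how thin rgwhttp could be, the sketch below uses Python's standard-library http.server rather than Twisted Web, purely to keep the example dependency-free. The STORE dictionary is a hypothetical stand-in for librgw, and handle, RGWHandler and make_server are illustrative names, not part of any existing code. Authentication and XML responses are left out.

```python
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Hypothetical stand-in for librgw: maps "/bucket/key" paths to object bytes.
STORE = {}


def handle(method, path, body=b""):
    """Pure request logic, independent of the HTTP layer.

    Returns a (status, payload) tuple.
    """
    if method == "PUT":
        STORE[path] = body
        return 200, b""
    if method == "GET":
        if path in STORE:
            return 200, STORE[path]
        return 404, b"NoSuchKey"
    return 405, b"MethodNotAllowed"


class RGWHandler(BaseHTTPRequestHandler):
    """Translates HTTP requests into handle() calls and back."""

    def _reply(self, body=b""):
        status, payload = handle(self.command, self.path, body)
        self.send_response(status)
        self.send_header("Content-Length", str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)

    def do_GET(self):
        self._reply()

    def do_PUT(self):
        length = int(self.headers.get("Content-Length", 0))
        self._reply(self.rfile.read(length))

    def log_message(self, fmt, *args):
        pass  # keep the sketch quiet


def make_server(host="127.0.0.1", port=8080):
    """Build (but do not start) the stand-alone server."""
    return ThreadingHTTPServer((host, port), RGWHandler)
```

Calling make_server().serve_forever() would start the gateway on port 8080, where a reverse proxy could be pointed at it. Keeping handle() free of HTTP details mirrors the proposed librgw/rgwhttp separation and makes the logic testable without opening a socket.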

Reverse HTTP proxy and caching

One of the current problems with the RADOS Gateway is scaling and, with it, object caching.
When running multiple RGW instances you can't cache objects by default, since an object can be modified via RGW server A while server B is still caching an old version of it.
Since the RGW is 100% HTTP, it would be very beneficial for scaling if a reverse proxy like Varnish could be used in front of the RGW. Varnish could cache objects until they are purged from its caches or the TTL expires.
Attached is a possible flow of how the RGW could work.
All RGW servers would be connected to a central message queue (MQ), to which they submit a message as soon as an object changes through them. The caches could in turn read from the MQ and purge the affected objects, so that the caches do not become inconsistent.
RabbitMQ, for example, would work here.
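
The invalidation flow can be sketched in a single process. In the example below a small FanoutBus class built on queue.Queue stands in for the message queue (with RabbitMQ this would roughly correspond to a fanout exchange), a shared dictionary stands in for the Ceph cluster, and Cache stands in for a caching reverse proxy such as Varnish. All class and method names are hypothetical.

```python
import queue


class FanoutBus:
    """In-process stand-in for a fanout exchange on a message queue."""

    def __init__(self):
        self._queues = []

    def subscribe(self):
        # Every subscriber gets its own queue, so all caches see all messages.
        q = queue.Queue()
        self._queues.append(q)
        return q

    def publish(self, message):
        for q in self._queues:
            q.put(message)


class Gateway:
    """An RGW server that announces every object change on the bus."""

    def __init__(self, storage, bus):
        self.storage = storage  # shared dict standing in for the Ceph cluster
        self.bus = bus

    def put(self, path, data):
        self.storage[path] = data
        self.bus.publish(path)  # tell every cache that this object changed

    def get(self, path):
        return self.storage[path]


class Cache:
    """A caching proxy that purges entries named in bus messages."""

    def __init__(self, gateway, bus):
        self.gateway = gateway
        self.inbox = bus.subscribe()
        self.entries = {}

    def drain_purges(self):
        # Apply all pending purge messages without blocking.
        while True:
            try:
                path = self.inbox.get_nowait()
            except queue.Empty:
                return
            self.entries.pop(path, None)

    def get(self, path):
        self.drain_purges()
        if path not in self.entries:
            self.entries[path] = self.gateway.get(path)
        return self.entries[path]
```

Here a write through one gateway causes a cache in front of another gateway to drop its copy on the next read. In a real deployment purges would arrive asynchronously over the network, so a short window of staleness would remain and the TTL would still act as a safety net.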

Direct Object access

Since all the logic goes into librgw, you are no longer required to use the HTTP gateway to access objects stored in the RGW. If, for example, somebody wants to write a web interface that accesses objects directly, they can do so by talking to librgw and bypassing the whole HTTP stack.
Maintenance tasks for scanning objects could do the same. If they have direct access to the Ceph cluster, they can talk to librgw directly. CLI tools could perhaps even be written to modify objects.
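
A maintenance scan of this kind might look like the sketch below. The objects mapping is a hypothetical stand-in for whatever listing and read calls librgw would eventually provide, and scan_bucket is an illustrative name only. The point is that the task never touches HTTP.

```python
import hashlib


def scan_bucket(objects, bucket):
    """Yield (key, size, sha256 hex digest) for every object in a bucket.

    `objects` maps (bucket, key) -> bytes and stands in for direct
    librgw access; no HTTP request is involved.
    """
    for (obj_bucket, key), data in sorted(objects.items()):
        if obj_bucket == bucket:
            yield key, len(data), hashlib.sha256(data).hexdigest()
```

A CLI tool could wrap such a function to print a report per bucket, or compare the digests against an external inventory.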

Work items