Feature #48388

Updated by Ernesto Puerta about 3 years ago

h3. Summary

To reduce the risk of mgr modules, and the dashboard in particular, degrading Ceph cluster performance through a higher frequency of API calls (unlike other mgr modules, dashboard load is predominantly user-driven), a multi-layered caching approach should be put in place.

h3. Current status

Existing caching approaches and issues:
* Ceph-mgr itself caches many API calls (among them @get_module_option@, @get@, @get_server@ and @get_metadata@), so not every request to the "ceph-mgr API":https://docs.ceph.com/en/latest/mgr/modules/ hits the Ceph cluster. However, @send_command()@ is not cached and might have a performance impact.
* Additionally, one bottleneck in ceph-mgr is @PyFormatter@, the class responsible for deserializing C++ binary structs into Python objects. For big objects (osd_map) the cost of this deserialization is not negligible, so it might be worth caching the resulting deserialized Python object, or exploring an incremental approach that avoids processing the same data over and over.
* Dashboard back-end: "ViewCache":https://github.com/ceph/ceph/pull/20103/commits/349c1ff3d278cacc64e88f17c0edbbbc154ac4d0 decouples REST controller requests from ceph-mgr API requests and allows data to be fetched asynchronously.
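
The idea of caching the deserialized Python object can be sketched as follows. This is only an illustration under stated assumptions: it assumes the object exposes a cheap version number (called an epoch here) that changes whenever the data changes. @EpochCache@, @fake_epoch@ and @fake_deserialize@ are hypothetical names and are not part of ceph-mgr.

```python
from typing import Any, Callable, Optional


class EpochCache:
    """Reuse a deserialized object until its source epoch changes.

    Hypothetical helper: not part of ceph-mgr.
    """

    def __init__(self,
                 get_epoch: Callable[[], int],
                 deserialize: Callable[[], Any]) -> None:
        self._get_epoch = get_epoch      # assumed cheap: current version number
        self._deserialize = deserialize  # assumed expensive: full deserialization
        self._cached_epoch: Optional[int] = None
        self._cached_value: Any = None

    def get(self) -> Any:
        epoch = self._get_epoch()
        if epoch != self._cached_epoch:
            # The data changed (or this is the first call): pay the cost once.
            self._cached_value = self._deserialize()
            self._cached_epoch = epoch
        return self._cached_value


# Stand-ins for the real cluster state, so the sketch runs on its own.
state = {"epoch": 1, "deserializations": 0}


def fake_epoch() -> int:
    return state["epoch"]


def fake_deserialize() -> dict:
    state["deserializations"] += 1
    return {"epoch": state["epoch"], "osds": [0, 1, 2]}


osd_map_cache = EpochCache(fake_epoch, fake_deserialize)
```

Repeated calls to @osd_map_cache.get()@ return the same object until @state["epoch"]@ changes. Because the same object is handed to every caller, callers would have to treat it as read-only.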

The following picture shows the existing approach (ViewCache) and the ones to explore, as well as other potential points (PyFormatter and the non-cached @send_command@):

!Ceph%20Dashboard%20Caching.png!

h3. Proposal

Layers:
* *Ceph-mgr API*:
** C++: this is optimal, as the cached data is shared across modules. However, it is less trivial to implement.
** Python: "@cachetools@":https://cachetools.readthedocs.io/en/stable/#cachetools.func.ttl_cache. This could be introduced per module (every module interacts with its own version of the cached ceph-mgr API methods) or shared (all modules consume the same cached versions of the ceph-mgr API methods, although this could cause issues if a module modifies the objects returned by the cached methods).
* *Dashboard back-end*:
** Python: @cachetools@
** "CherryPy Cache":https://docs.cherrypy.org/en/3.2.6/refman/lib/caching.html (it also takes care of the HTTP caching)
* *Dashboard front-end*:
** HTTP Cache (browser provided?)
** Typescript/JS: "memoize-cache-decorator":https://www.npmjs.com/package/memoize-cache-decorator or "ts-cacheable":https://www.npmjs.com/package/typescript-cacheable
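
A minimal sketch of the Python layer follows. To stay self-contained it uses a hand-written stand-in built on the standard library rather than the @cachetools@ decorator linked above. @ttl_cached@ is a hypothetical helper, and the @get_server@ defined here is a fake that only borrows the name of the ceph-mgr API method. Returning a deep copy is one possible way to address the concern about modules modifying shared cached objects, at the price of an extra copy per call.

```python
import copy
import functools
import time
from typing import Any, Callable, Dict, Tuple


def ttl_cached(ttl: float, timer: Callable[[], float] = time.monotonic):
    """Cache results per argument tuple for `ttl` seconds (hypothetical helper).

    Not thread-safe; a real implementation would need locking.
    """
    def decorator(func):
        entries: Dict[Tuple, Tuple[float, Any]] = {}

        @functools.wraps(func)
        def wrapper(*args):
            now = timer()
            hit = entries.get(args)
            if hit is None or now - hit[0] >= ttl:
                # Missing or expired entry: call through and store the result.
                hit = (now, func(*args))
                entries[args] = hit
            # Return a copy so one caller cannot mutate what others will see.
            return copy.deepcopy(hit[1])
        return wrapper
    return decorator


calls = {"count": 0}
clock = {"now": 0.0}  # fake clock so the example is deterministic


@ttl_cached(ttl=5.0, timer=lambda: clock["now"])
def get_server(hostname: str) -> dict:
    # Fake stand-in for an expensive ceph-mgr API call.
    calls["count"] += 1
    return {"hostname": hostname, "services": ["osd.0"]}
```

With this sketch, two calls to @get_server("node1")@ within five seconds of the fake clock trigger a single underlying call, and changes made to the first returned dictionary do not show up in the second.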

Pros:
* Reduced load in ceph-mgr
* Shorter response times

Cons:
* Increased memory usage
* Stale data (although TTL caches can mitigate this)
* Data serialization issues
* Leaks/ref counting issues
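
The memory cost can be capped by bounding the number of cached entries. The sketch below uses the standard library's @functools.lru_cache@ purely as an illustration of the trade-off. @describe_pool@ is a hypothetical function, and this decorator has no TTL, so on its own it does not address stale data.

```python
import functools


@functools.lru_cache(maxsize=2)
def describe_pool(pool_id: int) -> str:
    # Hypothetical stand-in for an expensive lookup.
    return f"pool-{pool_id}"


# Three distinct keys with maxsize=2: the least recently used entry is evicted.
for pool_id in (1, 2, 3):
    describe_pool(pool_id)

info = describe_pool.cache_info()
```

Here @info.currsize@ never exceeds @maxsize@. The price is that evicted keys have to be recomputed on their next use.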
