Rados cache tier promotion queue and throttling


Sage Weil (Red Hat)
Mark Nelson (Red Hat)

Interested parties

Mingxin Liu (UbuntuKylin)


Each cache miss that results in the promotion of an object from the base tier to the cache tier triggers a lot of load on the pool -- 3x writes plus the associated promotion overhead. Recent analysis has shown that this load is significant and that it makes overall tiered performance extremely sensitive to the miss rate. If the promotion rate gets too high, the performance of the cache plummets and the system ends up much slower than without the cache tier.

A simple experiment limited promotion to a small fraction of the misses (e.g., 0.1%). When this was done, performance improved significantly and was (yay) better than the base tier alone. (For a read workload, we can always choose not to promote by proxying the read to the base tier.)
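A minimal sketch of that experiment, assuming a simple probabilistic gate on read misses (the names and the 0.1% knob are illustrative, not actual Ceph code or config options):

```python
import random

# Hypothetical fraction of misses allowed to trigger a promotion (0.1%).
PROMOTE_FRACTION = 0.001

def handle_read_miss(obj_id, promote_fraction=PROMOTE_FRACTION, rng=random.random):
    """Decide whether a cache miss should promote the object or just
    proxy the read to the base tier."""
    if rng() < promote_fraction:
        return "promote"  # copy the object up into the cache tier
    return "proxy"        # forward the read to the base tier, no promotion
```

Because the proxied read costs little, dialing `promote_fraction` down trades a slightly higher steady-state miss cost for a dramatically lower promotion load.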

The proxy overhead is pretty small, so this gives us quite a bit of flexibility.


When we decide we should promote (e.g., because the object appears in the last N hitsets), put the object in a queue. Structure that queue as an MRU (most recently used) list.

Trigger async promotions from the head of that list.
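The queue described above could be sketched as an OrderedDict-based MRU list; the class and method names here are illustrative, not Ceph code:

```python
from collections import OrderedDict

class PromotionQueue:
    """MRU list of objects waiting to be promoted: the most recently
    requested object sits at the head, so async promotion work drains
    the hottest candidates first."""

    def __init__(self):
        self._mru = OrderedDict()  # obj_id -> size_bytes; head = most recent

    def push(self, obj_id, size_bytes):
        # Re-queueing an already-queued object moves it back to the head.
        self._mru.pop(obj_id, None)
        self._mru[obj_id] = size_bytes
        self._mru.move_to_end(obj_id, last=False)  # move to head

    def pop_head(self):
        """Take the most recently used candidate for async promotion."""
        if not self._mru:
            return None
        return self._mru.popitem(last=False)  # (obj_id, size_bytes)
```

Keeping the list MRU-ordered means that if promotions fall behind, the objects that get promoted are the ones seen most recently, while stale candidates quietly age out at the tail.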

Throttle the number of promotions in flight and/or based on an absolute cap on objects/sec or bytes/sec.
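A sketch of such a throttle, capping both the number of in-flight promotions and the bytes in flight; the limits and names are illustrative knobs, not actual Ceph option names:

```python
class PromotionThrottle:
    """Admission control for async promotions: enforces a cap on
    concurrent promotions and on total promotion bytes in flight."""

    def __init__(self, max_in_flight=8, max_bytes_in_flight=64 << 20):
        self.max_in_flight = max_in_flight
        self.max_bytes_in_flight = max_bytes_in_flight
        self.in_flight = 0
        self.bytes_in_flight = 0

    def try_start(self, size_bytes):
        """Admit a promotion only if both caps have headroom; otherwise
        the caller leaves the object on the queue for a later attempt."""
        if (self.in_flight >= self.max_in_flight or
                self.bytes_in_flight + size_bytes > self.max_bytes_in_flight):
            return False
        self.in_flight += 1
        self.bytes_in_flight += size_bytes
        return True

    def finish(self, size_bytes):
        """Called when an async promotion completes."""
        self.in_flight -= 1
        self.bytes_in_flight -= size_bytes
```

A rate-based cap (objects/sec or bytes/sec) could be layered on the same `try_start` hook, e.g. with a token bucket refilled on a timer.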

Once proxy-write is in place, we can do the same thing for writes as well -- when the base tier allows it. (If the base tier is EC, many writes cannot be proxied.)