Documentation #13092
Cache tiering, clarification on cache miss behaviour (Closed)
Description
I'm opening this bug at the request of gregsfortytwo, who was very helpful in a brief discussion on IRC (#ceph-devel).
Referring to http://ceph.com/docs/master/rados/operations/cache-tiering/
The first diagram there seems to imply that the Objecter can request cold data directly from the backing tier. Is this the case?
The other point I'm not clear on is write requests to a write-back cache: are the writes forwarded to the backing tier, does the Objecter go directly to the backing tier, or does the data get moved to the cache tier with the write then applied there?
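To make the two write paths under discussion concrete, here is a toy sketch of them. This is not Ceph code: the class, method names, and the `proxy` flag are all invented for illustration, and it only models the promote-then-write behaviour described for Hammer versus the proxied writes mentioned for later releases.

```python
# Toy model of the two write paths discussed above -- NOT Ceph code.
# All names here are invented for illustration.

class TieredStore:
    def __init__(self):
        self.cache = {}   # hot (cache) tier: object name -> data
        self.base = {}    # backing (cold) tier

    def write(self, name, data, proxy=False):
        """Write-back semantics.

        Default path (per the discussion, Hammer behaviour): promote the
        object into the cache tier first, then apply the write there.
        With proxy=True (the later behaviour), the write is forwarded to
        the backing tier without promoting the object.
        """
        if proxy:
            self.base[name] = data               # proxied straight through
            return "proxied"
        if name not in self.cache and name in self.base:
            self.cache[name] = self.base[name]   # promote the cold object
        self.cache[name] = data                  # write lands in the cache tier
        return "promoted"

    def read(self, name):
        # A hit serves from the cache tier; on a miss, a real tier would
        # either promote-then-serve or proxy, depending on the release.
        if name in self.cache:
            return self.cache[name]
        return self.base.get(name)

store = TieredStore()
store.base["obj1"] = b"cold"
assert store.write("obj1", b"hot") == "promoted"   # promoted, then written
assert store.read("obj1") == b"hot"                # served from the cache tier
assert store.write("obj2", b"x", proxy=True) == "proxied"
assert "obj2" not in store.cache                   # proxied write skips promotion
```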
Below is the full IRC log, including some points to be clarified and some answers:
22:30 < pingu_> Hi, #ceph doesn't seem to have some answers I'd like on cache tiering so I'll ask here.
22:30 < pingu_> With cache tiering -- if you're operating in writeback mode and you modify some data that is currently inactive (residing in cold storage), does the data first get copied to the cache tier and then updated?
22:30 < pingu_> Or is it updated in the overlayed (cold) pool?
22:31 < pingu_> Also, when operating in readonly mode, and you get a cache miss, is the data first shuffled to the cache tier and then served to the requesting client?
22:31 < pingu_> Or is it served directly from the cold tier and scheduled to be promoted based on heuristics?
22:31 < gregsfortytwo> in hammer it's all promoted first
22:32 < pingu_> ah, and that's changing?
22:32 < gregsfortytwo> in infernalis (I think?) the reads can be proxied, and some of the writes can
22:32 < gregsfortytwo> by the cache pool OSD that would serve the data if it were in cache
22:32 < gregsfortytwo> and the data doesn't get promoted unless it gets sufficiently hot
22:32 < pingu_> or unless it's a write, right?
22:33 < gregsfortytwo> I'm not sure how configurable "sufficiently hot" is, but I think it means two regular accesses without the cache daemon wanting to evict the metadata
22:33 < gregsfortytwo> writes will generally require copy-up, but some of them can be proxied as well
22:33 < gregsfortytwo> not sure if that code is in the infernalis code base though; this is all in the RADOS team
22:34 < pingu_> Yeah, pretty sure that's tweakable with the hitset options.
22:34 < pingu_> Whilst I have you're attention.
22:34 < pingu_> Do you know why readonly mode simply says "no writes"?
22:34 < gregsfortytwo> nope
22:34 < pingu_> It seems like it would have been easy to add eventual consistency
22:34 < gregsfortytwo> oh
22:34 < pingu_> but I'm sure there's some corner case there
22:34 < gregsfortytwo> yeah, no, that's not Ceph's model and we really can't implement it
22:35 < gregsfortytwo> none of the data tracking structures or algorithms are capable of eventual consistency
22:35 < gregsfortytwo> in addition to it just being against the Ceph philosophy
22:35 < gregsfortytwo> read-only is read-only
22:35 < pingu_> okay, thanks.
22:35 < gregsfortytwo> you can write to the base pool and the read-only pool will pick it up eventually, that's about as eventual-consistency as it's going to get ;)
22:35 < pingu_> that was most helpful
22:36 < pingu_> gregsfortytwo: sure, that's what I meant actually
22:36 < gregsfortytwo> oh, I *think* you can do that?
22:36 < pingu_> I was just thinking that when you write to the base pool, the read only pools are told to drop those things from their caches
22:36 < pingu_> and you don't have to wait for eviction
22:36 < gregsfortytwo> don't think so
22:37 < pingu_> Yeah, pretty sure you can't currently, but that seemed easy to implement is all and I was wondering why not.
22:37 < gregsfortytwo> but read-only pools aren't as well tested or supported as the more normal ones so I'm not sure exactly what state they're in
22:37 < gregsfortytwo> if not available right now it's probably just because configuring the clients to ignore the caching redirects isn't set up and would be a pain
22:37 < pingu_> okay
22:38 < pingu_> One last thing!
22:38 < pingu_> the "Objecter", is that client-side code?
22:38 < pingu_> as in, there's nothing proxying at the network layer between clients and the various tiers, is there?
22:39 < pingu_> The clients are just directed to the hot or cold tier based on OSD maps of some sort? If it's in the cold tier does the client know to go straight there?
22:39 < pingu_> Or does it always hit the cache tier and then get some kind of a redirect on miss?
22:40 < gregsfortytwo> Objecter is client-side, yes
22:40 < gregsfortytwo> part of the internal library, and it handles redirecting clients to the hot tier
22:40 < gregsfortytwo> clients don't touch the cold tier if there's a caching pool at this time
22:41 < pingu_> Oh. That's a surprise.
22:41 < pingu_> The diagrams are kind of a lie, then.
22:41 < gregsfortytwo> there are discussions about sending clients a redirect to the cold tier in some cases, but the consistency model of doing that is very difficult
22:41 < gregsfortytwo> which diagrams?
22:41 < pingu_> http://ceph.com/docs/master/rados/operations/cache-tiering/
22:42 < pingu_> first one there, has an arrow from Objecter to Storage Tier
22:42 < gregsfortytwo> huh
22:42 < gregsfortytwo> we've certainly planned to enable that
22:42 < gregsfortytwo> maybe it does that right now if you've got a read-only tier and try to write to the base pool
22:43 < gregsfortytwo> or maybe the diagram is just wrong, can you file a doc bug so somebody can check and get it fixed?
22:43 < pingu_> So currently if there's a cache miss, things are just streamed through the cache OSD from the storage one?
22:43 < pingu_> and then later promoted if that's the will of the tiering agent
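For reference, the "hitset options" mentioned in the log are per-pool settings on the cache pool. The snippet below is illustrative only: `cachepool` is a placeholder name, the values are examples, and `min_read_recency_for_promote` (which controls how many recent hit sets an object must appear in before a read promotes it) may not be available on every release discussed here, so check the documentation for your version.

```shell
# Illustrative only -- "cachepool" and all values are examples.
# "Sufficiently hot" is governed by the cache pool's hit-set settings:
ceph osd pool set cachepool hit_set_type bloom
ceph osd pool set cachepool hit_set_count 4
ceph osd pool set cachepool hit_set_period 1200
# Require the object to appear in N recent hit sets before a read promotes it:
ceph osd pool set cachepool min_read_recency_for_promote 2
```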
Updated by Zac Dover over 4 years ago
- Status changed from New to Closed
This bug has been judged too old to fix. This is because it was either 1) raised against a version of Ceph prior to Luminous, or 2) is just really old and has gone untouched for so long that it is unlikely nowadays to represent a live documentation concern.
If you think that the closing of this bug is an error, raise another bug of a similar kind. If you think that the matter requires urgent attention, please let Zac Dover know at zac.dover@gmail.com.