Bug #8305
closed
objecter, osd: pool overlay change should trigger op resend
Added by Sage Weil almost 10 years ago.
Updated almost 10 years ago.
Description
If the client is sending ops a, b, c, d and receives a map that changes the overlay partway through, op ordering can break. For example:
- get map, overlay = cache
...
- send a to cache
- get map, overlay = none
- send b, c, d to base
- get reply b, c, d
- get reply a (redirect)
Instead, the OSD should discard ops sent before the last overlay change, and the client should resend them.
I think cache mode changes will cause similar problems. Let's add a pg_pool_t epoch_t that indicates the last policy change (whether it is the overlay or cache_mode or whatever) and we will resend (client) or discard (server) based on that.
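As a rough illustration of the client half of this idea, here is a minimal toy model, not Objecter code. The names PoolInfo, last_policy_change, has_overlay, InflightOp and ToyClient are all hypothetical stand-ins invented for this sketch. It assumes the client keeps in-flight ops in submission order and, on each new map, resends any op that was sent before the pool's last policy change.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

using epoch_t = uint32_t;

// Hypothetical stand-in for the per-pool metadata carried in each map.
struct PoolInfo {
  epoch_t last_policy_change = 0;  // assumed name: epoch of last overlay/cache_mode change
  bool has_overlay = false;        // true: ops are directed at the cache tier
};

// An op the client has sent but not yet seen a final reply for.
struct InflightOp {
  std::string name;
  epoch_t sent_epoch;  // map epoch in which the op was most recently sent
};

// Toy client that records what it puts on the wire, in order.
class ToyClient {
 public:
  // Apply a new map, then resend (in submission order) every op that
  // was sent before the pool's last policy change.
  void handle_map(epoch_t epoch, const PoolInfo& pool) {
    assert(epoch >= epoch_);  // maps are applied in increasing epoch order
    epoch_ = epoch;
    pool_ = pool;
    for (InflightOp& op : inflight_) {
      if (op.sent_epoch < pool_.last_policy_change) send(op);
    }
  }

  // Queue a new op and send it to the current target.
  void submit(const std::string& name) {
    inflight_.push_back(InflightOp{name, 0});
    send(inflight_.back());
  }

  const std::vector<std::string>& wire() const { return wire_; }

 private:
  void send(InflightOp& op) {
    op.sent_epoch = epoch_;
    wire_.push_back(op.name + (pool_.has_overlay ? "->cache" : "->base"));
  }

  epoch_t epoch_ = 0;
  PoolInfo pool_;
  std::vector<InflightOp> inflight_;
  std::vector<std::string> wire_;
};
```

Replaying the scenario from the description (a sent to the cache, then the overlay removed, then b, c, d submitted), this toy client puts a on the wire to the base pool again before b, c, d, so the base OSD sees the ops in their original order. The stale copy of a that reached the cache is what the server side would discard.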
I don't think this lets us handle arbitrary changes in the overlay system. Consider two clients a and b, a cache OSD, and a backing OSD. You can still get consistency issues if b and backing OSD see the overlay change before a and cache OSD do, while IO is in-progress.
Greg Farnum wrote:
I don't think this lets us handle arbitrary changes in the overlay system. Consider two clients a and b, a cache OSD, and a backing OSD. You can still get consistency issues if b and backing OSD see the overlay change before a and cache OSD do, while IO is in-progress.
If you are talking about going from overlay=cache and mode forward to no overlay, I think it is fine because the cache should be empty.
On the other hand, if we are going from no overlay to overlay=cache, the first write into the cache will trigger a promote which will ensure the base osd knows about the overlay change and no read can occur after a write.
There are surely other combinations we haven't considered, but for now I'm primarily worried about the target use cases of adding and removing a writeback cache...
- Assignee set to Sage Weil
Sage Weil wrote:
I think cache mode changes will cause similar problems. Let's add a pg_pool_t epoch_t that indicates the last policy change (whether it is the overlay or cache_mode or whatever) and we will resend (client) or discard (server) based on that.
Perhaps even simpler (and more flexible) would be:
epoch_t last_force_interval; ///< force a new interval from this epoch (including resent ops)
This makes the OSD and Objecter logic simple and reusable for other purposes.
- Status changed from New to In Progress
Discussed in standup and decided on alternate approach:
epoch_t last_force_op_resend; ///< last epoch in which we force clients to resend ops.
and a matching feature bit, so that old clients that don't understand this field don't get their ops discarded.
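A minimal sketch of how the server-side check might look under this approach. Only the field name last_force_op_resend comes from this ticket. The feature flag name and value, the PoolInfo and IncomingOp structs, and should_discard are hypothetical, and the strict less-than comparison is an assumption about the intended semantics.

```cpp
#include <cstdint>

using epoch_t = uint32_t;

// Hypothetical feature flag; the real bit name and value are not given here.
constexpr uint64_t FEATURE_RESEND_ON_POLICY_CHANGE = 1ull << 0;

// Hypothetical stand-in for the pool metadata.
struct PoolInfo {
  epoch_t last_force_op_resend = 0;  // field name taken from the ticket
};

// Hypothetical view of an op as it arrives at the OSD.
struct IncomingOp {
  epoch_t sent_epoch;        // map epoch the client held when it sent the op
  uint64_t client_features;  // features advertised by the sending client
};

// Returns true when the OSD should drop the op and rely on the client to
// resend it. Ops from clients without the feature bit are never dropped,
// because those clients would not know to resend.
inline bool should_discard(const IncomingOp& op, const PoolInfo& pool) {
  const bool client_will_resend =
      (op.client_features & FEATURE_RESEND_ON_POLICY_CHANGE) != 0;
  return client_will_resend && op.sent_epoch < pool.last_force_op_resend;
}
```

Gating the discard on the feature bit is the design point: an old client whose op is dropped would simply hang, so its ops are processed as before, at the cost of the original ordering risk for that client.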
- Status changed from In Progress to 7
- Status changed from 7 to Fix Under Review
- Status changed from Fix Under Review to 7
- Status changed from 7 to Pending Backport
- Status changed from Pending Backport to Resolved