Bug #3860
Closed
rbd: problems if watch setup returns ERANGE
Description
When rbd sets up the watch request for a newly-mapped rbd image
it loops and tries again if the request returns ERANGE. Josh
mentioned a while ago this can happen for some reason that I
don't remember, and that trying again is the appropriate response.
But the code that sets up the watch event and tracks the osd
request for that watch doesn't clean up that state properly.
The watch request pointer stays non-null, and retrying will
overwrite that pointer. I haven't followed this through yet
but I doubt the ERANGE return will properly clean up the
request (which will have been registered to linger).
A quick fix would be to simply return the error--even if
it's ERANGE--and just fail the mapping.
Longer term though we should do this looping, but we need
to be able to clean up in the event of an error.
And doing that most likely depends on resolving this:
http://tracker.newdream.net/issues/3859
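To make the failure mode described above concrete, here is a minimal user-space sketch in C. It is not the kernel code: `rbd_dev_sketch`, `watch_request`, `watch_setup`, `watch_teardown`, `map_with_retry` and `map_no_retry` are all hypothetical names, and the simulated setup just returns whatever result it is told to. The teardown in the no-retry path is also an assumption about what failing the mapping would involve.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the lingering osd watch request. */
struct watch_request {
    int registered;
};

/* Hypothetical stand-in for the rbd device state. */
struct rbd_dev_sketch {
    struct watch_request *watch_request; /* expected to be NULL before setup */
    int live_requests;                   /* allocated and not yet freed */
};

/*
 * Simulated watch setup. It records the request in the device and then
 * returns `result`. Like the code described above, it does not undo
 * anything when it reports an error.
 */
int watch_setup(struct rbd_dev_sketch *dev, int result)
{
    struct watch_request *req = malloc(sizeof(*req));

    if (!req)
        return -ENOMEM;
    req->registered = 1;
    dev->watch_request = req; /* silently overwrites an earlier pointer */
    dev->live_requests++;
    return result;
}

/* Frees only the request the device still points at. */
void watch_teardown(struct rbd_dev_sketch *dev)
{
    if (dev->watch_request) {
        free(dev->watch_request);
        dev->watch_request = NULL;
        dev->live_requests--;
    }
}

/* Problem pattern: loop while setup reports -ERANGE (first `failures` tries). */
int map_with_retry(struct rbd_dev_sketch *dev, int failures)
{
    int ret;

    do {
        ret = watch_setup(dev, failures-- > 0 ? -ERANGE : 0);
    } while (ret == -ERANGE);
    return ret;
}

/* Quick fix: one attempt; any error, including -ERANGE, fails the mapping. */
int map_no_retry(struct rbd_dev_sketch *dev, int fail)
{
    int ret;

    assert(dev->watch_request == NULL);
    ret = watch_setup(dev, fail ? -ERANGE : 0);
    if (ret)
        watch_teardown(dev); /* assumed cleanup: only one request exists */
    return ret;
}
```

In this sketch, each retry in `map_with_retry` leaves an earlier request unreachable, so a later teardown can free only the last one. With a single attempt there is at most one request to release, which is why simply failing the mapping is the easier case to get right.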
Updated by Josh Durgin about 11 years ago
ERANGE is never actually returned - it was never implemented (#2592). The real fix for the race it was intended to prevent is to do the watch on the header before reading its contents, so that we don't miss any notifies before reading the header. I'll open another task for that.
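The ordering argument can be illustrated with a small user-space model in C. This is only a sketch under stated assumptions: `header_sketch`, `client_sketch`, `header_update`, `read_then_watch`, `watch_then_read` and `stale_and_unaware` are hypothetical names, and the "race" is simulated by a flag that applies one header update between the two steps.

```c
#include <stdbool.h>

/* Hypothetical image header: the version is bumped on every update. */
struct header_sketch {
    int version;
};

/* Hypothetical client state for one mapped image. */
struct client_sketch {
    int cached_version; /* header contents as last read */
    int notifies_seen;  /* notifies delivered to this client */
    bool watching;      /* whether the watch is registered */
};

/* An update bumps the header and notifies the client only if it is watching. */
void header_update(struct header_sketch *h, struct client_sketch *c)
{
    h->version++;
    if (c->watching)
        c->notifies_seen++;
}

/* Read the header first, then register the watch. */
void read_then_watch(struct header_sketch *h, struct client_sketch *c,
                     bool update_in_gap)
{
    c->cached_version = h->version;
    if (update_in_gap)
        header_update(h, c); /* no watch yet, so the notify is lost */
    c->watching = true;
}

/* Register the watch first, then read the header. */
void watch_then_read(struct header_sketch *h, struct client_sketch *c,
                     bool update_in_gap)
{
    c->watching = true;
    if (update_in_gap)
        header_update(h, c); /* notify is delivered, read sees the update */
    c->cached_version = h->version;
}

/* True if the cached header is out of date and no notify said so. */
bool stale_and_unaware(const struct header_sketch *h,
                       const struct client_sketch *c)
{
    return c->cached_version != h->version && c->notifies_seen == 0;
}
```

In this model, `read_then_watch` with an update in the gap leaves the client stale and unaware, while `watch_then_read` either reads the new contents directly or has a notify telling it to re-read.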
Updated by Alex Elder about 11 years ago
Josh rejected this. But since he said that the
change I proposed--to not do the loop--was OK,
I suggest this bug be used to track that
particular change.
The longer term fix is documented in another bug,
http://www.tracker.newdream.net/issues/3871
Updated by Alex Elder about 11 years ago
Just to close this out...
The fix (not retrying on ERANGE) has been committed:
commit c04306471ad93f1daf60771a0373316d4c3494ae
Author: Alex Elder <elder@inktank.com>
Date: Fri Jan 18 12:31:09 2013 -0600
rbd: don't retry setting up header watch