krbd: read path
#1 Updated by Alex Elder about 11 years ago
I'm about to mark bug 3417 as a duplicate of this.
I'm putting this bit of info from there here first.
Work on this really started in November 2012.
In October there were a number of cleanup tasks we agreed
should get done first, and the first "real" task was
to implement fetching information about an image's parent
if present. That work was completed (but from what I can
tell had no issue assigned to it).
After several false starts in November attempting to
implement the read path (redirecting read requests with
ENOENT responses to the parent image) I found the existing
code and data structures would not very easily support
this different source of I/O requests to OSDs. After
some discussion, we agreed that doing a sort of
reimplementation was a reasonable course forward.
That work was nearly complete in November, but numerous
other things have preempted further progress. Specifically,
I have been working on expanding xfstests coverage over rbd,
setting up, running, and recovering from the failures hit
when network connection failures are injected.
I'll return to this task as soon as possible. I had hoped
to have this completed in November; now I'm hopeful for
the end of December.
#2 Updated by Alex Elder about 11 years ago
This task depends on the completion of the following others
before it can be completed:
3741 krbd: rework request tracking code
And that currently depends on:
3754 krbd: use new request tracking code for notify ack
3755 krbd: use new request tracking code for sync object operations
#5 Updated by Alex Elder about 11 years ago
- Status changed from New to In Progress
With my patches for the basic new request code now
out for initial review, I've started working on this
feature. It may require a few smaller steps to complete
but the new request code should actually make this pretty
#7 Updated by Alex Elder about 11 years ago
This work is close to complete, in that I have shown
that a lot of functionality seems to work correctly.
I was never able to do the thorough testing I wanted
to though. This code has also had to be reworked a
little bit as I've done some development on the write
At this point I'm not sure how much more effort this
will require to really complete; it depends on whether
testing shows any surprising problems.
#12 Updated by Alex Elder almost 11 years ago
I have the read path code mostly working now. The problem was
that an object request that gets redirected to a parent image
needs to get its byte transfer count updated. That wasn't
happening. I now have an image request transfer count that
gets copied into the object request that submitted the
parent image request.
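The fix described above can be sketched roughly as follows. This is a
toy model in Python, not the kernel code: the class and function names
(ObjectRequest, ParentImageRequest, read_object) and the dict keyed by
offset are all made up for illustration. It only shows the idea that a
read which gets ENOENT from the clone is reissued against the parent,
and that the parent request's result and transfer count are copied
back into the object request that submitted it.

```python
import errno


class ObjectRequest:
    """One read aimed at a single object of the clone (hypothetical model)."""

    def __init__(self, offset, length):
        self.offset = offset
        self.length = length
        self.result = 0
        self.xferred = 0  # bytes actually transferred
        self.data = b""


class ParentImageRequest:
    """Read issued against the parent on behalf of one object request."""

    def __init__(self, parent_data, offset, length):
        chunk = parent_data[offset:offset + length]
        self.result = 0
        self.xferred = len(chunk)  # image request transfer count
        self.data = chunk


def read_object(clone_objects, parent_data, obj_req):
    """Read from the clone; if the object is missing, redirect to the parent."""
    data = clone_objects.get(obj_req.offset)
    if data is not None:
        obj_req.data = data[:obj_req.length]
        obj_req.xferred = len(obj_req.data)
        return obj_req

    # The clone has no such object: this is the ENOENT case.
    obj_req.result = -errno.ENOENT
    parent_req = ParentImageRequest(parent_data, obj_req.offset, obj_req.length)

    # Copy the parent's result and transfer count into the originating
    # object request. Without this step the caller sees zero bytes.
    obj_req.result = parent_req.result
    obj_req.xferred = parent_req.xferred
    obj_req.data = parent_req.data
    return obj_req
```

Leaving out the two assignments of result and xferred reproduces the
symptom: the data is fetched from the parent but the object request
still reports ENOENT and zero bytes transferred.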
So all that seems to work, but unfortunately reads from a
mapped clone of a clone are still returning zero bytes.
(Could be a similar problem, I don't know at this point.)
More looking at that in detail in the morning.
I also hope to get my layered read test script
hammered out so I can use it.
#13 Updated by Alex Elder almost 11 years ago
I have identified two problems that I was hitting.
First, it was not possible for me to map a format 2 rbd image,
because there is a new feature RBD_FEATURE_STRIPINGV2 that
is marked as incompatible by the osd server (i.e., if the
client doesn't support it, it claims they can't interact).
So I'm addressing that by pretending I support that feature
(but I think it shouldn't be mandatory unless a particular
image uses the feature).
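To make the feature check concrete, here is a minimal sketch in
Python. The bit values are the ones I believe the rbd headers use
(layering is bit 0, striping v2 is bit 1), but treat them as an
assumption; the function names are hypothetical. The point is only
that mapping is refused when the image's incompatible-feature mask
contains any bit the client does not claim to support.

```python
# Assumed bit assignments; check them against the rbd headers.
RBD_FEATURE_LAYERING = 1 << 0
RBD_FEATURE_STRIPINGV2 = 1 << 1


def unsupported_features(image_incompat, client_supported):
    """Return the incompatible feature bits the client does not support."""
    return image_incompat & ~client_supported


def can_map(image_incompat, client_supported):
    """Mapping is allowed only if no unsupported incompatible bit remains."""
    return unsupported_features(image_incompat, client_supported) == 0
```

With an image advertising both bits as incompatible, a client that
only claims RBD_FEATURE_LAYERING is refused, while one that also
claims RBD_FEATURE_STRIPINGV2 (as I am doing for now) can map it.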
Second, the rbd CLI was exiting with status 0 (success)
even though the request to map a version 2 image failed.
I opened a new bug for that:
#14 Updated by Alex Elder almost 11 years ago
OK, that feature bit was my problem. I am now able to
successfully map a version 2 image.
Having done that I created a clone and read from it,
and my initial simple tests produced the same
data as the original. I'll try my big test on it
shortly, but first I'm going to try a clone-of-a-clone.
#17 Updated by Alex Elder almost 11 years ago
Double fuckin' A. (Fuckin' double-A?)
I just updated my test to validate snapshot-of-clone and
clone-of-snapshot-of-clone and that worked fine too.
I'm about to check in my updated test. Note that the only
thing this is doing is verifying that the read data matches
the data found in the original image.
It would be good to validate that data written to a clone
at a given offset is read back correctly. For example:
write 16MB at offset 0 in original
--> all three when mapped should produce the same
data when reading any offset
write 4K at offset 4MB in the clone
--> original and snapshot should all still be
--> clone should be identical for first 4MB,
different for 2nd 4MB, identical beyond 8MB
I sort of just did this and it seems to work...
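The check outlined above could be modeled along these lines. This is
a user-space toy in Python, not the real test script: the Image class,
its methods, and build_scenario() are invented for illustration, and
the 4 MB object size is an assumption (the usual rbd default). Reads
of an object the clone does not have fall through to the parent, and
a write copies the parent's object into the clone before modifying it.

```python
MB = 1024 * 1024
OBJ_SIZE = 4 * MB  # assumed object size


class Image:
    """Toy layered image: a dict of objects plus an optional parent."""

    def __init__(self, size, parent=None):
        self.size = size
        self.parent = parent
        self.objects = {}  # object index -> bytearray of OBJ_SIZE bytes

    def _read_object(self, idx):
        if idx in self.objects:
            return bytes(self.objects[idx])
        if self.parent is not None:
            return self.parent._read_object(idx)  # redirect to parent
        return bytes(OBJ_SIZE)  # never written: reads as zeros

    def read(self, offset, length):
        out = bytearray()
        end = offset + length
        while offset < end:
            idx, start = divmod(offset, OBJ_SIZE)
            n = min(OBJ_SIZE - start, end - offset)
            out += self._read_object(idx)[start:start + n]
            offset += n
        return bytes(out)

    def write(self, offset, data):
        pos = 0
        while pos < len(data):
            idx, start = divmod(offset + pos, OBJ_SIZE)
            n = min(OBJ_SIZE - start, len(data) - pos)
            if idx not in self.objects:
                # Copy the object up from the parent (or zeros) first.
                self.objects[idx] = bytearray(self._read_object(idx))
            self.objects[idx][start:start + n] = data[pos:pos + n]
            pos += n

    def snapshot(self):
        """Return a frozen copy of the current contents."""
        snap = Image(self.size, parent=self.parent)
        snap.objects = {i: bytearray(o) for i, o in self.objects.items()}
        return snap


def build_scenario():
    """Write 16 MB to the original, snapshot it, clone it, write 4K to the clone."""
    original = Image(16 * MB)
    original.write(0, bytes(range(256)) * (16 * MB // 256))
    snap = original.snapshot()
    clone = Image(16 * MB, parent=snap)
    clone.write(4 * MB, b"\xff" * 4096)
    return original, snap, clone
```

After build_scenario(), the original and the snapshot should read
back identically everywhere, and the clone should match them for the
first 4 MB and beyond 8 MB but differ within the second 4 MB.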
#18 Updated by Alex Elder almost 11 years ago
- Status changed from In Progress to Fix Under Review
The following series has been posted for review.
This series puts in place code that is able to handle
read requests on rbd clone images, forwarding them to
a parent snapshot image if necessary. Missing from this
series is a temporary patch at the end which actually
activates this functionality. That will not go in
until the rest of rbd layering functionality is in place.
I'm going to restate that. This code implements reads
for layered rbd images, but the functionality will not
be usable quite yet.
Most of the series is adding flags to image and object
requests, and putting in place some accounting in (parent)
image requests so it can be used to record the completion
of an object request that initiated it.
[PATCH 01/11] rbd: record overall image request result
[PATCH 02/11] rbd: record aggregate image transfer count
[PATCH 03/11] rbd: record image-relative offset in object requests
[PATCH 04/11] rbd: define image request flags
[PATCH 05/11] rbd: define image request originator flag
[PATCH 06/11] rbd: define image request layered flag
[PATCH 07/11] rbd: encapsulate image object end request handling
[PATCH 08/11] rbd: define an rbd object request flags field
[PATCH 09/11] rbd: add an object request flag for image data objects
[PATCH 10/11] rbd: probe the parent of an image if present
[PATCH 11/11] rbd: implement layered reads
#19 Updated by Alex Elder almost 11 years ago
- Status changed from Fix Under Review to Resolved
The following have been committed to the ceph-client
745c34c rbd: implement layered reads
3c38b56 rbd: probe the parent of an image if present
816ca07 rbd: add an object request flag for image data objects
c1dc6f1 rbd: define an rbd object request flags field
12c59f1 rbd: encapsulate image object end request handling
8abef6e rbd: define image request layered flag
f3f9aa9 rbd: define image request originator flag
d837d4c rbd: define image request flags
20a66dc rbd: record image-relative offset in object requests
f8e19ce rbd: record aggregate image transfer count
43eb648 rbd: record overall image request result
The LAYERING feature has not yet been enabled, but the above
changes complete the addition of support for the layered