rbd - copy-on-read for clones

Summary

Implement copy-on-read for rbd clones

Owners

Interested Parties

  • Name (Affiliation)
  • Name (Affiliation)
  • Name

Current Status

Detailed Description

This blueprint was originally proposed at
http://wiki.ceph.com/Planning/Sideboard/rbd%3A_copy-on-read_for_clones.

rbd clones are currently implemented in the traditional copy-on-write fashion: only modified
data is stored in the clone, and unmodified data is read from the parent. Since a clone and its
parent may be placed in different pools across hosts, read performance for clones may be poor.
Copy-on-read would improve locality for data that a clone accesses frequently.

To improve read performance for clones, copy-on-read is desired: data that a clone reads from
its parent is also saved into the clone, so that subsequent reads are served from the clone
itself. This is potentially useful in virtualization scenarios, for example: take a snapshot of a
parent image, create several clones from the snapshot, and use each clone as the backing image
of a virtual machine.
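
The difference between the two read paths can be illustrated with a toy model. This is only a sketch and not librbd code: ObjectStore, ToyClone, copy_on_read and parent_reads are hypothetical names, and an image is simplified to a map from object number to object data.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy model (not librbd): an image is a map from object
// number to the data stored in that object.
using ObjectStore = std::map<uint64_t, std::string>;

struct ToyClone {
    const ObjectStore* parent;   // read-only parent snapshot
    ObjectStore local;           // objects owned by the clone
    bool copy_on_read;           // hypothetical switch for the feature
    unsigned parent_reads;       // number of reads that went to the parent

    ToyClone(const ObjectStore* p, bool cor)
        : parent(p), copy_on_read(cor), parent_reads(0) {
        assert(parent != nullptr);
    }

    // Read one object: serve it from the clone if present, otherwise
    // fall back to the parent.
    std::string read(uint64_t objectno) {
        auto it = local.find(objectno);
        if (it != local.end())
            return it->second;

        ++parent_reads;
        auto pit = parent->find(objectno);
        if (pit == parent->end())
            return std::string();  // object exists nowhere: treat as empty

        if (copy_on_read)
            local[objectno] = pit->second;  // keep a copy so later reads stay local
        return pit->second;
    }

    // Copy-on-write path, simplified to a whole-object overwrite.
    void write(uint64_t objectno, const std::string& data) {
        local[objectno] = data;
    }
};
```

With copy_on_read disabled, every read of an unmodified object goes back to the parent; with it enabled, only the first read does.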

Work items

Coding tasks

  1. Add a copy-on-read option in config_opts.h.
  2. Retrieve the corresponding objects from the parent.
    librbd/internal.cc aio_read()
    This function invokes Striper::file_to_extents() to convert image_extent into object_extents.
    For each object_extent in object_extents:
      let objectno = object_extent->objectno;
      let off = 0;
      invoke Striper::extent_to_file(..., objectno, off, ...) to retrieve the offset of the object
      within the image, object_offset;
      invoke Striper::file_to_extents(..., object_offset, object_size, ...) to retrieve all the
      object_extents inside the object;
      asynchronously read these object_extents from the parent into a separate buffer list,
      prefetch_buffer.
  3. Write the objects into the clone.
    librbd/AioRequest.cc AioRead::should_complete(int r)
    After read_parent() completes, construct an AioWrite request to asynchronously write the
    object_extents into the clone.
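
The extent calculation in step 2 can be sketched as follows. This is a simplified illustration and not the Striper API: it assumes the default layout in which each object holds one contiguous object_size range of the image (no fancy striping), and WholeObjectExtent, objects_to_copy_up and parent_size are hypothetical names. Given a read range, it returns the whole-object ranges that would be fetched from the parent and then written into the clone.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type: one whole object to fetch from the parent.
struct WholeObjectExtent {
    uint64_t objectno;      // object number within the image
    uint64_t image_offset;  // offset of the object's first byte in the image
    uint64_t length;        // bytes to read (clipped to the parent's size)
};

// For a read of [read_off, read_off + read_len), list every object the read
// touches, expanded to the full object. Assumes simple striping where
// objectno = image_offset / object_size.
std::vector<WholeObjectExtent> objects_to_copy_up(uint64_t read_off,
                                                  uint64_t read_len,
                                                  uint64_t object_size,
                                                  uint64_t parent_size) {
    assert(object_size > 0);
    std::vector<WholeObjectExtent> out;
    if (read_len == 0 || read_off >= parent_size)
        return out;  // nothing in this range is backed by the parent

    // Clip the read to the part of the image that the parent covers.
    const uint64_t read_end = std::min(read_off + read_len, parent_size);
    const uint64_t first = read_off / object_size;
    const uint64_t last = (read_end - 1) / object_size;

    for (uint64_t objectno = first; objectno <= last; ++objectno) {
        const uint64_t start = objectno * object_size;  // "off = 0" within the object
        const uint64_t len = std::min(object_size, parent_size - start);
        out.push_back({objectno, start, len});
    }
    return out;
}
```

Reading whole objects rather than only the requested bytes means a single copy-up fully populates the object in the clone, at the cost of extra parent traffic for small reads.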

Build / release tasks

  1. Task 1
  2. Task 2
  3. Task 3

Documentation tasks

  1. Task 1
  2. Task 2
  3. Task 3

Deprecation tasks

  1. Task 1
  2. Task 2
  3. Task 3