1H - Inline data support

Live Pad

The live pad can be found here: [pad]

Summit Snapshot

Coding tasks
  1. Add an extended attribute to the file so that the MDS can store the data of small files
    1. piggyback on the File cap bits and the existing cap writeback mechanism
  2. MDS uninlines file content when the file goes into the MIX state
  3. Add fields to MClientCaps
  4. Inline data is returned to the client via encode_inodestat() (used by lookup, readdir, stat, open, etc.)
  5. MDS stores inline data inside inode_t (bufferlist inline_data)
  6. Client (libcephfs, ceph-fuse)
  7. If the file size is small and we are flushing, flush the inline data to the MDS
    1. prototype and refine protocol changes
  8. Linux kernel client
    1. read side
      1. copy into page cache from inode buffer from readpage()
    2. write side
      1. writepage[s]() ..
      2. begin_page_writeback() (or something like that; exact name unconfirmed): set the writeback bit, lock page
      3. if (size is small and we want to inline) {
      4. copy into the inode buffer
      5. trigger mds cap flush
      6. wait for flush
      7. } else {
      8. do the regular thing
      9. }
      10. end_page_writeback()
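
The kernel write-side steps listed under "Linux kernel client" can be sketched as a small userspace C model. This is only an illustration under stated assumptions: `struct model_inode`, `model_writepage()`, and the `INLINE_MAX` threshold of 4096 bytes are hypothetical names and values, not part of the Ceph kernel client. The MDS cap flush and the regular OSD write are reduced to counters, and the writeback bit is a plain flag.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define INLINE_MAX 4096  /* assumed inline threshold: one 4 KiB page */

/* Hypothetical userspace stand-in for the per-inode state. */
struct model_inode {
    unsigned char inline_buf[INLINE_MAX]; /* the "inode buffer" */
    size_t inline_len;                    /* valid bytes in inline_buf */
    int cap_flushes;                      /* MDS cap flushes triggered */
    int osd_writes;                       /* regular OSD writes issued */
    int writeback;                        /* models the page writeback bit */
};

enum wb_path { WB_INLINE, WB_OSD };

/* Models one writepage() call for the first page of a file of file_size bytes. */
enum wb_path model_writepage(struct model_inode *in,
                             const unsigned char *page, size_t file_size)
{
    enum wb_path path;

    assert(in != NULL && page != NULL);

    in->writeback = 1;                    /* begin writeback */
    if (file_size <= INLINE_MAX) {        /* size is small: inline it */
        memcpy(in->inline_buf, page, file_size);
        in->inline_len = file_size;
        in->cap_flushes++;                /* trigger the cap flush; a real
                                             client would wait for it here */
        path = WB_INLINE;
    } else {
        in->osd_writes++;                 /* do the regular thing */
        path = WB_OSD;
    }
    in->writeback = 0;                    /* end writeback */
    return path;
}
```

Calling `model_writepage()` with a 100-byte file takes the inline path and bumps `cap_flushes`; a file larger than `INLINE_MAX` takes the OSD path and leaves the inode buffer untouched.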
Documentation tasks
  1. Document the communication protocol

1 Client side

1.1 ceph_write_end()

          if (inode->status == INLINED) {
                 if (write_pos < PAGE_SIZE) {
                     write_page_to_inode();
                     err = mark_inode_dirty();
                     if (err == ESTATUS) // status has changed to NOTINLINING or NOTINLINED
                         write_page_to_osd();
                     return;
                 }
                 if (write_pos > PAGE_SIZE) {
                     inode->status = NOTINLINING;
                     mark_inode_dirty(); // asynchronously tell mds to change status to NOTINLINING
                 }
                 if (the interval [write_pos, write_pos + write_len] overlap with the interval [0, PAGE_SIZE]) {
                     inode->status = NOTINLINED;
                     mark_inode_dirty();
                 }
          }
          if (inode->status == NOTINLINING) {
                 if (the interval [write_pos, write_pos + write_len] overlap with the interval [0, PAGE_SIZE]) {
                     inode->status = NOTINLINED;
                     mark_inode_dirty();
                 }
          }
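
The status changes in the ceph_write_end() pseudocode above can be condensed into a pure transition function. The sketch below is a userspace model, not client code: `next_status()`, `overlaps_first_page()`, and `MODEL_PAGE_SIZE` are hypothetical names, the ESTATUS race is left out, and the overlap test is read literally as a closed interval, so a write that starts exactly at PAGE_SIZE counts as overlapping the first page.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096  /* assumed page size for the model */

enum inline_status { INLINED, NOTINLINING, NOTINLINED };

/*
 * Closed interval [pos, pos + len] overlaps [0, MODEL_PAGE_SIZE]
 * exactly when pos <= MODEL_PAGE_SIZE (pos is unsigned, so pos >= 0).
 */
int overlaps_first_page(size_t pos, size_t len)
{
    (void)len;
    return pos <= MODEL_PAGE_SIZE;
}

/* Returns the inode status after a write of len bytes at offset pos. */
enum inline_status next_status(enum inline_status s, size_t pos, size_t len)
{
    assert(s == INLINED || s == NOTINLINING || s == NOTINLINED);

    if (s == INLINED) {
        if (pos < MODEL_PAGE_SIZE)
            return INLINED;           /* data goes into the inode buffer */
        if (pos > MODEL_PAGE_SIZE)
            s = NOTINLINING;          /* write lands past the first page */
        if (overlaps_first_page(pos, len))
            s = NOTINLINED;
    }
    if (s == NOTINLINING && overlaps_first_page(pos, len))
        s = NOTINLINED;               /* first page touched while uninlining */
    return s;
}
```

Under these assumptions, a write at offset 8192 moves INLINED to NOTINLINING, a later write at offset 100 moves NOTINLINING to NOTINLINED, and NOTINLINED is never left.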

1.2 write_page()

          if (page->index == 0 && inode->status == INLINED) { // for mmap(), it won't go through write_end
                err = write_page_to_inode();
                mark_inode_dirty();
          }
          write_page_to_osd();

1.3 read_page()

          if (page->index == 0 && (inode->status == INLINED || inode->status == NOTINLINING)) {
                   err = copy_data_from_inode();
                   if (err == ESTATUS) // status has changed to NOTINLINED
                          read_page_from_osd();
                   return;
          }
          read_page_from_osd();
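
The read path with its fallback can be modeled in userspace C as follows. All names here (`struct rp_inode`, `rp_read_page()`, `rp_copy_data_from_inode()`, `RP_ESTATUS`, `RP_PAGE_SIZE`) are hypothetical stand-ins chosen for the sketch. The OSD read itself is not modeled: the function only reports which source would serve the page. In this single-threaded model the ESTATUS branch inside `rp_read_page()` cannot fire; it stands in for the race where the status changes between the check and the copy.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define RP_PAGE_SIZE 4096   /* assumed page size for the model */
#define RP_ESTATUS   (-1)   /* hypothetical "status changed" error code */

enum rp_status { RP_INLINED, RP_NOTINLINING, RP_NOTINLINED };
enum rp_source { FROM_INODE, FROM_OSD };

struct rp_inode {
    enum rp_status status;
    unsigned char inline_buf[RP_PAGE_SIZE]; /* inline data held in the inode */
    size_t inline_len;                      /* valid bytes in inline_buf */
};

/* Copies inline data into page and zero-fills the rest of the page.
 * Fails with RP_ESTATUS once the inode is NOTINLINED. */
int rp_copy_data_from_inode(const struct rp_inode *in, unsigned char *page)
{
    assert(in->inline_len <= RP_PAGE_SIZE);
    if (in->status == RP_NOTINLINED)
        return RP_ESTATUS;
    memcpy(page, in->inline_buf, in->inline_len);
    memset(page + in->inline_len, 0, RP_PAGE_SIZE - in->inline_len);
    return 0;
}

/* Reports where page `index` would be read from. */
enum rp_source rp_read_page(const struct rp_inode *in, size_t index,
                            unsigned char *page)
{
    if (index == 0 &&
        (in->status == RP_INLINED || in->status == RP_NOTINLINING)) {
        if (rp_copy_data_from_inode(in, page) == RP_ESTATUS)
            return FROM_OSD;   /* status changed underneath us */
        return FROM_INODE;
    }
    return FROM_OSD;           /* any other page, or data not inline */
}
```

Only page 0 of an INLINED or NOTINLINING inode is served from the inode buffer; every other combination falls through to the OSD.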