Osd - ceph on zfs » History » Version 1

Jessica Mack, 06/21/2015 03:55 AM

h1. Osd - ceph on zfs

h3. Summary

Allow ceph-osd to make better use of ZFS's capabilities.

h3. Owners

* Sage Weil (Inktank)

h3. Interested Parties

* Sage Weil (Inktank)
* Mark Nelson (Inktank)
* Yan, Zheng (Intel)
* Haomai Wang (UnitedStack)
* Wido den Hollander (42on)
* Eric Eastman (Keeper Technology)
* Daniele Stroppa (ZHAW)
* Sam Zaydel (RackTop Systems)
* Sam Just (Inktank)

h3. Current Status

We have worked to identify and fix the xattr bugs in zfsonlinux so that ceph-osd runs on top of ZFS in the normal write-ahead journaling mode, just as it does on ext4 or XFS.  We do not currently take advantage of any ZFS-specific features.

h3. Detailed Description

At a minimum, ZFS's snapshot support could be used the same way it is used on btrfs: to provide a stable consistency point to journal relative to, allowing us to use the parallel journaling mode (which has much better read/modify/write performance).
Looking further ahead, I suspect there are much more involved ways to take advantage of ZFS, by using the DMU directly instead of going through the POSIX layer.  I would like to discuss both the short-term improvements and the long-term possibilities in this session.
To abstract the underlying fs functionality out of FileStore, we need an interface that looks like this:
class BackingFileSystem {

p((. bool can_checkpoint();   ///< true if we can snapshot to allow parallel journaling, etc.
int create_base_volume();    ///< used during mkfs: mkdir in the degenerate case, create_subvolume for btrfs, ...
int list_checkpoints(list<string> *ls);   ///< used during mount: list the checkpoints
int rollback_to_checkpoint(string name);   ///< used during mount to roll back to the last checkpoint before journal replay
int create_checkpoint_start(string name);  ///< start a snap during sync_entry()
int create_checkpoint_finish();   ///< complete the snap started by create_checkpoint_start()
int remove_checkpoint(string name);  ///< trim an old snap

p((. // other btrfs/fs optimizations
int clone_range(...);   ///< fall back to copy as necessary

};
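
To make the intended call sequence concrete, here is a minimal sketch of how the interface above might be exercised at mount time. It is an illustration under assumptions, not FileStore code: the abstract-class rendering, the const-reference parameters, the in-memory FakeSnapshotBackend, and the prepare_for_replay() helper are all hypothetical, and clone_range(...) is left out because its signature is not specified. The sketch also assumes list_checkpoints() returns checkpoints oldest first.

```cpp
#include <cassert>
#include <cerrno>
#include <list>
#include <string>

// Hypothetical rendering of the proposed interface as an abstract base class.
class BackingFileSystem {
public:
  virtual ~BackingFileSystem() = default;
  virtual bool can_checkpoint() = 0;
  virtual int create_base_volume() = 0;
  virtual int list_checkpoints(std::list<std::string> *ls) = 0;
  virtual int rollback_to_checkpoint(const std::string &name) = 0;
  virtual int create_checkpoint_start(const std::string &name) = 0;
  virtual int create_checkpoint_finish() = 0;
  virtual int remove_checkpoint(const std::string &name) = 0;
};

// In-memory stand-in for a snapshot-capable backend (illustration only).
class FakeSnapshotBackend : public BackingFileSystem {
  std::list<std::string> snaps_;  // completed checkpoints, oldest first
  std::string pending_;           // checkpoint started but not yet finished
public:
  std::string rolled_back_to;     // records the last rollback target

  bool can_checkpoint() override { return true; }
  int create_base_volume() override { return 0; }
  int list_checkpoints(std::list<std::string> *ls) override {
    *ls = snaps_;
    return 0;
  }
  int rollback_to_checkpoint(const std::string &name) override {
    for (const auto &s : snaps_) {
      if (s == name) {
        rolled_back_to = name;
        return 0;
      }
    }
    return -ENOENT;
  }
  int create_checkpoint_start(const std::string &name) override {
    pending_ = name;
    return 0;
  }
  int create_checkpoint_finish() override {
    if (pending_.empty())
      return -EINVAL;
    snaps_.push_back(pending_);
    pending_.clear();
    return 0;
  }
  int remove_checkpoint(const std::string &name) override {
    const auto before = snaps_.size();
    snaps_.remove(name);
    return snaps_.size() == before ? -ENOENT : 0;
  }
};

// Mount-time step: roll back to the newest checkpoint before journal replay.
// Returns the checkpoint name used, or "" if there was nothing to roll back to.
std::string prepare_for_replay(BackingFileSystem &fs) {
  if (!fs.can_checkpoint())
    return "";
  std::list<std::string> ls;
  if (fs.list_checkpoints(&ls) < 0 || ls.empty())
    return "";
  const std::string last = ls.back();
  return fs.rollback_to_checkpoint(last) == 0 ? last : "";
}
```

In this sketch the sync path would call create_checkpoint_start() and create_checkpoint_finish() in pairs, and trim older snaps with remove_checkpoint() once a newer one is complete.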
FileStore::_detect_fs() will need to be refactored to instantiate an implementation of the above instead of relying on the current open-coded checks.
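
One way the refactored detection could look is a small factory keyed on the filesystem magic number that statfs(2) reports in f_type. The sketch below is an assumption-laden illustration: make_backend(), the three backend structs, and the name() helper are hypothetical, the interface is reduced to what the factory needs, and the magic constants (BTRFS_SUPER_MAGIC from the Linux headers, ZFS_SUPER_MAGIC as I believe zfsonlinux defines it) should be verified against the actual headers before use.

```cpp
#include <cassert>
#include <cstdint>
#include <memory>
#include <string>

// Reduced, hypothetical stand-in for the interface: only what the factory needs.
struct BackingFileSystem {
  virtual ~BackingFileSystem() = default;
  virtual bool can_checkpoint() const = 0;
  virtual std::string name() const = 0;  // hypothetical helper for illustration
};

// Degenerate case (xfs, ext4, ...): no snapshots, write-ahead journaling only.
struct GenericBackend : BackingFileSystem {
  bool can_checkpoint() const override { return false; }
  std::string name() const override { return "generic"; }
};

struct BtrfsBackend : BackingFileSystem {
  bool can_checkpoint() const override { return true; }
  std::string name() const override { return "btrfs"; }
};

struct ZfsBackend : BackingFileSystem {
  bool can_checkpoint() const override { return true; }
  std::string name() const override { return "zfs"; }
};

// f_type values as reported by statfs(2) on Linux (verify against headers).
constexpr std::uint32_t kBtrfsMagic = 0x9123683E;  // BTRFS_SUPER_MAGIC
constexpr std::uint32_t kZfsMagic = 0x2FC12FC1;    // ZFS_SUPER_MAGIC (assumed)

// Pick a backend from the magic number; anything unknown gets the generic one.
std::unique_ptr<BackingFileSystem> make_backend(std::uint32_t f_type) {
  switch (f_type) {
  case kBtrfsMagic:
    return std::unique_ptr<BackingFileSystem>(new BtrfsBackend());
  case kZfsMagic:
    return std::unique_ptr<BackingFileSystem>(new ZfsBackend());
  default:
    return std::unique_ptr<BackingFileSystem>(new GenericBackend());
  }
}
```

A caller would pass the f_type field of the statfs result for the OSD data directory. Taking the magic number as a parameter keeps the selection logic testable without a mounted filesystem.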
All references to btrfs_stable_commits will be replaced with can_checkpoint().
Once this refactoring is in place, implementing a zfs backend should be pretty straightforward.

TODO:

* identify the correct zfs snapshot interface (ioctls?)
* look at nilfs2?
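
The snapshot interface is still an open question in the TODO above. As one candidate for prototyping only, the checkpoint operations map fairly directly onto the zfs(8) command line; libzfs or raw ioctls remain the alternatives to evaluate. The helper names below are hypothetical, the dataset name is an assumption supplied by the caller, and no shell quoting is done, so the sketch assumes trusted dataset and snapshot names.

```cpp
#include <cassert>
#include <string>

// zfs names a snapshot as <dataset>@<snapname>.
std::string snap_target(const std::string &dataset, const std::string &name) {
  return dataset + "@" + name;
}

// Command line that would create a checkpoint.
std::string cmd_create_checkpoint(const std::string &dataset,
                                  const std::string &name) {
  return "zfs snapshot " + snap_target(dataset, name);
}

// Command line that would roll back to a checkpoint.
// Note: -r also destroys any snapshots newer than the target.
std::string cmd_rollback_to_checkpoint(const std::string &dataset,
                                       const std::string &name) {
  return "zfs rollback -r " + snap_target(dataset, name);
}

// Command line that would trim an old checkpoint.
std::string cmd_remove_checkpoint(const std::string &dataset,
                                  const std::string &name) {
  return "zfs destroy " + snap_target(dataset, name);
}

// Command line that would list checkpoints, one name per line, no header.
// Note: -r also includes snapshots of descendant datasets.
std::string cmd_list_checkpoints(const std::string &dataset) {
  return "zfs list -H -t snapshot -o name -r " + dataset;
}
```

A zfs backend built this way would run these commands and parse their output; whether shelling out is acceptable on the sync path is exactly the kind of question the TODO item needs to settle.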

h3. Work items

h4. Coding tasks

# filestore: generalize the snapshot enumeration, creation hooks and other btrfs-specific behaviors such that the btrfs hooks fit into a generic interface
# filestore: implement generic backend (xfs, ext4, etc.)
# filestore: implement btrfs backend
# filestore: clean out all btrfs_* member cruft
# filestore: implement a zfs backend that triggers zfs snapshots
# ceph-deploy: add zfs to the list of file systems supported by osd create ...

h4. Build / release tasks

# include zfsonlinux in ceph-qa-chef on supported platforms
# teuthology: add support for fs: zfs
# include fs:zfs in the rados test matrix

h4. Documentation tasks

# document the filestore backend interface in the internals section of the docs