Quotas vs subtrees » History » Version 1
Jessica Mack, 07/03/2015 09:21 PM
1 | 1 | Jessica Mack | h1. Quotas vs subtrees |
---|---|---|---|
2 | 1 | Jessica Mack | |
3 | 1 | Jessica Mack | h3. Summary |
4 | 1 | Jessica Mack | |
5 | 1 | Jessica Mack | Generalize and adapt the SnapRealm subtree mechanism into a generic subvolume/subtree concept that is (1) explicitly managed/visible to the admin, (2) used by both snapshots and quotas. |
6 | 1 | Jessica Mack | |
7 | 1 | Jessica Mack | h3. Owners |
8 | 1 | Jessica Mack | |
9 | 1 | Jessica Mack | * Yunchuan Wen (yunchuanwen@ubuntukylin.com) |
10 | 1 | Jessica Mack | * Sage Weil |
11 | 1 | Jessica Mack | |
12 | 1 | Jessica Mack | h3. Interested Parties |
13 | 1 | Jessica Mack | |
14 | 1 | Jessica Mack | * Name (Affiliation) |
15 | 1 | Jessica Mack | * Name (Affiliation) |
16 | 1 | Jessica Mack | * Name |
17 | 1 | Jessica Mack | |
18 | 1 | Jessica Mack | h3. Current Status |
19 | 1 | Jessica Mack | |
20 | 1 | Jessica Mack | Snapshots break the namespace into SnapRealms, which are subtree chunks that share the same snapshot context (i.e., have the same set of snapshots applied). |
21 | 1 | Jessica Mack | New SnapRealms are created when: |
22 | 1 | Jessica Mack | # a snap is created at a new point in the hierarchy. |
23 | 1 | Jessica Mack | # a subdir in one SnapRealm is renamed into another SnapRealm. The subdir becomes the root of a new SnapRealm that is nested inside the target, with a past_parent pointer to the former. |
24 | 1 | Jessica Mack | |
25 | 1 | Jessica Mack | Creating a new realm is a 'split' event. This is somewhat expensive and involves a message to the client that enumerates all of the inodes with client caps that need to be moved into the child realm. The client thus has a coherent view of which realm any given inode belongs to at all times. |
26 | 1 | Jessica Mack | |
27 | 1 | Jessica Mack | h3. Detailed Description |
28 | 1 | Jessica Mack | |
29 | 1 | Jessica Mack | There are some challenges with the snaprealm code, particularly when dealing with the past_parents relationship. The trouble mostly arises when opening an inode in the cache: we need the past_parents in order to generate a valid SnapContext for the realm, but a past parent might be in some other part of the hierarchy and take time to resolve. Until we have it, we cannot issue caps to clients, and we currently aren't smart enough to avoid doing so. There is also some very complex code that manages propagation of rstat values to past parents after a snapshot has been taken. |
30 | 1 | Jessica Mack | The whole situation would be simplified if we did not allow renaming directories between subvolumes/snaprealms. |
31 | 1 | Jessica Mack | If we did that, there would be no past_parents, and the snap issues would get much simpler. |
32 | 1 | Jessica Mack | We could also make the subvolume management explicit, e.g., |
33 | 1 | Jessica Mack | attr -s mydir ceph.subvolume |
34 | 1 | Jessica Mack | or whatever, so that the admin decides where the subvolume boundaries are, and thus when -EXDEV will happen on rename. |
35 | 1 | Jessica Mack | If there *were* a subvol concept, then quotas would map onto that naturally. |
36 | 1 | Jessica Mack | What that buys us: |
37 | 1 | Jessica Mack | # clients know what root (inode) every open file belongs to, and thus what rstat value to pay attention to for quota |
38 | 1 | Jessica Mack | # same mds/client messages can manage the subvol <-> inode relationship |
39 | 1 | Jessica Mack | # when split is implemented in the future, we can piggyback on the split messages. On the other hand, |
40 | 1 | Jessica Mack | ## snaprealms are implicitly created when you rename c from realm a to realm b. For quotas, we only care whether we are beneath b, not that we are inside a c nested inside a and b. |
41 | 1 | Jessica Mack | ## so maybe we need to distinguish between snaprealm-things that are subvol roots and those that are not |
42 | 1 | Jessica Mack | |
43 | 1 | Jessica Mack | *Option 1* |
44 | 1 | Jessica Mack | # rename SnapRealm to SubvolRealm |
45 | 1 | Jessica Mack | # rename MClientSnap message to MClientSubvol or similar |
46 | 1 | Jessica Mack | # separate new realm creation into an explicit subvol creation op, triggered by a vxattr or new mds op |
47 | 1 | Jessica Mack | # only allow quotas to be set on subvol roots |
48 | 1 | Jessica Mack | # use existing snapbl (renamed subvolbl) to associate all inodes with the subvol root |
49 | 1 | Jessica Mack | # [maybe] allow rename between subvols with no snaps |
50 | 1 | Jessica Mack | ## add a new MOVE op, distinct but similar to split, that simply moves inodes to a different realm. this will be used when you rename a dir between subvols. |
51 | 1 | Jessica Mack | # [someday] enable rename between subvols with snaps |
52 | 1 | Jessica Mack | ## add a SubvolRealm property that indicates whether it is a subvol root or not |
53 | 1 | Jessica Mack | ## make split work to enable snaps vs renames. |
54 | 1 | Jessica Mack | ## mds: fix things with opening past_parents |
55 | 1 | Jessica Mack | |
56 | 1 | Jessica Mack | *Option 2* |
57 | 1 | Jessica Mack | # add a new qtree (or subvol) construct |
58 | 1 | Jessica Mack | # instantiate in client cache and mds cache |
59 | 1 | Jessica Mack | # chain all inodes to the subvol they belong to |
60 | 1 | Jessica Mack | # mark subvol in any inodestat reply to client |
61 | 1 | Jessica Mack | # add a new MOVE message used on rename |
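
The steps of Option 2 can be sketched as a toy in-memory model. This is only an illustration under stated assumptions, not Ceph code: the names @Subvol@, @Inode@, @Cache@, @inodestat@, and the @allow_move@ flag are all hypothetical, and @used_bytes@ merely stands in for the recursive byte count (rstat) at the subvol root. It shows each inode chained to its subvol, the subvol marked in a stat reply, a rename across subvols failing with EXDEV unless a MOVE is permitted, and quota being checked against the subvol root.

```python
import errno
from dataclasses import dataclass
from typing import Dict, Iterable, Optional


@dataclass
class Subvol:
    """Hypothetical qtree/subvol record, identified by its root inode number."""
    root_ino: int
    max_bytes: Optional[int] = None   # quota limit; None means unlimited
    used_bytes: int = 0               # stand-in for the root's recursive byte count


@dataclass
class Inode:
    ino: int
    size: int
    subvol: Subvol                    # every inode is chained to its subvol


class Cache:
    """Toy stand-in for the MDS or client cache (assumption, not a real API)."""

    def __init__(self) -> None:
        self.inodes: Dict[int, Inode] = {}

    def _charge(self, subvol: Subvol, nbytes: int) -> None:
        # Quota is enforced only at the subvol root.
        if subvol.max_bytes is not None and subvol.used_bytes + nbytes > subvol.max_bytes:
            raise OSError(errno.EDQUOT, "subvol quota exceeded")
        subvol.used_bytes += nbytes

    def create(self, ino: int, size: int, subvol: Subvol) -> Inode:
        self._charge(subvol, size)
        inode = Inode(ino, size, subvol)
        self.inodes[ino] = inode
        return inode

    def inodestat(self, ino: int) -> dict:
        # Mark the owning subvol in every stat reply.
        inode = self.inodes[ino]
        return {"ino": inode.ino, "size": inode.size, "subvol": inode.subvol.root_ino}

    def rename(self, inos: Iterable[int], target: Subvol, allow_move: bool = False) -> None:
        # Only inodes that actually change subvol need a MOVE.
        moving = [self.inodes[i] for i in inos if self.inodes[i].subvol is not target]
        if not moving:
            return
        if not allow_move:
            raise OSError(errno.EXDEV, "rename crosses a subvol boundary")
        # Check the target quota before changing any state.
        self._charge(target, sum(i.size for i in moving))
        for inode in moving:
            inode.subvol.used_bytes -= inode.size
            inode.subvol = target
```

In this sketch a rename inside one subvol is a no-op for the accounting, while a cross-subvol rename either fails with EXDEV or re-chains the listed inodes in one step, which is roughly the role the MOVE message would play.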