OSD - add flexible cache control of object data » History » Version 2
Jessica Mack, 07/03/2015 08:36 PM
h1. OSD - add flexible cache control of object data

h3. Summary

By default, an OSD that uses a file system as its backend caches every object in memory (in the kernel page cache) after each write. When those pages are released depends on kernel settings (/proc/sys/vm/*).

However, in a typical large cluster this does not make much sense:
1) Most objects are not read immediately after they are written.
2) Keeping large objects in memory consumes too much memory.
3) If there is a cache tier, data is mostly not accessed again for a while once it has been promoted from the base tier.
4) Although the data page cache can be dropped by writing to /proc/sys/vm/drop_caches, this works at the system level and offers little flexibility.

We propose a new feature that allows the OSD to drop the buffer cache for data that will not be accessed in the near future.

The advantages of this feature:
1: It frees memory for other uses, such as the inode and dentry caches.
2: It allows hosts with little memory to serve as storage nodes.

h3. Owners

* Jianpeng Ma (Intel)
* Yuan Zhou (Intel)
* Jian Zhang (Intel)
* Jiangang Duan (Intel)

h3. Interested Parties

* Name (Affiliation)
* Name (Affiliation)
* Name

h3. Current Status

h3. Detailed Description

1: The data page cache can be dropped at the following granularities:
A: the whole Ceph cluster
B: a pool, such as an erasure-coded pool
C: an object that carries a flag indicating that its data page cache should be dropped

2: For write operations
A: For an OSD op
In FileStore::_do_transaction, record the cid, oid, offset, and len of each object write.
After FileStore::sync_entry, we can start to drop the object's cached data.
B: For a subop
In handle_message, record the write info (cid, oid, offset, len).
After FileStore::sync_entry, we can start to drop the object's cached data.

3: For read operations
In FileStore::read, add a bool or flag that indicates whether to drop the page cache after the read.

4: Recovery, scrub, repair, and similar operations use the same method.
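
The three granularities in item 1 could be combined into a single decision as sketched below. All names here are hypothetical, and the precedence (object flag over pool setting over cluster default) is our assumption; this blueprint does not specify how the levels interact.

```cpp
// Hypothetical types for illustration; none of these names exist in Ceph.
enum class DropSetting { Unset, Keep, Drop };

struct DropPolicy {
  bool cluster_default = false;             // A: whole cluster
  DropSetting pool = DropSetting::Unset;    // B: per-pool setting
  DropSetting object = DropSetting::Unset;  // C: per-object flag
};

// Assumed precedence: the object flag wins over the pool setting,
// which wins over the cluster-wide default.
bool should_drop_cache(const DropPolicy& p) {
  if (p.object != DropSetting::Unset) {
    return p.object == DropSetting::Drop;
  }
  if (p.pool != DropSetting::Unset) {
    return p.pool == DropSetting::Drop;
  }
  return p.cluster_default;
}
```

With this ordering, a pool can opt out of a cluster-wide default, and a single flagged object can still override its pool.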
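
The write path in item 2 (record each written extent, then drop it once the sync has completed) can be sketched as a small queue. @WriteExtent@ and @PendingDropQueue@ are hypothetical names, and plain strings stand in for the real cid and oid types; this only illustrates the bookkeeping and does not reflect the actual FileStore code.

```cpp
#include <cstdint>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one object write: (cid, oid, offset, len).
struct WriteExtent {
  std::string cid;
  std::string oid;
  uint64_t offset;
  uint64_t len;
};

// Hypothetical queue shared by the threads that apply writes and the
// thread that performs the sync.
class PendingDropQueue {
 public:
  // Called while applying a transaction or handling a subop write.
  void record(WriteExtent e) {
    std::lock_guard<std::mutex> guard(mutex_);
    pending_.push_back(std::move(e));
  }

  // Called after a sync completes: returns every extent recorded so far
  // and leaves the queue empty, so each extent is dropped only once.
  std::vector<WriteExtent> take_synced() {
    std::lock_guard<std::mutex> guard(mutex_);
    std::vector<WriteExtent> out;
    out.swap(pending_);
    return out;
  }

 private:
  std::mutex mutex_;
  std::vector<WriteExtent> pending_;
};
```

The caller would then issue a cache-drop request for each returned extent. Waiting for the sync matters because dirty pages cannot be dropped from the page cache.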

h3. Work items

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3