OSD - add flexible cache control of object data » History » Version 1

Jessica Mack, 07/03/2015 08:36 PM

h1. OSD - add flexible cache control of object data

h3. Summary

By default, an OSD that uses a file system as its backend keeps the data of every object in the page cache after each write. When those pages are released depends on kernel settings (/proc/sys/vm/*).

In a typical large cluster, however, this does not make much sense:

1) Most objects are not read back immediately after they are written.
2) Keeping large objects cached consumes too much memory.
3) If there is a cache tier, data is mostly not accessed from the base tier for a while once it has been promoted.
4) The data page cache can be dropped by writing to /proc/sys/vm/drop_caches, but that works at the system level and offers no per-object flexibility.
 
We propose a new feature that lets the OSD drop the page cache for data that will not be accessed in the near future.
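
One way to drop the cached pages of a single file range on Linux is @posix_fadvise@ with @POSIX_FADV_DONTNEED@. The following is only a minimal sketch under that assumption; the proposal does not name the system call to use, and @drop_range_from_page_cache@ is a hypothetical helper, not a Ceph function.

```cpp
#include <fcntl.h>      // posix_fadvise, POSIX_FADV_DONTNEED
#include <unistd.h>     // fdatasync
#include <sys/types.h>  // off_t
#include <cerrno>       // errno

// Hypothetical helper, not part of Ceph.
// Asks the kernel to drop the cached pages of [offset, offset + len) of fd.
// A len of 0 means "from offset to the end of the file".
// Returns 0 on success or an errno-style error code.
int drop_range_from_page_cache(int fd, off_t offset, off_t len) {
  // Pages that have not been written back yet are not discarded,
  // so flush the file data first.
  if (fdatasync(fd) != 0) {
    return errno;
  }
  // posix_fadvise returns the error number directly; it does not set errno.
  return posix_fadvise(fd, offset, len, POSIX_FADV_DONTNEED);
}
```

The call is advisory: the kernel may keep pages that are still mapped or in use elsewhere, so a return value of 0 does not guarantee that every page was released.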
 
The advantages of this feature are:

1: It frees a lot of memory for other uses, such as the inode and dentry caches.
2: Hosts with a small amount of memory can be used as storage nodes.

h3. Owners

* Jianpeng Ma (Intel)
* Yuan Zhou (Intel)
* Jian Zhang (Intel)
* Jiangang Duan (Intel)

h3. Interested Parties

* Name (Affiliation)
* Name (Affiliation)
* Name

h3. Current Status
 
h3. Detailed Description

1: The data page cache can be dropped at three granularities:
  A: the whole Ceph cluster
  B: a pool, for example an erasure-coded pool
  C: a single object that has a flag set to indicate that its data page cache should be dropped
 
2: For write operations
  A: For an OSD op
     In FileStore::_do_transaction, record the cid, oid, offset and len of each object write.
     After FileStore::sync_entry, once the data has been written back, the page cache for the recorded ranges can be dropped.
  B: For a subop
     In handle_message, record the same information (cid, oid, offset, len) for each object write.
     After FileStore::sync_entry, once the data has been written back, the page cache for the recorded ranges can be dropped.
3: For read operations
    In FileStore::read, add a bool or flag that indicates whether to drop the page cache after the read.
4: Recovery, scrub, repair and similar operations use the same method.
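
The three granularities in item 1 could be resolved into a single yes/no decision per object. The sketch below assumes that the most specific setting wins (object flag, then pool, then cluster); the proposal does not specify a precedence. @DropCachePolicy@, @PoolSetting@ and @should_drop_cache@ are hypothetical names, not Ceph options or types.

```cpp
// All names here are hypothetical; they are not Ceph options or types.
enum class PoolSetting { Inherit, Drop, Keep };

struct DropCachePolicy {
  bool cluster_default = false;             // A: setting for the whole cluster
  PoolSetting pool = PoolSetting::Inherit;  // B: per-pool setting, if any
};

// Decide whether the data page cache of one object should be dropped.
// Assumption: an object flag wins over the pool setting, which wins over
// the cluster-wide default.
bool should_drop_cache(const DropCachePolicy& policy, bool object_flag_set) {
  if (object_flag_set) {
    return true;                            // C: the object asked for it
  }
  if (policy.pool == PoolSetting::Drop) {
    return true;
  }
  if (policy.pool == PoolSetting::Keep) {
    return false;
  }
  return policy.cluster_default;
}
```

With this ordering, a pool set to @Keep@ still drops the cache for an object that carries the flag, while unflagged objects in that pool stay cached even if the cluster default is to drop.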
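
The write path in item 2 (record each write, then drop after the sync) can be sketched as a small queue. This is an illustration only: @WrittenExtent@ and @PendingDrops@ are hypothetical, plain strings stand in for FileStore's own cid and oid types, and the actual drop is left to a callback supplied by the caller.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <string>
#include <utility>
#include <vector>

// Hypothetical record of one object write; not a Ceph type.
struct WrittenExtent {
  std::string cid;       // collection id
  std::string oid;       // object id
  std::uint64_t offset;  // start of the written range
  std::uint64_t len;     // length of the written range
};

class PendingDrops {
 public:
  // Called on the write path for each object write.
  void record(WrittenExtent extent) {
    std::lock_guard<std::mutex> lock(mutex_);
    pending_.push_back(std::move(extent));
  }

  // Called after a sync has completed. Hands every extent recorded so far
  // to drop_fn and returns how many extents were handed over.
  template <typename DropFn>
  std::size_t drain(DropFn drop_fn) {
    std::vector<WrittenExtent> batch;
    {
      // Take the whole batch under the lock, then call drop_fn without it.
      std::lock_guard<std::mutex> lock(mutex_);
      batch.swap(pending_);
    }
    for (const WrittenExtent& extent : batch) {
      drop_fn(extent);
    }
    return batch.size();
  }

 private:
  std::mutex mutex_;
  std::vector<WrittenExtent> pending_;
};
```

The sketch assumes that @drain@ is only called once the sync covers every recorded write; a write recorded while a sync is already in progress would need to wait for the next one.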

h3. Work items

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3