Optimize Newstore for massive small object storage » History » Version 1

Xiaoxi Chen, 06/12/2015 06:30 AM

h3. *Optimize Newstore for massive small object storage*

*Summary*

More and more companies are adopting Ceph as their storage solution. Ceph does extremely well with RBD and large-object storage, but results from both Intel and other users clearly show that Ceph has a problem with the "Lots Of Small Files" (LOSF) workload.

In the LOSF case, the average object size is small, from tens to hundreds of KB, which is typically the size of a compressed image or an HTML, text, or PDF file. In the current approach, each object lives on the file system (FS) as an individual file, which usually means millions of files in the FS. This overloads the FS and introduces large read/write amplification, since every I/O needs to go through the whole tree.

Newstore introduced fragment_list, which decouples the logical object from its physical location, and it can use open_by_handle to reduce the cost of tree traversal. In the first design, we allowed one object to have multiple fragments. Now we would like to extend the object->fragment mapping from 1:N to N:M; that is, we want multiple objects to share one fragment.
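
To make the N:M idea concrete, here is a minimal sketch of such a mapping. It is not Newstore code: Fragment, FileId, FragmentMap and the helper functions are hypothetical names used only for illustration, and the sketch assumes that "sharing a fragment" means several objects keep their bytes in the same backing file at different offsets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration only; these are not the Newstore types.
using FileId = uint64_t;

struct Fragment {
  uint32_t offset;  // offset of this piece inside the backing file
  uint32_t length;  // length of this piece in bytes
  FileId fid;       // backing file that holds the bytes
};

// Object name -> ordered list of fragments that make up the object.
using FragmentMap = std::map<std::string, std::vector<Fragment>>;

// Number of distinct backing files referenced by all objects (the "M" side).
std::size_t count_backing_files(const FragmentMap& objects) {
  std::set<FileId> files;
  for (const auto& entry : objects)
    for (const Fragment& f : entry.second)
      files.insert(f.fid);
  return files.size();
}

// Names of the objects that keep at least one fragment in file `fid`.
std::vector<std::string> objects_in_file(const FragmentMap& objects, FileId fid) {
  std::vector<std::string> names;
  for (const auto& entry : objects) {
    for (const Fragment& f : entry.second) {
      if (f.fid == fid) {
        names.push_back(entry.first);
        break;  // list each object once
      }
    }
  }
  return names;
}

// Three small objects: two live entirely in file 7, the third spans files 7 and 8.
FragmentMap example_mapping() {
  FragmentMap m;
  m["thumb_a.jpg"].push_back(Fragment{0, 4096, 7});
  m["thumb_b.jpg"].push_back(Fragment{4096, 8192, 7});
  m["page.html"].push_back(Fragment{12288, 2048, 7});
  m["page.html"].push_back(Fragment{0, 1024, 8});
  assert(count_backing_files(m) == 2);  // sanity check: 3 objects, 2 files
  return m;
}
```

In this example three objects (N = 3) are stored in two files (M = 2), and file 7 holds data from all three objects. That suggests a shared file would need some form of reference tracking before it can be removed, although the blueprint does not specify how.
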

*Owners*

Xiaoxi CHEN (Intel)

*Interested Parties*

Xiaoxi CHEN (Intel)

Jian Zhang (Intel)

*Current Status*

Some of the required facilities already exist in Newstore: fragment_t already carries an offset and a length into the backing file.

<pre>
struct fragment_t {
  uint32_t offset;   ///< offset in file to first byte of this fragment
  uint32_t length;   ///< length of fragment/extent
  fid_t fid;         ///< file backing this fragment
};
</pre>
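
As a rough illustration of how the offset and length fields could let several small objects live in one file, the sketch below packs objects back to back and opens a new file when the current one is full. This is only one possible policy under stated assumptions: PackedFragment, PackFileId, FragmentPacker and the size cap are hypothetical and not part of Newstore, and the file id is modelled as a plain integer rather than the real fid_t.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified mirror of the three fields shown above.
using PackFileId = uint64_t;

struct PackedFragment {
  uint32_t offset;  // offset in file to the first byte of this fragment
  uint32_t length;  // length of the fragment
  PackFileId fid;   // file backing this fragment
};

// Append-only packer: each new object is placed right after the previous one
// in the current file; when it would not fit under the size cap, a fresh file
// id is used and the offset restarts at zero.
class FragmentPacker {
 public:
  explicit FragmentPacker(uint32_t max_file_bytes)
      : max_file_bytes_(max_file_bytes) {}

  PackedFragment allocate(uint32_t length) {
    // Assumption: an object never exceeds the per-file cap.
    assert(length > 0 && length <= max_file_bytes_);
    // next_offset_ never exceeds max_file_bytes_, so this cannot underflow.
    if (length > max_file_bytes_ - next_offset_) {
      ++current_fid_;    // does not fit: move on to a new backing file
      next_offset_ = 0;
    }
    PackedFragment f{next_offset_, length, current_fid_};
    next_offset_ += length;
    return f;
  }

 private:
  uint32_t max_file_bytes_;
  PackFileId current_fid_ = 1;
  uint32_t next_offset_ = 0;
};
```

With a 64 KB cap, for instance, a 40 KB object and a 20 KB object would land in the same file at offsets 0 and 40 KB, while a following 10 KB object would start a second file.
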

*Detailed Description*

This is the big one! Please provide a detailed description for the proposed change. Where appropriate, include your architectural approach, a list of systems involved, important consequences, and issues that are still unresolved.

*Work items*

This section should contain a list of work tasks created by this blueprint. Please include engineering tasks as well as related build/release and documentation work. If this blueprint requires cleanup of deprecated features, please list those tasks as well.

*Coding tasks*

Task 1

Task 2

Task 3

*Build / release tasks*

Task 1

Task 2

Task 3

*Documentation tasks*

Task 1

Task 2

Task 3

*Deprecation tasks*

Task 1

Task 2

Task 3