Shingled Erasure Code (SHEC) » History » Version 3

Jessica Mack, 07/03/2015 09:58 PM

h1. Shingled Erasure Code (SHEC)

h3. Summary

Shingled Erasure Code (SHEC, or original SHEC) [1] is a recovery-efficient and highly configurable erasure code. SHEC has also been extended to multiple SHEC (mSHEC), whose layout is automatically composed from several original SHEC layouts according to the durability and space efficiency that each user specifies. As a result, mSHEC improves on the recovery efficiency of the original SHEC.

h3. Owners

* Takeshi Miyamae (Fujitsu Laboratories Ltd., miyamae.takeshi@jp.fujitsu.com)
* Takanori Nakao (Fujitsu Laboratories Ltd., nakao.takanori@jp.fujitsu.com)
* Kensuke Shiozawa (Fujitsu Laboratories Ltd., shiozawa.kennsu@jp.fujitsu.com)

h3. Interested Parties

h3. Current Status

We have a working prototype of SHEC for Firefly/Giant, and all of its local tests have been completed.
(Both SHEC and mSHEC work on Firefly because Firefly's erasure code plugin API and CRUSH algorithm are sufficient for them.)

h3. Detailed Description
 
h4. Original SHEC

SHEC is an erasure code with local parity groups. The calculation ranges of the local parities are shifted so that they partly overlap one another, much like shingles laid on the roof of a house. An example of SHEC’s parity layout is shown in Figure 1.
 
p=. !figure21.png!

p=. *Figure 1  Instance of SHEC’s Parity Layout*
 
SHEC(k,m,c) denotes a SHEC layout with k data chunks, m parity chunks, and durability estimator c. The durability estimator is the average number of parity chunks that cover each data chunk.
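The definition of SHEC(k,m,c) can be illustrated with a minimal Python sketch. It is not the plugin's implementation: the window length @ceil(k*c/m)@, the evenly spaced start offsets, and the wrap-around at the end of the data chunks are all assumptions made for illustration, and the function names are hypothetical. The sketch builds one possible shingled layout and checks that the average coverage per data chunk matches c.

```python
import math


def shec_layout(k, m, c):
    """Return, for each of the m parities, the data-chunk indices it covers.

    Assumption: every parity covers ceil(k*c/m) consecutive data chunks,
    and the windows are spread evenly and wrap around (not taken from the
    actual SHEC plugin).
    """
    window = math.ceil(k * c / m)
    layout = []
    for i in range(m):
        start = (i * k) // m  # assumed evenly spaced start offset
        layout.append([(start + j) % k for j in range(window)])
    return layout


def durability_estimator(k, layout):
    """Average number of parity chunks that cover each data chunk."""
    return sum(len(covered) for covered in layout) / k


layout = shec_layout(10, 6, 3)
for i, covered in enumerate(layout):
    print(f"P{i} covers data chunks {covered}")
print("durability estimator:", durability_estimator(10, layout))
```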
 
SHEC has several advantages. First, SHEC is recovery-efficient because it always uses local parities to recover failed chunks, even when multiple disks fail.
Second, SHEC is highly configurable because its specific layouts are densely plotted in the recovery-efficient region of the three-dimensional property map (space efficiency, recovery efficiency, and durability), as Figure 2 shows. Reed-Solomon code layouts, drawn as circled plots, are sparse compared with SHEC’s layouts.
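The recovery-efficiency argument can be sketched in a few lines of Python. This is a simplification under stated assumptions: the parity here is a plain XOR over a hypothetical five-chunk local group, whereas a real erasure code would use coded coefficients. It only shows that repairing one chunk from a local parity group reads the group (5 chunks here), while a Reed-Solomon code with k=10 must read 10 chunks to rebuild a single chunk.

```python
from functools import reduce


def xor_chunks(chunks):
    """Bytewise XOR of equally sized chunks."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*chunks))


k = 10
data = [bytes([i] * 4) for i in range(k)]  # ten 4-byte data chunks

# Hypothetical local parity group covering data chunks 3..7 (XOR stand-in).
window = [3, 4, 5, 6, 7]
parity = xor_chunks([data[i] for i in window])

# Data chunk 5 fails: rebuild it from the rest of its local group.
failed = 5
survivors = [data[i] for i in window if i != failed]
recovered = xor_chunks(survivors + [parity])

chunks_read = len(survivors) + 1  # surviving data chunks plus the parity
print("recovered correctly:", recovered == data[failed])
print("chunks read with local parity:", chunks_read, "versus k =", k)
```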
 
p=. !figure22.png!

p=. *Figure 2  SHEC’s Property Map*
 
h4. Multiple SHEC (mSHEC)

However, original SHEC can achieve higher recovery efficiency only by sacrificing durability or space efficiency. To reach higher recovery efficiency without these sacrifices, we combine multiple different original SHEC layouts into one (mSHEC, Figure 3).
mSHEC(k,m,c) automatically selects the most recovery-efficient layout among all combinations that satisfy m=m1+m2 and c=c1+c2, where (m1,c1) and (m2,c2) are the parameters of the combined original SHEC layouts. This selection is the main source of mSHEC's recovery efficiency.
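The selection step can be sketched as a small search in Python. Everything beyond the constraints m=m1+m2 and c=c1+c2 is an assumption: the window length @ceil(k*c/m)@, the requirement that c_i does not exceed m_i, and especially the cost function, which is a hypothetical proxy (chunks read to repair one data chunk through the sub-layout with the shorter window) rather than the metric the authors use.

```python
import math


def window_length(k, m, c):
    """Assumed number of data chunks covered by each parity of SHEC(k, m, c)."""
    return math.ceil(k * c / m)


def candidate_splits(m, c):
    """Yield every (m1, c1, m2, c2) with m = m1 + m2 and c = c1 + c2.

    Assumption: a sub-layout cannot cover a data chunk with more parities
    than it has, so c_i <= m_i is required.
    """
    for m1 in range(1, m):
        for c1 in range(1, c):
            m2, c2 = m - m1, c - c1
            if c1 <= m1 and c2 <= m2:
                yield m1, c1, m2, c2


def recovery_cost(k, split):
    """Hypothetical proxy: chunks read per single-chunk repair via the shorter window."""
    m1, c1, m2, c2 = split
    return min(window_length(k, m1, c1), window_length(k, m2, c2))


k, m, c = 10, 6, 3
best = min(candidate_splits(m, c), key=lambda split: recovery_cost(k, split))
print("selected split (m1, c1, m2, c2):", best)
print("proxy cost of selected split:", recovery_cost(k, best))
print("window length of single SHEC layout:", window_length(k, m, c))
```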
 
p=. !figure23.png!

p=. *Figure 3  Layout of Multiple SHEC*
 
In contrast with original SHEC, whose recovery efficiency is almost the same as that of Reed-Solomon code (Figure 2), mSHEC’s recovery efficiency is improved, as shown in Figure 4.
 
p=. !figure24.png!

p=. *Figure 4  mSHEC’s Property Map*
 
At first glance, mSHEC's layouts look complicated. However, they are generated automatically from the specified durability and space efficiency, so users do not need to know the details of the layouts.
 
In conclusion, mSHEC should be used where recovery efficiency is the most important criterion. Figure 5 is a diagram that helps users select an appropriate EC plugin.
 
p=. !figure25.png!

p=. *Figure 5  Diagram for EC Plugin Selection*
 
h4. References

[1] Erasure Code with Shingled Local Parity Groups for Efficient Recovery from Multiple Disk Failures (HotDep'14)
    https://www.usenix.org/conference/hotdep14/workshop-program/presentation/miyamae
    http://www.slideshare.net/miyamae1994/shec-hotdep14-slides021slideshare (presentation slides)
 
h3. Work items

Testing on Giant/Hammer releases

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3