Shingled Erasure Code (SHEC)


Shingled Erasure Code (SHEC or original SHEC) [1] is a recovery-efficient and highly-configurable erasure code. Moreover, SHEC is extended to multiple SHEC (mSHEC), whose layout is automatically combined from several original SHEC layouts in response to durability and space efficiency each user specifies and, as a result, recovery efficiency is improved from the original SHEC.


Interested Parties

Current Status

We have a working prototype of SHEC for Firefly/Giant, and all of its local tests are finished.
(Both SHEC and mSHEC work on Firefly because Firefly's erasure code plugin API and CRUSH algorithm are sufficient for them.)

Detailed Description

1. Original SHEC
SHEC is an erasure code with local parity groups, and the calculation ranges of local parities are shifted and partly overlap with each other, similar to arranging shingles on the roof of a house. An instance of SHEC’s parity layout is shown in Figure 1.

Figure 1 Instance of SHEC’s Parity Layout

SHEC (k,m,c) means a SHEC’s layout which has k data chunks, m parity chunks and durability estimator c. Durability estimator is the average number of parity chunks which cover each data chunk.

SHEC has several advantages. First, SHEC is recovery efficient because SHEC always uses local parities in order to recover failed chunks even in case of multiple disk failures.
Next, SHEC is highly-configurable because SHEC’s specific layouts are densely plotted in recovery efficient area of three dimensional (space efficiency, recovery efficiency and durability) property map as Figure 2 shows. Reed Solomon code layouts are depicted as circled plots and are sparse, compared to the SHEC’s layouts.

Figure 2 SHEC’s Property Map

2. Multiple SHEC (mSHEC)
However, original SHEC can obtain higher recovery efficiency only at the sacrifice of durability or space efficiency. In order to reach higher recovery efficiency without the sacrifices, we combine multiple different original SHEC layouts into one (mSHEC, Figure 3).
mSHEC(k,m,c) automatically selects the most recovery efficient layout among all that satisfy m=m1+m2 and c=c1+c2. This is the main source of mSHEC's recovery efficiency.

Figure 3 Layout of Multiple SHEC

In contrast with original SHEC, whose recovery efficiency was almost the same as Reed Solomon code (Figure 2), mSHEC’s recovery efficiency is improved as shown in Figure 4.

Figure 4 mSHEC’s Property Map

At a glance, mSHEC's layouts look complicated, but mSHEC's layouts are also automatically generated in response to durability and space efficiency and users are not required to know details of the layouts.

In conclusion, mSHEC should be used in case where recovery efficiency is relatively important. Figure 5 is a diagram in order to select an appropriate EC plugin for users.

Figure 5 Diagram for EC Plugin Selection

[1] Erasure Code with Shingled Local Parity Groups for Efficient Recovery from Multiple Disk Failures (HotDep'14) (presentation slides)

Work items

Testing on Giant/Hammer releases

Coding tasks

  1. Task 1
  2. Task 2
  3. Task 3

Build / release tasks

  1. Task 1
  2. Task 2
  3. Task 3

Documentation tasks

  1. Task 1
  2. Task 2
  3. Task 3

Deprecation tasks

  1. Task 1
  2. Task 2
  3. Task 3

figure21.png View (59.5 KB) Jessica Mack, 07/03/2015 09:52 PM

figure22.png View (91.7 KB) Jessica Mack, 07/03/2015 09:52 PM

figure23.png View (14.3 KB) Jessica Mack, 07/03/2015 09:52 PM

figure24.png View (44.2 KB) Jessica Mack, 07/03/2015 09:52 PM

figure25.png View (14.2 KB) Jessica Mack, 07/03/2015 09:52 PM

figure251.png View (11.1 KB) Takeshi Miyamae, 10/20/2015 06:23 AM