Towards Ceph Cold Storage » History » Version 2
Jessica Mack, 07/03/2015 10:06 PM
1 | 1 | Jessica Mack | h1. Towards Ceph Cold Storage |
---|---|---|---|
2 | 1 | Jessica Mack | |
3 | 1 | Jessica Mack | h3. Summary |
4 | 1 | Jessica Mack | |
5 | 1 | Jessica Mack | We'd like to continue the discussion about cold storage support with possible implementation ideas. |
6 | 1 | Jessica Mack | |
7 | 1 | Jessica Mack | h3. Owners |
8 | 1 | Jessica Mack | |
9 | 1 | Jessica Mack | * Matthias Grawinkel (Johannes Gutenberg-Universität Mainz, grawinkel@uni-mainz.de) |
10 | 1 | Jessica Mack | * Marcel Lauhoff (Universität Paderborn, Student, ml@irq0.org) |
11 | 1 | Jessica Mack | |
12 | 1 | Jessica Mack | h3. Interested Parties |
13 | 1 | Jessica Mack | |
14 | 1 | Jessica Mack | h3. Current Status |
15 | 1 | Jessica Mack | |
16 | 1 | Jessica Mack | * Ideas |
17 | 1 | Jessica Mack | * Master's thesis topic |
18 | 1 | Jessica Mack | * Related Blueprint: https://wiki.ceph.com/Planning/Blueprints/%3CSIDEBOARD%3E/Cold_Storage_Pools |
19 | 1 | Jessica Mack | |
20 | 1 | Jessica Mack | h3. Detailed Description |
21 | 1 | Jessica Mack | |
22 | 1 | Jessica Mack | We think the following four features are necessary to build a Ceph cold storage system. The user stories below describe how they relate to each other. |
23 | 1 | Jessica Mack | |
24 | 1 | Jessica Mack | h4. 1. CRUSH: Energy Aware Buckets |
25 | 1 | Jessica Mack | |
26 | 1 | Jessica Mack | First of all we need the power state of every OSD. Using this information we could teach CRUSH to: |
27 | 1 | Jessica Mack | * Favor OSDs that are powered up |
28 | 1 | Jessica Mack | * Actively allow OSDs to power down by assigning weights that prevent certain OSDs from being selected, for example in a time-based round-robin fashion. |
29 | 1 | Jessica Mack | |
30 | 1 | Jessica Mack | Example: Consider three buckets and two replicas. Two of the buckets are powered up; one is powered down. The placement algorithm selects only the two powered-up buckets. After an hour in this configuration, one of the up buckets switches to down and the down bucket becomes up. It may also be a good idea to move the primary OSD role for PGs to the powered-on buckets. |
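
The rotation in this example could be sketched roughly as follows. This is only an illustration of the time-based round-robin idea, not CRUSH itself: real CRUSH placement is hash-based and pseudo-random, and the names @Bucket@, @powered_down_index@, @bucket_weights@ and @select_buckets@ are hypothetical. It assumes exactly one bucket rests per period and that a weight of 0.0 excludes a bucket from selection.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Bucket:
    name: str


def powered_down_index(now_s: float, n_buckets: int, period_s: float = 3600.0) -> int:
    """Index of the bucket that is allowed to power down in the current period."""
    return int(now_s // period_s) % n_buckets


def bucket_weights(buckets: List[Bucket], now_s: float, period_s: float = 3600.0) -> Dict[str, float]:
    """Assign weight 0.0 to the resting bucket and 1.0 to every other bucket."""
    down = powered_down_index(now_s, len(buckets), period_s)
    return {b.name: (0.0 if i == down else 1.0) for i, b in enumerate(buckets)}


def select_buckets(buckets: List[Bucket], replicas: int, now_s: float,
                   period_s: float = 3600.0) -> List[str]:
    """Pick replica targets from powered-up buckets only (simplified, not hash-based)."""
    weights = bucket_weights(buckets, now_s, period_s)
    up = [b.name for b in buckets if weights[b.name] > 0.0]
    if len(up) < replicas:
        raise ValueError("not enough powered-up buckets for the requested replica count")
    return up[:replicas]


buckets = [Bucket("a"), Bucket("b"), Bucket("c")]
first_hour = select_buckets(buckets, replicas=2, now_s=0.0)
second_hour = select_buckets(buckets, replicas=2, now_s=3600.0)
```

With three buckets and two replicas, every bucket rests for one hour out of three. A real implementation would additionally have to make sure that data written while a bucket was down is recovered onto it after it powers up again.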
31 | 1 | Jessica Mack | |
32 | 1 | Jessica Mack | h4. 2. Object Stubs - Links to external storage |
33 | 1 | Jessica Mack | |
34 | 1 | Jessica Mack | Basically, these are objects without their data but with the information about where to retrieve it. |
35 | 1 | Jessica Mack | Two features are necessary: |
36 | 1 | Jessica Mack | # Objects store references to external storage |
37 | 1 | Jessica Mack | # OSDs have a fetcher to retrieve external data |
38 | 1 | Jessica Mack | |
39 | 1 | Jessica Mack | Clients must never access external storage directly. When an externalized object is accessed, the OSD's fetcher retrieves the object and re-integrates it into the active storage pool. |
40 | 1 | Jessica Mack | Examples of external storage systems: |
41 | 1 | Jessica Mack | * LONESTAR RAID |
42 | 1 | Jessica Mack | * Ethernet drives |
43 | 1 | Jessica Mack | * Tapes |
44 | 1 | Jessica Mack | * Cloud Storage |
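
A minimal sketch of the stub and fetcher idea, under stated assumptions: the classes @StoredObject@, @ExternalStore@ and @Osd@ are hypothetical stand-ins and not part of any Ceph API, the external system is modeled as an in-memory dictionary, and the reference format @tape://vol7/obj-a@ is invented for illustration.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StoredObject:
    name: str
    data: Optional[bytes] = None        # None means the payload lives elsewhere
    external_ref: Optional[str] = None  # hypothetical reference into external storage

    @property
    def is_stub(self) -> bool:
        return self.data is None and self.external_ref is not None


class ExternalStore:
    """Stand-in for an archive system: a dictionary keyed by reference."""

    def __init__(self) -> None:
        self._blobs: Dict[str, bytes] = {}

    def put(self, ref: str, data: bytes) -> None:
        self._blobs[ref] = data

    def get(self, ref: str) -> bytes:
        return self._blobs[ref]


class Osd:
    """Only the OSD talks to external storage; clients call read() and nothing else."""

    def __init__(self, external: ExternalStore) -> None:
        self.pool: Dict[str, StoredObject] = {}
        self.external = external

    def read(self, name: str) -> bytes:
        obj = self.pool[name]
        if obj.is_stub:
            # Fetcher: retrieve the payload and re-integrate it into the active pool.
            obj.data = self.external.get(obj.external_ref)
            obj.external_ref = None
        return obj.data


external = ExternalStore()
external.put("tape://vol7/obj-a", b"payload")
osd = Osd(external)
osd.pool["obj-a"] = StoredObject("obj-a", external_ref="tape://vol7/obj-a")
was_stub = osd.pool["obj-a"].is_stub
payload = osd.read("obj-a")
```

Whether the external copy is deleted after re-integration is left open here; the sketch simply keeps it.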
45 | 1 | Jessica Mack | |
46 | 1 | Jessica Mack | h4. 3. Archiving daemon |
47 | 1 | Jessica Mack | |
48 | 1 | Jessica Mack | Most HSM systems employ a data promotion and demotion daemon. A file could, for example, be demoted to slower storage after it has not been accessed for a certain time. Using the Object Stubs described above, this daemon could move cold data to the external archive system and create references to it. |
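
One demotion pass of such a daemon might look like the sketch below. It is an assumption-laden illustration, not a design: @ObjectRecord@ and @demote_cold_objects@ are hypothetical names, the archive is a plain dictionary, the idle threshold @max_idle_s@ is an invented policy parameter, and the @archive://@ reference scheme is made up.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class ObjectRecord:
    name: str
    last_access_s: float                # last access time, in seconds
    data: Optional[bytes]               # None once the object has been demoted
    external_ref: Optional[str] = None  # hypothetical reference into the archive


def demote_cold_objects(pool: Dict[str, ObjectRecord], archive: Dict[str, bytes],
                        now_s: float, max_idle_s: float) -> List[str]:
    """Move objects idle for at least max_idle_s to the archive and leave stubs behind."""
    demoted = []
    for name, rec in pool.items():
        if rec.data is None:
            continue  # already a stub
        if now_s - rec.last_access_s < max_idle_s:
            continue  # still warm
        ref = f"archive://{name}"
        archive[ref] = rec.data  # copy the payload out first
        rec.external_ref = ref   # then turn the warm object into a stub
        rec.data = None
        demoted.append(name)
    return demoted


pool = {
    "hot": ObjectRecord("hot", last_access_s=900.0, data=b"h"),
    "cold": ObjectRecord("cold", last_access_s=100.0, data=b"c"),
}
archive: Dict[str, bytes] = {}
demoted = demote_cold_objects(pool, archive, now_s=1000.0, max_idle_s=500.0)
```

The payload is written to the archive before the local copy is dropped, so an interrupted pass never leaves a stub without data behind it.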
49 | 1 | Jessica Mack | |
50 | 1 | Jessica Mack | h4. 4. Archive System OSDs |
51 | 2 | Jessica Mack | |
52 | 1 | Jessica Mack | So far, OSDs mostly support filesystems and key-value stores as object storage backends. For archive storage, it could be useful to add object stores that interact directly with an archive system. |
53 | 1 | Jessica Mack | |
54 | 1 | Jessica Mack | h4. User Stories: |
55 | 1 | Jessica Mack | |
56 | 1 | Jessica Mack | h4. Object Stubs, Archiving Daemon, External Archive System |
57 | 1 | Jessica Mack | |
58 | 1 | Jessica Mack | The primary (warm) Ceph system serves clients directly. The archive system uses a different storage technology such as LONESTAR. The archive daemon periodically scans the object metadata and moves /cold/ objects to the archive system. It replaces the former warm objects with stubs pointing to their location in the archive system. (The problem of how to place data effectively on
59 | 1 | Jessica Mack | multiple archive systems remains open.) |
60 | 1 | Jessica Mack | |
61 | 2 | Jessica Mack | h4. Object Stubs, Archiving Daemon, Archive System OSDs? |
62 | 1 | Jessica Mack | |
63 | 1 | Jessica Mack | The archive system uses Ceph. Ceph is configured to provide object placement but not redundancy. OSDs use an object store backend that handles energy efficiency and redundancy, like the LONESTAR RAID. |
64 | 1 | Jessica Mack | |
65 | 1 | Jessica Mack | h4. Object Stubs, Archiving Daemon, Ceph Archive System |
66 | 1 | Jessica Mack | |
67 | 1 | Jessica Mack | Use energy-aware placement strategies. Ceph is configured to provide both placement and redundancy. |
68 | 1 | Jessica Mack | |
69 | 1 | Jessica Mack | h3. Work items |
70 | 1 | Jessica Mack | |
71 | 1 | Jessica Mack | h4. Coding tasks |
72 | 1 | Jessica Mack | |
73 | 1 | Jessica Mack | # Task 1 |
74 | 1 | Jessica Mack | # Task 2 |
75 | 1 | Jessica Mack | # Task 3 |
76 | 1 | Jessica Mack | |
77 | 1 | Jessica Mack | h4. Build / release tasks |
78 | 1 | Jessica Mack | |
79 | 1 | Jessica Mack | # Task 1 |
80 | 1 | Jessica Mack | # Task 2 |
81 | 1 | Jessica Mack | # Task 3 |
82 | 1 | Jessica Mack | |
83 | 1 | Jessica Mack | h4. Documentation tasks |
84 | 1 | Jessica Mack | |
85 | 1 | Jessica Mack | # Task 1 |
86 | 1 | Jessica Mack | # Task 2 |
87 | 1 | Jessica Mack | # Task 3 |
88 | 1 | Jessica Mack | |
89 | 1 | Jessica Mack | h4. Deprecation tasks |
90 | 1 | Jessica Mack | |
91 | 1 | Jessica Mack | # Task 1 |
92 | 1 | Jessica Mack | # Task 2 |
93 | 1 | Jessica Mack | # Task 3 |