h1. Towards Ceph Cold Storage

h3. Summary

We'd like to continue the discussion about cold storage support with possible implementation ideas.

h3. Owners

* Matthias Grawinkel (Johannes Gutenberg-Universität Mainz, grawinkel@uni-mainz.de)
* Marcel Lauhoff (Universität Paderborn, Student, ml@irq0.org)

h3. Interested Parties

h3. Current Status

* Ideas
* Master's thesis topic
* Related Blueprint: https://wiki.ceph.com/Planning/Blueprints/%3CSIDEBOARD%3E/Cold_Storage_Pools

h3. Detailed Description

We think the following four features are necessary to build a Ceph cold storage system. The user stories below describe how they relate to each other.

h4. 1. CRUSH: Energy Aware Buckets

First of all, we need the power state of every OSD. Using this information, we could teach CRUSH to:

* Favor OSDs that are powered up
* Actively allow OSDs to power down by assigning weights that prevent certain OSDs from being selected, for example in a time-based round-robin fashion.

Example: consider three buckets and two replicas. Two of the buckets are powered up; one is powered down. The placement algorithm only selects the two powered-up buckets. After an hour in this configuration, one of the up buckets switches to down and the down bucket becomes up. It may also be a good idea to move the primary OSD role for each PG to the powered-on buckets.
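
The sketch below illustrates the time-based round-robin idea with today's tooling: a loop that zeroes the CRUSH weight of one hypothetical power group at a time via the stock @ceph osd crush reweight@ command. The OSD ids, grouping, and rotation period are invented for illustration; a real implementation would live inside CRUSH itself, not least because reweighting today also triggers rebalancing.

<pre><code class="python">
import subprocess
import time

# Hypothetical grouping: each inner list holds the OSD ids of one
# power-managed bucket; ids and period are made up for illustration.
POWER_GROUPS = [[0, 1], [2, 3], [4, 5]]
ROTATION_PERIOD = 3600  # seconds each group spends powered down

def set_crush_weight(osd_id, weight):
    # "ceph osd crush reweight" is the stock CLI for changing CRUSH
    # weights; a weight of 0 stops CRUSH from selecting that OSD.
    subprocess.check_call(
        ["ceph", "osd", "crush", "reweight", "osd.%d" % osd_id, str(weight)])

down_idx = 0
while True:
    for idx, group in enumerate(POWER_GROUPS):
        weight = 0.0 if idx == down_idx else 1.0
        for osd_id in group:
            set_crush_weight(osd_id, weight)
    # The zero-weighted group drains and could now be powered down;
    # after the period, the down role rotates to the next group.
    time.sleep(ROTATION_PERIOD)
    down_idx = (down_idx + 1) % len(POWER_GROUPS)
</code></pre>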

h4. 2. Object Stubs - Links to external storage

Basically: objects without their data, but with the information needed to retrieve it. Two features are necessary:

# Objects store references to external storage
# OSDs have a fetcher to retrieve external data

Clients must never access external storage directly. On access to an externalized object, the OSD's fetcher retrieves the data and re-integrates the object into the active storage pool (a minimal stub layout is sketched after the list below). Examples of external storage systems:

* LONESTAR RAID
* Ethernet drives
* Tapes
* Cloud Storage
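
As a sketch of what a stub could look like with the existing librados Python bindings: a zero-length object whose external location lives in an xattr. The pool name, xattr key, and @lonestar://@ URL scheme are assumptions made for this example.

<pre><code class="python">
import rados

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("warm")  # hypothetical warm pool

def make_stub(name, external_url):
    # Drop the payload but keep the object: zero bytes of data plus an
    # xattr recording where a fetcher could find the real data.
    ioctx.write_full(name, b"")
    ioctx.set_xattr(name, "archive.location", external_url.encode())

def stub_target(name):
    # An OSD-side fetcher would read this reference, pull the data from
    # the archive, and rewrite the object in place.
    return ioctx.get_xattr(name, "archive.location").decode()

make_stub("myobject", "lonestar://shelf3/disk7/obj42")
print(stub_target("myobject"))
</code></pre>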

h4. 3. Archiving daemon

Most HSM systems employ a data promotion and demotion daemon. A file could, for example, be demoted to slower storage after it has not been accessed for a certain time. Using the object stubs described above, this daemon could move cold data to the external archive system and replace it with references.
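
A minimal sketch of such a daemon on top of the stub layout above, assuming the same hypothetical pool and xattr convention; @archive_put()@ is a stand-in for the actual transfer to the external system.

<pre><code class="python">
import time
import rados

COLD_AFTER = 30 * 24 * 3600  # hypothetical threshold: 30 days idle

cluster = rados.Rados(conffile="/etc/ceph/ceph.conf")
cluster.connect()
ioctx = cluster.open_ioctx("warm")

def archive_put(name, data):
    # Placeholder for the actual upload to LONESTAR, tape, cloud, ...
    return "lonestar://shelf3/%s" % name

for obj in ioctx.list_objects():
    size, mtime = ioctx.stat(obj.key)  # mtime is a struct_time here
    if size == 0 or time.time() - time.mktime(mtime) < COLD_AFTER:
        continue  # already a stub, or still warm
    url = archive_put(obj.key, obj.read(size))
    ioctx.write_full(obj.key, b"")  # demote: drop the payload
    ioctx.set_xattr(obj.key, "archive.location", url.encode())
</code></pre>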

h4. 4. Archive System OSDs

So far, OSDs mostly support filesystems and key-value stores as object storage backends. For archive storage, it could be useful to add object stores that interact directly with an archive system.
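
Ceph's real ObjectStore interface is C++ and considerably richer; purely to illustrate the idea, a hypothetical backend contract and an archive-backed implementation might look like:

<pre><code class="python">
from abc import ABC, abstractmethod

class ObjectStoreBackend(ABC):
    # Made-up minimal contract: what an OSD needs from its backend.
    @abstractmethod
    def write(self, oid, data): ...
    @abstractmethod
    def read(self, oid): ...
    @abstractmethod
    def remove(self, oid): ...

class ArchiveBackend(ObjectStoreBackend):
    """Delegates redundancy and power management to the archive system;
    `client` is a hypothetical LONESTAR RAID client."""
    def __init__(self, client):
        self.client = client
    def write(self, oid, data):
        self.client.put(oid, data)   # archive handles replication
    def read(self, oid):
        return self.client.get(oid)  # may spin up a powered-down disk
    def remove(self, oid):
        self.client.delete(oid)
</code></pre>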

h4. User Stories

h4. Object Stubs, Archiving Daemon, External Archive System

The primary (warm) Ceph system serves clients directly. The archive system uses a different storage technology such as LONESTAR. The archiving daemon periodically scans object metadata and moves /cold/ objects to the archive system, replacing the former warm objects with stubs pointing to their location in the archive. (There still remains the problem of how to effectively place data on multiple archive systems.)

h4. Object Stubs, Archiving Daemon, Archive System OSDs

The archive system uses Ceph. Ceph is configured to provide object placement but not redundancy. OSDs use an object store backend that handles energy efficiency and redundancy itself, such as the LONESTAR RAID.
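
One way to express "placement but not redundancy" with today's tooling is a single-copy pool; the pool name and PG count below are examples.

<pre><code class="python">
import subprocess

# size 1 keeps a single copy, so Ceph does the placement while the
# backend provides the redundancy.
subprocess.check_call(["ceph", "osd", "pool", "create", "archive", "128"])
subprocess.check_call(["ceph", "osd", "pool", "set", "archive", "size", "1"])
subprocess.check_call(["ceph", "osd", "pool", "set", "archive", "min_size", "1"])
</code></pre>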

h4. Object Stubs, Archiving Daemon, Ceph Archive System

The archive system uses the energy-aware placement strategies described above. Ceph is configured to provide both placement and redundancy.

h3. Work items

h4. Coding tasks

# Task 1
# Task 2
# Task 3

h4. Build / release tasks

# Task 1
# Task 2
# Task 3

h4. Documentation tasks

# Task 1
# Task 2
# Task 3

h4. Deprecation tasks

# Task 1
# Task 2
# Task 3