Hadoop over Ceph RGW status update » History » Version 1

Yuan Zhou, 06/10/2015 01:28 PM

h1. Hadoop over Ceph RGW status update

h2. Summary
The goal is to write a Hadoop-compatible filesystem (RGWFS) that allows Hadoop to run over RGW. We also want to add a load-balancing feature so that this can scale to a rack architecture.

h2. Owners
Yuan Zhou (Intel)
Jian Zhang (Intel)

h2. Interested Parties
If you are interested in contributing to this blueprint, or want to be a "speaker" during the Summit session, list your name here.
Name (Affiliation)
Name (Affiliation)
Name

h2. Current Status
In Infernalis we proposed this blueprint (Hadoop over Ceph Radosgw with SSD cache). Over the last several months we have made some progress.

h3. RGWFS
Thanks to SwiftFS, RGWFS is able to reuse a lot of existing code. The general code path is now complete: we can read and write with the Hadoop command-line tool through RGWFS, which talks to the backend RADOS cluster.

h3. RGW-Proxy
We have implemented a simple WSGI server that returns the nearest RGW instance by looking up the internal data mapping in the RADOS cluster. Given an object name, RGW-Proxy queries the cluster for the data mapping (ceph osd map pool_name obj_name) and then returns the corresponding RGW instance.
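The lookup described above can be sketched roughly as follows. This is a minimal illustration, not the actual RGW-Proxy code. The @OSD_TO_RGW@ table, the endpoint URLs, the @obj@ query parameter, the pool name, and the rule "nearest means the RGW associated with the object's acting primary OSD" are all assumptions made for the example. It also assumes the JSON output of ceph osd map contains an @acting_primary@ field.

```python
import json
import subprocess
from urllib.parse import parse_qs

# Hypothetical table: OSD id -> RGW endpoint treated as "nearest" to that OSD
# (for example, the RGW on the same host or rack). Not from the original page.
OSD_TO_RGW = {
    0: "http://rgw-a.example:7480",
    1: "http://rgw-b.example:7480",
}
DEFAULT_RGW = "http://rgw-a.example:7480"  # fallback when the OSD is unknown


def primary_osd(pool, obj, run=subprocess.check_output):
    """Return the acting primary OSD id for `obj` in `pool`.

    Runs `ceph osd map <pool> <obj> --format json`; `run` is injectable so
    the function can be exercised without a live cluster.
    """
    out = run(["ceph", "osd", "map", pool, obj, "--format", "json"])
    return json.loads(out)["acting_primary"]


def make_app(lookup=primary_osd, pool="rgw-data-pool"):
    """Build a WSGI app that answers ?obj=<name> with an RGW endpoint.

    `pool` is a placeholder name; a real deployment would use the pool that
    actually stores the RGW bucket data.
    """
    def app(environ, start_response):
        params = parse_qs(environ.get("QUERY_STRING", ""))
        obj = params.get("obj", [None])[0]
        if obj is None:
            start_response("400 Bad Request", [("Content-Type", "text/plain")])
            return [b"missing 'obj' query parameter\n"]
        # Map the object's primary OSD to the RGW endpoint assumed closest.
        endpoint = OSD_TO_RGW.get(lookup(pool, obj), DEFAULT_RGW)
        start_response("200 OK", [("Content-Type", "text/plain")])
        return [endpoint.encode("utf-8")]
    return app
```

For a quick local trial, the app returned by @make_app()@ could be served with @wsgiref.simple_server.make_server@ from the Python standard library. A client such as RGWFS would then request @/?obj=obj_name@ and send its subsequent I/O to the returned endpoint.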

h2. Detailed Description
There are a few things we are working on.

h3. Make RGWFS work with multiple RGW instances

h3. Performance testing

h2. Work items
This section should contain a list of work tasks created by this blueprint. Please include engineering tasks as well as related build/release and documentation work. If this blueprint requires cleanup of deprecated features, please list those tasks as well.

h2. Coding tasks
Task 1
Task 2
Task 3

h2. Build / release tasks
Task 1
Task 2
Task 3

h2. Documentation tasks
Task 1
Task 2
Task 3

h2. Deprecation tasks
Task 1
Task 2
Task 3