h1. 1D - RGW Geo-Replication and Disaster Recovery

h3. Live Pad

The live pad can be found here: "[pad]":http://pad.ceph.com/p/RGW_Geo-Replication_and_Disaster_Recovery

h3. Summit Snapshot

Useful links:

p((. http://wiki.ceph.com/01Planning/02Blueprints/Dumpling/RGW_Geo-Replication_and_Disaster_Recovery
Coding tasks

# enabling features (within the gateway)
** create regions
** new replication/placement attributes on buckets and objects
** additional logging of replication-enabling information
** implement the log access/management APIs
** test suite for the new log access/management APIs
** COPY across regions
# replication agent (free-standing application; see the sketch after this list)
** track logs to identify changes
** propagate changes to secondary sites
** truncate no-longer-interesting logs
** test suite for update detection and propagation
# replication management APIs and console (free-standing application)
** definition of regions and zones
** management of bucket replication attributes
** enabling, disabling, and monitoring of the replication agent
** monitoring and reporting of replication status
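To make the agent's responsibilities above concrete (track logs, propagate changes, truncate logs), here is a minimal Python sketch of the polling loop. The /admin/log endpoint, its parameters, the entry format, and the host names are illustrative assumptions, not the final RGW admin API.

<pre>
# Minimal sketch of the free-standing replication agent loop.
# Endpoint paths, parameters, and the entry format are illustrative
# assumptions, not the final RGW admin API.
import time
import requests

MASTER = "http://rgw-master.example.com"        # assumed master gateway
SECONDARY = "http://rgw-secondary.example.com"  # assumed secondary gateway


def poll_log(marker=None):
    """Fetch new change-log entries from the master gateway."""
    params = {"type": "data"}
    if marker:
        params["marker"] = marker
    resp = requests.get(MASTER + "/admin/log", params=params)
    resp.raise_for_status()
    return resp.json()          # assumed: a list of {"marker": ..., ...} dicts


def apply_entry(entry):
    """Propagate one logged change to the secondary site (placeholder)."""
    # A real agent would re-fetch the affected bucket/object from the master
    # and write it to the secondary over its REST API.
    print("would replicate to", SECONDARY, ":", entry)


def trim_log(marker):
    """Tell the master that entries up to 'marker' are no longer needed."""
    requests.delete(MASTER + "/admin/log",
                    params={"type": "data", "marker": marker})


def run():
    marker = None
    while True:
        for entry in poll_log(marker):
            apply_entry(entry)
            marker = entry.get("marker", marker)
        if marker:
            trim_log(marker)     # truncate no-longer-interesting log entries
        time.sleep(30)           # poll interval; tune for acceptable lag


if __name__ == "__main__":
    run()
</pre>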
Documentation tasks

# Document and review the new RESTful APIs to access and manage the logs
# Document the relevant log entries
# Document and review the replication agent management interfaces

Questions for Yehuda:
p((. Will there be support for something akin to Swift Container Synchronization, or will Container Sync be supported directly for the Swift API? (see also http://goo.gl/4IZoi)
Would you have to set up replication on a per-bucket basis, or would it be possible to sync all buckets to a second cluster for DR (and swap masters) with a single config knob?
Yehuda's reply: per-bucket/per-container granularity is part of the design, but the initial implementation will come with per-region granularity.
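To make the granularity distinction concrete, here is a rough sketch of what region-level settings (initial implementation) versus per-bucket settings (later) might look like; the field names are illustrative, not the final schema.

<pre>
# Illustrative only: possible shapes for region-level vs. per-bucket
# replication settings; the real schema is part of the design work above.

# Initial implementation: granularity is the region/zone.
region_config = {
    "name": "us",
    "master_zone": "us-east",
    "zones": ["us-east", "us-west"],   # every bucket in the region replicates
}

# Later: per-bucket/per-container granularity layered on top.
bucket_replication_attrs = {
    "bucket": "photos",
    "replicate_to": ["us-west"],       # this bucket opts in explicitly
    "enabled": True,
}
</pre>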
p((. Is there a plan to eventually refactor this in the context of a RADOS-level replication API (such that async replication could conceivably be extended to RBD, CephFS, and RADOS)?
Yehuda's/Sage's/Greg's reply: this would involve a major overhaul of RADOS internals -- "don't hold your breath"
p((. As an extension of that, how difficult will it be for a user to reverse the direction of replication in general? (This is one of the major pain points in GlusterFS geo-replication.)
-> not too hard; conflict resolution will be simpler than in GlusterFS because of the simplified object model; something like "newest version wins" will work
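As a rough illustration of why the simplified object model makes this easy, a "newest version wins" resolver only has to compare per-object timestamps (a sketch, not the actual agent code):

<pre>
# Sketch of "newest version wins" conflict resolution between two sites.
def resolve(local, remote):
    """Each argument is a dict with at least an 'mtime' (epoch seconds)."""
    return local if local["mtime"] >= remote["mtime"] else remote


local_copy = {"name": "obj1", "mtime": 1371800000, "site": "us-east"}
remote_copy = {"name": "obj1", "mtime": 1371800420, "site": "us-west"}
print(resolve(local_copy, remote_copy)["site"])   # newer mtime wins: us-west
</pre>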
p((. How does this recover from an erroneous switchover, or from being active on multiple sites at once? Does that require a full resync?
-> depends on how long the logs are and how long things are disconnected
p((. What transport will be used for replication? HTTP(S)?
-> it will be REST-based, so HTTP or HTTPS
p((. Is the size of the log limited, and what happens when it overflows? (Because it is not pulled frequently enough, perhaps during a temporary sync outage, ...)
-> not yet decided; options are a size limit, trimming controlled by the agents that are pulling, or a combination of the two
Similarly, is the log replicated, or held on only one gateway? What happens to replication if that gateway fails?
-> the log is stored in RADOS, so it is durable
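The two bounding options mentioned in the reply (a hard size cap, and trimming driven by the pulling agents) could be combined roughly as below; the numbers and structure are purely illustrative.

<pre>
# Sketch of combining a hard size cap with agent-driven trimming.
from collections import deque

MAX_ENTRIES = 10000                 # illustrative hard cap
log = deque(maxlen=MAX_ENTRIES)     # oldest entries fall off at the cap

# Last sequence number each pulling agent has acknowledged.
agent_positions = {"agent-us-west": 0, "agent-eu": 0}


def trim_to_slowest():
    """Drop entries that every agent has already consumed."""
    safe = min(agent_positions.values())
    while log and log[0]["seq"] < safe:
        log.popleft()


log.append({"seq": 1, "op": "put", "bucket": "photos", "object": "a.jpg"})
agent_positions["agent-us-west"] = 2
agent_positions["agent-eu"] = 2
trim_to_slowest()                   # seq 1 has been seen everywhere, so it is dropped
</pre>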
p((. Please keep geo-dispersion of erasure-coded objects in mind while designing this. Geo-dispersion of erasure-coded objects is seen as a big cost reduction. <hskinner>
-> and, pretty please, what about normal replication syncing to an erasure-coded backup?
p((. Will it scale to large numbers of containers and objects?
-> for many containers, we shard the set of updated buckets
-> for large containers, the bottleneck will be the bucket index/log, which will eventually be solved in a way that also addresses the large-container problem (by sharding the bucket index)
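For the "many containers" case, sharding the set of updated buckets can be as simple as hashing the bucket name into a fixed number of log shards; the shard count and hash choice below are illustrative.

<pre>
# Sketch: spread the "updated buckets" log across a fixed number of shards
# so no single log object becomes a hot spot.
import hashlib

NUM_SHARDS = 64   # illustrative; the real count is an implementation choice


def shard_for_bucket(bucket_name):
    digest = hashlib.md5(bucket_name.encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


print(shard_for_bucket("photos"))   # which shard records updates to this bucket
</pre>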