h1. -1D - RGW Geo-Replication and Disaster Recovery

<pre>
*** rturk is now known as rturk-away 09:32
*** davidzlap has joined #ceph-summit1 09:32
*** davidzlap has left #ceph-summit1 09:32
*** rturk-away is now known as rturk 09:33
scuttlemonkey florian, want in to speak on this one? 09:33
gregaf fghaas: ^ 09:33
loicd ccourtaut: is having trouble with his video connection 09:33
loicd video / audio 09:34
fghaas I'm here, yes :) 09:34
fghaas it's ok for me to lurk though :) 09:34
scuttlemonkey k 09:34
ccourtaut i'm currently having problems between my isp/google services :/ 09:34
*** wwformat has quit IRC 09:35
*** henrycc has quit IRC 09:35
loicd http://wiki.ceph.com/01Planning/02Blueprints/Dumpling/RGW_Geo-Replication_and_Disaster_Recovery 09:36
loicd http://pad.ceph.com/p/RADOS_Gateway_refactor 09:36
*** liwang has quit IRC 09:36
gregaf loicd: hey, I finally have video of you and it's a beautiful hallway :P 09:38
*** yehuda_hm has quit IRC 09:39
lmb Are the replicas consistent? (As in, they represent a valid, consistent state of the object at any given time, even if outdated) 09:39
*** yehuda_hm has joined #ceph-summit1 09:40
sagewk lmb: yes 09:40
sagewk the (replicated) updates are atomic 09:40
sagewk but not necessarily ordered (across objects) 09:40
loicd gregaf: there you see me for real ;-) 09:41
lmb sagewk: aye 09:41
rturk hi, loic! 09:41
loicd :-D 09:41
loicd it would be useful to have links to the http://tracker.ceph.com/projects/ceph issues related to http://wiki.ceph.com/01Planning/02Blueprints/Dumpling/RGW_Geo-Replication_and_Disaster_Recovery in the pad 09:45
dmick loicd: add them! :) 09:45
rturk loicd: I'm working on a dekiwiki -> redmine plugin 09:45
rturk so you'll be able to embed them the way you embed etherpads 09:45
rturk as part of the post-summit work, I planned to share it so ppl can cross-link their work items 09:46
paravoid they're not new 09:46
paravoid but it doesn't really work 09:46
paravoid I opened several bugs about it (incl. performance ones) and they said that they weren't really well designed 09:47
paravoid they're implementing geo-replication themselves and they're not reusing this code for this 09:47
paravoid that was on swift 1.7.x, I don't know how it has evolved since 09:48
joao loicd, wave for us 09:48
scuttlemonkey paravoid: so was the feeling you got that they were abandoning that code? 09:48
scuttlemonkey or just not repurposing it for geo-replication 09:48
loicd :-) 09:48
paravoid I don't think they're abandoning it since it's there, but it's really for a different purpose from the get go 09:49
michael_dreamhost the swift-sync seems like a higher level user-controllable interface, rather than a cluster policy 09:49
*** liwang has joined #ceph-summit1 09:50
scuttlemonkey makes sense 09:50
paravoid the problem with not doing it in the rados layer is that we'll end up having 6 copies for each object 09:50
joshd fghaas: glance has their own plans for doing data transfer between regions 09:51
gregaf paravoid: not necessarily, you could do 2+1 for instance 09:51
paravoid well, not really 09:51
loicd http://tracker.ceph.com/projects/rgw/search?utf8=%E2%9C%93&issues=1&q=multisite does that cover it? 09:51
paravoid because if I have two lost drives at the same time on one of the clusters 09:51
paravoid (probable enough) 09:51
paravoid the pg would be lost 09:52
paravoid and radosgw replication won't recover it 09:52
hskinner Please consider geo-dispersion of erasure-encoded objects as you design this 09:53
fghaas gregaf: feel free to correct me if my summary of your reply is incorrect in the pad 09:53
gregaf good enough for me, but I'm blunter than they are ;) 09:53
scuttlemonkey hskinner: that would be a good thing to add to the etherpad as an "other" or something 09:54
scuttlemonkey perhaps under coding tasks and yehuda can move it around where he wants 09:54
*** chee has quit IRC 09:54
fghaas gregaf: better? 09:55
gregaf lol, I did *not* say that 09:55
*** chee has joined #ceph-summit1 09:57
pioto so, this geo replication being discussed is up at the rgw layer, and not, say, rados? 09:57
scuttlemonkey pioto: yes 09:57
pioto meaning, it won't be able to support cephfs or rbd, then? hm 09:58
scuttlemonkey pioto: the hope is to eventually move it down the stack 09:58
pioto ok 09:58
fghaas pioto: I'm sure gregaf has a "blunt" comment for you ;) 09:58
scuttlemonkey pioto: I think it's more about _using_ rgw to sync all data 09:58
lmb scuttlemonkey: the "ain't gonna happen" comment on the pad? ;-) 09:58
scuttlemonkey not limit it to rgw 09:58
nwl pioto: the driver is disaster recovery. you can use incremental snapshots of rbd with export/import between clusters to do RBD DR 09:58
scuttlemonkey but someone else can clarify 09:58
fghaas lmb: I just attributed that one :) 09:58
pioto nwl: yeah. i think there's also whole-cluster snapshots i saw somewhere 09:59
loicd what task would you advise a new coder willing to help with the implementation of geo replication to try? 09:59
pioto but i dunno if there was a good way to replicate that directly to another cluster... think it wanted a whole giant local directory to export to first? 09:59
lmb scuttlemonkey: access must go through rgw though, otherwise it can't log/journal, right? so you can't use rgw-replication on top of an arbitrary block object 09:59
fghaas lmb: that is the plan, as it stands 10:00
paravoid will it scale? 10:00
paravoid :) 10:00
paravoid for both large amounts of containers and objects 10:01
lmb fghaas: too bad, misses our use case, but clearly it's a great first step 10:01
fghaas lmb, I know -- surely there's plenty of folks that would love to see this in rbd 10:01
fghaas which is what I guess you're relating to 10:02
paravoid sagewk: ^ 10:02
rturk paravoid: I think there's a min or so delay w/the feed FYI 10:02
lmb fghaas: yeah. similarly though, perhaps it could be implemented in the client-side libraries 10:02
fghaas librbd I/O multiplex? joshd? 10:02
*** nwat has quit IRC 10:03
*** nwat has joined #ceph-summit1 10:04
*** nwat has quit IRC 10:04
</pre>
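
On the Swift comparison in the log (09:46-09:49): container sync in Swift is enabled per container by its owner, which is why michael_dreamhost calls it a user-controllable interface rather than a cluster policy. A minimal sketch with python-swiftclient; the endpoints, token, and sync key are placeholders, not values from the discussion:

<pre>
# Sketch: Swift container sync is opted into per container by the user.
# All endpoints and secrets below are hypothetical.
import swiftclient.client as swift

url = 'http://cluster-a:8080/v1/AUTH_test'
token = 'TOKEN'

# Point 'photos' on cluster A at the same-named container on cluster B.
# Both containers must share the same X-Container-Sync-Key secret.
swift.put_container(url, token, 'photos', headers={
    'x-container-sync-to': 'http://cluster-b:8080/v1/AUTH_test/photos',
    'x-container-sync-key': 'shared-secret',
})
</pre>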
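
On paravoid's copy-count point (09:50-09:52): three-way replication in two independent clusters means six copies per object, while a "2+1" layout (size 2 locally, 1 remote) cuts that to three at exactly the cost he describes: two simultaneous drive losses in the size-2 cluster can destroy a PG, and radosgw-level replication cannot rebuild a lost RADOS PG. A back-of-the-envelope sketch, assuming uniform random placement and a hypothetical 100-OSD, 4096-PG cluster:

<pre>
# Sketch: rough odds that some PG loses every replica when k OSDs die
# at once. Uniform placement is assumed; a real CRUSH map differs.
from math import comb

def pg_loss_prob(n_osds, replicas, failures, n_pgs):
    if failures < replicas:
        return 0.0  # not enough failures to wipe out any PG
    # chance that one PG's replicas all land inside the failed set
    p_one = comb(failures, replicas) / comb(n_osds, replicas)
    # approximate across PGs as if placements were independent
    return 1 - (1 - p_one) ** n_pgs

print(pg_loss_prob(100, 2, 2, 4096))  # ~0.56: "probable enough"
print(pg_loss_prob(100, 3, 2, 4096))  # 0.0: two failures cannot lose a PG
</pre>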
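
nwl's RBD disaster-recovery suggestion (09:58) can be scripted with snapshots plus export-diff/import-diff. A rough sketch; the image, snapshot, and cluster names are placeholders, and it assumes the base snapshot snap1 already exists on both clusters and that the DR cluster is reachable through a "backup" ceph config:

<pre>
# Sketch: incremental RBD replication between clusters via snapshots.
# Names are illustrative; snap1 must already exist on both sides.
import subprocess

def run(*cmd):
    subprocess.run(cmd, check=True)

# Freeze a new point-in-time snapshot on the primary cluster.
run('rbd', 'snap', 'create', 'rbd/myimage@snap2')

# Export only the blocks that changed between snap1 and snap2.
run('rbd', 'export-diff', '--from-snap', 'snap1',
    'rbd/myimage@snap2', '/tmp/myimage.delta')

# Replay the delta on the DR cluster (selected by --cluster backup).
run('rbd', '--cluster', 'backup', 'import-diff',
    '/tmp/myimage.delta', 'rbd/myimage')
</pre>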