h1. Deploying Ceph with Juju

The last few weeks have been very exciting for Inktank and Ceph. There have been a number of community examples of how people are deploying or using Ceph in the wild, from the "ComodIT orchestration example":http://ceph.com/community/deploying-ceph-with-comodit/ to the unique approach of "Synnefo delivering unified storage":http://ceph.com/community/ceph-comes-to-synnefo-and-ganeti/ with Ceph, and many others that haven't made it to the blog yet. It is a great time to be doing things with Ceph!

We at Inktank have been just as excited as anyone in the community and have been playing with a number of deployment and orchestration tools. Today I wanted to share an experiment of my own for the general consumption of the community: deploying Ceph with Canonical's relatively new deployment tool "Juju":https://jujucharms.com/ that is taking cloud deployments by storm. If you follow this guide to the end you should end up with something that looks like this:

!{width: 10%}2013-02-19_15-50-45.png!

Juju is a "next generation service deployment and orchestration framework":https://jujucharms.com/docs/faq/. The cool part about Juju is that you can use just about anything to build your Juju "charms" (recipes), from bash and your favorite scripting language all the way up to Chef and Puppet. A good portion of the knowledge for the Ceph charms developed by Clint Byrum and James Page actually came from both the Chef cookbooks and the work on ceph-deploy, which we'll cover in later installments.
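
To give a sense of how small a charm can be, here is a minimal, purely illustrative bash-only layout (a hypothetical toy, not the actual Ceph charm): a metadata.yaml describing the service plus an executable install hook.

<pre>
# hypothetical layout of a minimal bash-only charm (illustration only)
mycharm/
  metadata.yaml   # charm name, summary, and the relations it provides
  hooks/install   # executable hook Juju runs when a unit is deployed

# hooks/install could be as simple as:
#!/bin/bash
set -e
apt-get update
apt-get install -y some-package
</pre>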

For the purposes of this experiment I decided to build the environment using Amazon's EC2, but you can also use an OpenStack deployment or your own bare metal in conjunction with Canonical's "MAAS":https://maas.ubuntu.com/ product. The client machine used to spin up the bootstrap environment, and later all of the other servers, will be an Ubuntu Quantal (12.10) image, but it could be any Ubuntu box, including your laptop. The rest of the working machines will be spun up using Quantal as well.

Juju is very generous about spinning up new boxes (typically one per service), so I chose the 't1.micro' machine size for all of my boxes so that anyone playing with this guide wouldn't incur massive EC2 charges. Now, on to the meat!

h3. Getting Started

As I said, start the process by spinning up an Ubuntu 12.10 image as your client; this way you don't have to dump a bunch of software and config on your local machine. This will be the client you use to spin everything else up. Once you have your base Ubuntu install, let's add the PPA and install Juju.

<pre>
> sudo apt-add-repository ppa:juju/pkgs
> sudo apt-get update && sudo apt-get install juju
</pre>

Now that we have Juju installed we need to tell it to generate a config file.

<pre>
> juju bootstrap
</pre>

This first run will throw an error, but it creates ~/.juju/environments.yaml for you to edit. Since we're using EC2 we need to tell Juju about our credentials so it can spin up new machines and deploy new services. You'll notice that I'm using a default-series of 'quantal' for all of my node machines. This is important, since it tells Juju where and how to grab the important bits of each charm.

<pre>
> vi ~/.juju/environments.yaml

default: cephtest
environments:
  cephtest:
    type: ec2
    access-key: YOUR-ACCESS-KEY-GOES-HERE
    secret-key: YOUR-SECRET-KEY-GOES-HERE
    control-bucket: (generated by juju)
    admin-secret: (generated by juju)
    default-series: quantal
    juju-origin: ppa
    ssl-hostname-verification: true
</pre>

h3. Setting up the Bootstrap Environment

Now that Juju can interact with EC2 directly we need to get a bootstrap environment set up that will hold our configs and deploy our services. Since I can’t set the global configs yet, I need to tell it manually that this box needs to be a 't1.micro' instance.

<pre>
> juju bootstrap --constraints "instance-type=t1.micro"
</pre>

This will take a few minutes to spin up the machine and get the environment set up. Once this is completed you should be able to see the machine via the 'juju status' command.

<pre>
> juju status

2012-11-07 13:06:30,645 INFO Connecting to environment...
2012-11-07 13:06:42,313 INFO Connected to environment.
machines:
  0:
    agent-state: running
    dns-name: ec2-23-20-70-201.compute-1.amazonaws.com
    instance-id: i-d79492ab
    instance-state: running
services: {}
2012-11-07 13:06:42,408 INFO 'status' command finished successfully
</pre>

Now we have a bootstrap environment, and we can tell it that all boxes should default to 't1.micro' unless otherwise specified. There are a number of settings you can monkey with; take a look at the constraints doc for more details.

<pre>
> juju set-constraints instance-type=t1.micro
</pre>

h3. Make it Pretty!

For those who like to see a visual representation of what's happening, or just feel like letting someone else watch what's going on, Juju now has a GUI that you can use. While I wouldn't recommend using the GUI as a replacement for the command line to deploy the charms below, you can certainly use it to watch what's happening. For more mature charms (and in the future) this GUI should be more than capable of managing your resources. In any case, it's neat to have pretty pictures as you tapdance on the CLI.

If you would like to install the GUI, feel free to grab my version of the 'juju-gui' charm (at the time of this article the main charm wasn't on quantal yet):

<pre>
> juju deploy cs:~pmcgarry/quantal/juju-gui
</pre>

Once that completes (and it could take a while for everything to download and install) you'll need to 'expose' it so you can get to it:

<pre>
> juju expose juju-gui
</pre>
82
83
This will give you the ability to access the box publicly via a web browser at the ec2 address shown in 'juju status'. The detault user name and password are 'admin' and the 'admin-secret' value from your ~/.juju/environments.yaml file. Feel free to leave that up while you do the rest of this work to watch the magic happen.
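
If you don't have that value handy, it is just a field in the environments file, so a quick grep will pull it out:

<pre>
> grep admin-secret ~/.juju/environments.yaml
</pre>
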
h3. Prep for Ceph Deployment

Our Juju environment is now ready to start spinning up our Ceph cluster; we just need to do a little leg work so Juju has all the important details up front. First we need to grab a few Ceph tools:

<pre>
> sudo apt-get install ceph-common && sudo apt-get install uuid
</pre>

We need to generate a UUID and an auth key for Ceph to use.

<pre>
> uuid
</pre>

Insert this as the fsid value below.

<pre>
> ceph-authtool /dev/stdout --name=$NAME --gen-key
</pre>

Insert this as the monitor-secret value below.
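
If you'd rather not copy and paste by hand, here is a minimal sketch that captures both values into shell variables (it assumes ceph-authtool prints the key in the form 'key = <secret>' and uses 'mon.' as the entity name):

<pre>
> FSID=$(uuid)
> MON_SECRET=$(ceph-authtool /dev/stdout --name=mon. --gen-key | awk '/key/ {print $NF}')
> echo "fsid: $FSID"
> echo "monitor-secret: $MON_SECRET"
</pre>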

Now we need to drop these (and a few other) values into our yaml file:

<pre>
> vi ceph.yaml

ceph:
    source: http://ceph.com/debian-bobtail/ quantal main
    fsid: d78ae656-7476-11e2-a532-1231390a9d4b
    monitor-secret: AQDcNRlR6MMZNRAAWw3iAobsJ1MLoFBLJYo4yg==

ceph-osd:
    source: http://ceph.com/debian-bobtail/ quantal main
    osd-devices: /dev/xvdf

ceph-radosgw:
    source: http://ceph.com/debian-bobtail/ quantal main
</pre>

You'll notice we're also passing a 'source' item to Juju; this tells the charm where to grab the appropriate code for Ceph, in this case the "latest release":http://ceph.com/resources/downloads/ (Bobtail 0.56.3 when this was written) from Ceph.com.

h3. Tail Those Logs!

Since a good portion of this setup is experimental, it's a good idea to tail the logs. Thankfully, Juju makes this extremely easy for you to do. Simply open a second terminal window, ssh to your client machine, and type:

<pre>
> juju debug-log
</pre>

This will aggregate all of the logs from your cluster into a single output for easy browsing in case something goes wrong.


h3. Deploying Ceph Monitors


Time to start deploying our Ceph cluster! In this case we’re going to deploy the first three machines with ceph-mon (Ceph monitors) since we typically recommend at least three in order to reach a quorum. You'll want to wait until all three machines are up before moving on.

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph
</pre>

You'll notice that while these charms are in the charm store (cs:) they are off in my own user space. This is because I had to make a few tweaky changes for these charms to deploy happily on EC2 and use Bobtail and Quantal. These charms are still a bit new, so if you have tweaks or changes feel free to give me a shout, or play with the main Ceph charms on jujucharms.com. In the future you'll be able to deploy using just 'ceph' instead of anyone's user space.

<pre>
EXAMPLE: > juju deploy -n 3 --config ceph.yaml ceph
</pre>

This could take a while, so just keep checking 'juju status' until you have the machines running AND the agents set to 'started.' You should also see the debug-log go through a flurry of activity when it starts getting close to the end.
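
If you'd rather not keep re-running that by hand, a simple watch loop (just a shell convenience, not a Juju feature) works nicely:

<pre>
> watch -n 30 juju status
</pre>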

Once we have the monitors up and running you can take a look at what your deployment looks like. If you want to, you can even ssh into one of the machines using Juju’s built-in ssh tool.

<pre>
> juju status

machines:
  0:
    agent-state: running
    dns-name: ec2-50-16-15-64.compute-1.amazonaws.com
    instance-id: i-2b45f657
    instance-state: running
  1:
    agent-state: running
    dns-name: ec2-50-19-23-167.compute-1.amazonaws.com
    instance-id: i-3b368547
    instance-state: running
  2:
    agent-state: running
    dns-name: ec2-107-22-128-107.compute-1.amazonaws.com
    instance-id: i-1f368563
    instance-state: running
  3:
    agent-state: running
    dns-name: ec2-174-129-51-96.compute-1.amazonaws.com
    instance-id: i-15368569
    instance-state: running
services:
  ceph:
    charm: cs:~pmcgarry/quantal/ceph-0
    relations:
      mon:
      - ceph
    units:
      ceph/0:
        agent-state: started
        machine: 1
        public-address: ec2-50-19-23-167.compute-1.amazonaws.com
      ceph/1:
        agent-state: started
        machine: 2
        public-address: ec2-107-22-128-107.compute-1.amazonaws.com
      ceph/2:
        agent-state: started
        machine: 3
        public-address: ec2-174-129-51-96.compute-1.amazonaws.com
</pre>

<pre>
> juju ssh ceph/0 sudo ceph -s

   health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e1: 0 osds: 0 up, 0 in
    pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
   mdsmap e1: 0/0/1 up
</pre>

From this status we can see that there are three monitors up ("monmap e2: 3 mons at {...}") and no OSDs ("osdmap e1: 0 osds: 0 up, 0 in"). Time to spin up some homes for those bits!

h3. Deploying OSDs

Once our monitors look healthy it’s time to spin up some OSDs. Feel free to drop in as many as you please; for the purposes of this experiment I chose to spin up three.

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-osd
</pre>

That will take a little bit to complete, so you may want to go grab an infusion of caffeine at this point. One thing to keep in mind is that earlier, in our ceph.yaml, we defined the physical device for our OSDs as /dev/xvdf. If you are familiar with EC2 you will know that that device doesn't exist yet, so our OSD deploy command will spin up and configure boxes, but we're not quite there yet.

When you get back, if you take a look with 'juju status' you should now see a bunch of new machines and a new section called ceph-osd:

<pre>
> juju status

  ceph-osd:
    charm: cs:~pmcgarry/quantal/ceph-osd-0
    relations: {}
    units:
      ceph-osd/0:
        agent-state: started
        machine: 4
        public-address: ec2-174-129-82-169.compute-1.amazonaws.com
      ceph-osd/1:
        agent-state: started
        machine: 5
        public-address: ec2-50-16-0-95.compute-1.amazonaws.com
      ceph-osd/2:
        agent-state: started
        machine: 6
        public-address: ec2-75-101-175-213.compute-1.amazonaws.com
</pre>

Now we need to actually give it the disks it needs. Via your EC2 console (or using the EC2 command line tools) you need to spin up three EBS volumes and attach one to each of your OSD machines. If you need help, there is a pretty decent, concise walkthrough at:
http://www.webmastersessions.com/how-to-attach-ebs-volume-to-amazon-ec2-instance
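
If you prefer to stay on the command line, a rough sketch with the classic ec2-api-tools looks something like this (the size, availability zone, and volume/instance IDs are placeholders; a volume attached as /dev/sdf typically shows up inside a recent Ubuntu instance as /dev/xvdf):

<pre>
# create one volume per OSD node, in the same availability zone as the instances
> ec2-create-volume -s 10 -z us-east-1a
# attach a volume to each OSD instance (repeat for each node)
> ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdf
</pre>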

Once you have the volumes attached we need to tell Juju to go back and use them:

<pre>
> juju set ceph-osd "osd-devices=/dev/xvdf"
</pre>

This will trigger a rescan and get your OSDs functioning. All that’s left now is to connect our monitor cluster with the new pool of OSDs.

<pre>
> juju add-relation ceph-osd ceph
</pre>

We can ssh into one of the Ceph boxes and take a look at our cluster now:

<pre>
> juju ssh ceph/0
> sudo ceph -s

   health HEALTH_OK
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e10: 3 osds: 3 up, 3 in
    pgmap v115: 208 pgs: 208 active+clean; 0 bytes data, 3102 MB used, 27584 MB / 30686 MB avail
   mdsmap e1: 0/0/1 up
</pre>

Congratulations, you now have a Ceph cluster! Feel free to write a few apps against it, show it to all of your friends, or just nuke it and start refining your chops for a production deployment.
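
If you want a quick smoke test before showing it off, you can push an object into one of the default pools straight from a monitor node (a minimal sketch; 'data' was one of the default pools in Bobtail):

<pre>
> juju ssh ceph/0
> echo "hello ceph" > hello.txt
> sudo rados -p data put hello-object hello.txt
> sudo rados -p data ls
> sudo rados df
</pre>
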
h3. Extra Credit

Since that Juju GUI screen looked so empty, I decided I wanted to play a bit more with the tools at my disposal. If you would like to take this exercise a bit further, we can also add a few RADOS Gateway machines and load-balance them behind an haproxy machine. Doing this takes only a few more commands with Juju:

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-radosgw
> juju expose ceph-radosgw

> juju deploy cs:~pmcgarry/quantal/haproxy
> juju expose haproxy

> juju add-relation ceph-radosgw haproxy
</pre>

That should be it! You'll notice that I have my own copy of the haproxy charm; this is simply because it isn't technically released for quantal yet, but my (unmodified) version seems to run just fine.
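
To confirm the gateway is answering through the proxy, point curl at the haproxy unit's public address from 'juju status'; you should get an HTTP response with an S3-style XML body back from radosgw (the exact status and body will depend on how the gateway is configured):

<pre>
> curl -i http://ec2-your-haproxy-address.compute-1.amazonaws.com/
</pre>
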
h3. Troubleshooting

Juju actually makes troubleshooting and iterative development VERY easy (one of my favorite things about it). If you would like to delve deeper into playing with Juju, I highly recommend reading their docs, which are quite good. However, one of the most useful tools (beyond the debug-log I mentioned earlier) is the ability to step through the hooks as Juju tries to run them. For example, let's say we tried to deploy Ceph and 'juju status' was telling us there was an 'install-error.' We could use our second terminal window to execute the following:

<pre>
> juju debug-hooks ceph/0
</pre>

This allows us to debug the execution of the hooks on a specific machine (in this case ceph/0). Now in our main window we can type:

<pre>
> juju resolved --retry ceph/0
</pre>

We get a preformatted setup in our 'debug-hooks' window with an indication at the bottom that we're on the "install" hook. From here we can change to the hooks directory and rerun the install hook:

<pre>
> cd hooks
> ./install
</pre>

From here we can troubleshoot errors on this box before going back and pushing a patch to Launchpad.net. I won't try to recreate the expansive documentation on the jujucharms site, but fiddling with Juju has been far less frustrating than some other orchestration frameworks I have poked at recently. Good luck, and happy charming!
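
For reference, getting a fix back up to Launchpad is ordinary bzr; a rough sketch (the branch names here are purely illustrative) looks like this:

<pre>
> bzr branch lp:~pmcgarry/charms/quantal/ceph/trunk ceph-charm
> cd ceph-charm
# edit the offending hook, then:
> bzr commit -m "Fix install hook on EC2"
> bzr push lp:~youruser/charms/quantal/ceph/fixes
</pre>
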
h3. Cleaning Up

If you would like to close up shop you can either destroy just the services (if you want to keep the machines running for deploying other Juju tests):

<pre>
> juju destroy-service ceph
> juju destroy-service ceph-osd
> juju destroy-service ceph-radosgw
> juju destroy-service haproxy
</pre>

...or just drop some dynamite on the whole thing (this will kill everything but your client machine, including your bootstrap environment):

<pre>
> juju destroy-environment
</pre>

h3. Wrap Up

You are now a seasoned veteran of Ceph deployment; what more could you want? If you do have questions, comments, or anything for the good of the cause, we would love to hear about it. Currently the best way to get help or give feedback is in our "#Ceph irc channel":irc://irc.oftc.net:6667/ceph but our "mailing lists":http://ceph.com/resources/mailing-list-irc/ are also pretty active. For Juju-specific feedback you can also hit up the "#Juju irc channel":irc://irc.freenode.net:6667/juju. If you see any egregious errors in this writeup or would like to know more about Ceph community plans, feel free to send email to patrick at inktank dot com.

<pre>
scuttlemonkey out
</pre>