h1. Deploying Ceph with Juju

{{toc}}

The last few weeks have been very exciting for Inktank and Ceph. There have been a number of community examples of how people are deploying or using Ceph in the wild, from the "ComodIT orchestration example":http://ceph.com/community/deploying-ceph-with-comodit/ to the unique approach of "Synnefo delivering unified storage":http://ceph.com/community/ceph-comes-to-synnefo-and-ganeti/ with Ceph, and many others that haven't made it to the blog yet. It is a great time to be doing things with Ceph!

We at Inktank have been just as excited as anyone in the community and have been playing with a number of deployment and orchestration tools. Today I wanted to share an experiment of my own for the general consumption of the community: deploying Ceph with Canonical's relatively new deployment tool "Juju":https://jujucharms.com/, which is taking cloud deployments by storm. If you follow this guide to the end you should end up with something that looks like this:

!{width: 30%}http://ceph.com/wp-content/uploads/2012/11/2013-02-19_15-50-45.png!:http://ceph.com/wp-content/uploads/2012/11/2013-02-19_15-50-45.png

Juju is a "next generation service deployment and orchestration framework":https://jujucharms.com/docs/faq/. The cool part about Juju is that you can use just about anything to build your Juju "charms" (recipes), from bash and your favorite scripting language all the way up to Chef and Puppet. A good portion of the knowledge for the Ceph charms developed by Clint Byrum and James Page actually came from both the Chef cookbooks and the work on ceph-deploy, which we'll cover in later installments.
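
To make that concrete, here is a rough sketch of what a minimal charm might look like on disk. The 'my-service' name and package below are made up for illustration; the real Ceph charms add configuration options and relation hooks on top of this basic layout.

<pre>
my-service/
  metadata.yaml      # charm name, summary, and description
  hooks/
    install          # executable, runs when a unit is first deployed
    start            # executable, runs to start the service
    stop             # executable, runs to stop the service

# hooks/install can be plain bash, for example:
#!/bin/bash
set -e
apt-get install -y my-service-package
</pre>
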
For the purposes of this experiment I decided to build the environment using Amazon's EC2, but you can also use an OpenStack deployment or your own bare metal in conjunction with Canonical's "MAAS":https://maas.ubuntu.com/ product. The client machine used to spin up the bootstrap environment and then later spin up all the other servers will be an Ubuntu Quantal (12.10) image, but it could be any Ubuntu box, including your laptop. The rest of the working machines will be spun up using Quantal as well.

Juju is very generous about spinning up new boxes (typically one per service), so I chose to spin up all of my boxes using the 't1.micro' machine size so that anyone playing with this guide wouldn't incur massive EC2 charges. Now, on to the meat!

h3. Getting Started

As I said, start the process by spinning up an Ubuntu 12.10 image as your client; this way you don't have to dump a bunch of software and config on your local machine. This will be the client you use to spin everything else up. Once you have your base Ubuntu install, let's add the PPA and install Juju.

<pre>
> sudo apt-add-repository ppa:juju/pkgs
> sudo apt-get update && sudo apt-get install juju
</pre>

Now that we have Juju installed we need to tell it to generate a config file.

<pre>
> juju bootstrap
</pre>

This will throw an error, but it creates ~/.juju/environments.yaml for you to edit. Since we're using EC2 we need to tell Juju about our credentials so it can spin up new machines and deploy new services. You'll notice that I'm using a default-series of 'quantal' for all of my node machines. This is important, since it tells Juju where and how to grab the important bits of each charm.

<pre>
> vi ~/.juju/environments.yaml

default: cephtest
environments:
  cephtest:
    type: ec2
    access-key: YOUR-ACCESS-KEY-GOES-HERE
    secret-key: YOUR-SECRET-KEY-GOES-HERE
    control-bucket: (generated by juju)
    admin-secret: (generated by juju)
    default-series: quantal
    juju-origin: ppa
    ssl-hostname-verification: true
</pre>


h3. Setting up the Bootstrap Environment

Now that Juju can interact with EC2 directly we need to get a bootstrap environment set up that will hold our configs and deploy our services. Since I can't set the global configs yet, I need to tell it manually that this box needs to be a 't1.micro' instance.

<pre>
> juju bootstrap --constraints "instance-type=t1.micro"
</pre>

This will take a few minutes to spin up the machine and get the environment set up. Once this is completed you should be able to see the machine via the 'juju status' command.

<pre>
> juju status

2012-11-07 13:06:30,645 INFO Connecting to environment...
2012-11-07 13:06:42,313 INFO Connected to environment.
machines:
  0:
    agent-state: running
    dns-name: ec2-23-20-70-201.compute-1.amazonaws.com
    instance-id: i-d79492ab
    instance-state: running
services: {}
2012-11-07 13:06:42,408 INFO 'status' command finished successfully
</pre>

Now we have a bootstrap environment and we can tell it that all boxes should default to 't1.micro' unless otherwise specified. There are a number of settings that you can monkey with; take a look at the constraints doc for more details.

<pre>
> juju set-constraints instance-type=t1.micro
</pre>


h3. Make it Pretty!

For those who like to see a visual representation of what's happening, or just feel like letting someone else watch what's going on, Juju now has a GUI that you can use. While I wouldn't recommend using the GUI as a replacement for the command line to deploy the charms below, you can certainly use it to watch what's happening. For more mature charms (and in the future) this GUI should be more than capable of managing your resources. In any case, it's neat to have pretty pictures as you tapdance on the CLI.

If you would like to install the GUI feel free to grab my version of the 'juju-gui' charm (at the time of this article the main charm wasn't on quantal yet):

<pre>
> juju deploy cs:~pmcgarry/quantal/juju-gui
</pre>

Once that completes (and it could take a while for everything to download and install) you'll need to 'expose' it so you can get to it:

<pre>
> juju expose juju-gui
</pre>

This will give you the ability to access the box publicly via a web browser at the EC2 address shown in 'juju status'. The default user name and password are 'admin' and the 'admin-secret' value from your ~/.juju/environments.yaml file. Feel free to leave that up while you do the rest of this work to watch the magic happen.
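
If you don't have the admin-secret handy, a quick grep of the config file Juju generated will pull it back out:

<pre>
> grep admin-secret ~/.juju/environments.yaml
</pre>
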

h3. Prep for Ceph Deployment

Our Juju environment is now ready to start spinning up our Ceph cluster; we just need to do a little leg work so Juju has all the important details up-front. First we need to grab a few Ceph tools:

<pre>
> sudo apt-get install ceph-common && sudo apt-get install uuid
</pre>

We need to generate a UUID and an auth key for Ceph to use.

<pre>
> uuid
</pre>

Insert the output as the $fsid value below.

<pre>
> ceph-authtool /dev/stdout --name=$NAME --gen-key
</pre>

Insert the key it prints as the $monitor-secret value below.
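
If you would rather capture both values in one go, a small shell sketch along these lines works; the 'mon.' name and the awk field are assumptions about ceph-authtool's output format, so double-check them against what you actually see:

<pre>
# stash the fsid and monitor secret in variables for pasting into ceph.yaml
> FSID=$(uuid)
> MON_SECRET=$(ceph-authtool /dev/stdout --name=mon. --gen-key | awk '/key =/ {print $3}')
> echo "fsid: $FSID"; echo "monitor-secret: $MON_SECRET"
</pre>
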
Now we need to drop these (and a few other) values into a yaml file:

<pre>
> vi ceph.yaml

ceph:
    source: http://ceph.com/debian-bobtail/ quantal main
    fsid: d78ae656-7476-11e2-a532-1231390a9d4b
    monitor-secret: AQDcNRlR6MMZNRAAWw3iAobsJ1MLoFBLJYo4yg==

ceph-osd:
    source: http://ceph.com/debian-bobtail/ quantal main
    osd-devices: /dev/xvdf

ceph-radosgw:
    source: http://ceph.com/debian-bobtail/ quantal main
</pre>

You'll notice we're also passing a 'source' item to Juju; this tells the charm where to grab the appropriate packages for Ceph, in this case the "latest release":http://ceph.com/resources/downloads/ (Bobtail 0.56.3 when this was written) from Ceph.com.
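
If you ever want to track a different repository, that 'source' line is the only thing you need to change. For example (the debian-testing repo here is just an illustration; point it at whichever release you actually want):

<pre>
ceph:
    source: http://ceph.com/debian-testing/ quantal main
</pre>
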

h3. Tail Those Logs!

Since a good portion of this setup is experimental it's a good idea to tail the logs. Thankfully, Juju makes this extremely easy to do. Simply open a second terminal window, ssh to your client machine, and type:

<pre>
> juju debug-log
</pre>

This will aggregate all of the logs from your cluster into a single output for easy browsing in case something goes wrong.

h3. Deploying Ceph Monitors

Time to start deploying our Ceph cluster! In this case we're going to deploy the first three machines with ceph-mon (Ceph monitors), since we typically recommend at least three in order to reach a quorum. You'll want to wait until all three machines are up before moving on.

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph
</pre>

You'll notice that while these charms are in the charm store (cs:), they are off in my own user space. This is because I had to make a few tweaky changes for these charms to deploy happily on EC2 and use Bobtail and Quantal. These charms are still a bit new, so if you have tweaks or changes feel free to give me a shout, or play with the main Ceph charms on jujucharms.com. In the future you'll be able to deploy using just 'ceph' instead of anyone's user space.

<pre>
EXAMPLE: > juju deploy -n 3 --config ceph.yaml ceph
</pre>

This could take a while, so just keep checking 'juju status' until you have the machines running AND the agents set to 'started.' You should also see the debug-log go through a flurry of activity as things get close to the end.
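
Rather than re-running that by hand, you can let 'watch' poll it for you; a tiny convenience, not a requirement:

<pre>
> watch -n 30 juju status
</pre>
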
Once we have the monitors up and running you can take a look at what your deployment looks like. If you want to, you can even ssh into one of the machines using Juju's built-in ssh tool.

<pre>
> juju status

machines:
  0:
    agent-state: running
    dns-name: ec2-50-16-15-64.compute-1.amazonaws.com
    instance-id: i-2b45f657
    instance-state: running
  1:
    agent-state: running
    dns-name: ec2-50-19-23-167.compute-1.amazonaws.com
    instance-id: i-3b368547
    instance-state: running
  2:
    agent-state: running
    dns-name: ec2-107-22-128-107.compute-1.amazonaws.com
    instance-id: i-1f368563
    instance-state: running
  3:
    agent-state: running
    dns-name: ec2-174-129-51-96.compute-1.amazonaws.com
    instance-id: i-15368569
    instance-state: running
services:
  ceph:
    charm: cs:~pmcgarry/quantal/ceph-0
    relations:
      mon:
      - ceph
    units:
      ceph/0:
        agent-state: started
        machine: 1
        public-address: ec2-50-19-23-167.compute-1.amazonaws.com
      ceph/1:
        agent-state: started
        machine: 2
        public-address: ec2-107-22-128-107.compute-1.amazonaws.com
      ceph/2:
        agent-state: started
        machine: 3
        public-address: ec2-174-129-51-96.compute-1.amazonaws.com
</pre>

<pre>
> juju ssh ceph/0 sudo ceph -s

   health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e1: 0 osds: 0 up, 0 in
    pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
   mdsmap e1: 0/0/1 up
</pre>

From this status we can see that there are three monitors up ("monmap e2: 3 mons at {...}") and no OSDs ("osdmap e1: 0 osds: 0 up, 0 in"). Time to spin up some homes for those bits!

h3. Deploying OSDs

Once our monitors look healthy it's time to spin up some OSDs. Feel free to drop in as many as you please; for the purposes of this experiment I chose to spin up three.

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-osd
</pre>

That will take a little bit to complete, so you may want to go grab an infusion of caffeine at this point. One thing to keep in mind is that earlier in our ceph.yaml we defined the physical device for our OSDs as /dev/xvdf. If you are familiar with EC2 you will know that that device doesn't exist yet, so our OSD deploy command will spin up and configure the boxes, but we're not quite there yet.
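
If you want to see that for yourself, you can list the block devices on one of the OSD units before and after attaching the volumes (this assumes your first unit is named ceph-osd/0, as in the status output below):

<pre>
> juju ssh ceph-osd/0 "ls -l /dev/xvd*"
</pre>
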
When you get back, if you take a look at 'juju status' you should now see a bunch of new machines and a new section called ceph-osd:

<pre>
> juju status

  ceph-osd:
    charm: cs:~pmcgarry/quantal/ceph-osd-0
    relations: {}
    units:
      ceph-osd/0:
        agent-state: started
        machine: 4
        public-address: ec2-174-129-82-169.compute-1.amazonaws.com
      ceph-osd/1:
        agent-state: started
        machine: 5
        public-address: ec2-50-16-0-95.compute-1.amazonaws.com
      ceph-osd/2:
        agent-state: started
        machine: 6
        public-address: ec2-75-101-175-213.compute-1.amazonaws.com
</pre>

Now we need to actually give it the disks it needs. Via your EC2 console (or using the EC2 command line tools) you need to spin up three EBS volumes and attach one to each of your OSD machines. If you need help, there is a pretty decent, concise walkthrough at:
http://www.webmastersessions.com/how-to-attach-ebs-volume-to-amazon-ec2-instance
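
If you would rather stay on the command line, the classic EC2 API tools can do the same job. A rough sketch, where the size, availability zone, volume ID, and instance ID are all placeholders and your EC2 credentials are assumed to already be configured:

<pre>
> sudo apt-get install ec2-api-tools
# create a volume in the same availability zone as the OSD instance
> ec2-create-volume -s 10 -z us-east-1a
# vol-xxxxxxxx and i-xxxxxxxx stand in for the IDs you get back
> ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdf
</pre>

A volume attached as /dev/sdf will typically show up inside an Ubuntu instance as /dev/xvdf, which is why our ceph.yaml points at that device.
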
Once you have the volumes attached we need to tell Juju to go back and use them:

<pre>
> juju set ceph-osd "osd-devices=/dev/xvdf"
</pre>

This will trigger a rescan and get your OSDs functioning. All that's left now is to connect our monitor cluster with the new pool of OSDs.

<pre>
> juju add-relation ceph-osd ceph
</pre>

We can ssh into one of the Ceph boxes and take a look at our cluster now:

<pre>
> juju ssh ceph/0

> sudo ceph -s

   health HEALTH_OK
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e10: 3 osds: 3 up, 3 in
    pgmap v115: 208 pgs: 208 active+clean; 0 bytes data, 3102 MB used, 27584 MB / 30686 MB avail
   mdsmap e1: 0/0/1 up
</pre>

Congratulations, you now have a Ceph cluster! Feel free to write a few apps against it, show it to all of your friends, or just nuke it and start refining your chops for a production deployment.
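
Before you do anything drastic, a quick sanity check from one of the monitor boxes can be fun: push an object into the cluster with the rados tool and list it back. The pool and object names here are just examples (Bobtail creates a 'data' pool by default):

<pre>
> juju ssh ceph/0
> sudo rados -p data put test-object /etc/hosts
> sudo rados -p data ls
> sudo rados -p data rm test-object
</pre>
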

h3. Extra Credit

Since that Juju GUI screen looked so empty I decided I wanted to play a bit more with the tools at my disposal. If you would like to take this exercise a bit further, we can also add a few RADOS Gateway machines and load-balance them behind an haproxy machine. Doing so only takes a few more commands with Juju:

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-radosgw
> juju expose ceph-radosgw

> juju deploy cs:~pmcgarry/quantal/haproxy
> juju expose haproxy

> juju add-relation ceph-radosgw haproxy
</pre>

That should be it! You'll notice that I have my own copy of the haproxy charm; this is simply because it isn't technically released for quantal yet, but my (unmodified) version seems to run just fine.
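
For a quick smoke test of the gateway stack, point curl at whatever address 'juju status' reports for your haproxy unit (the hostname below is a placeholder). If everything is wired up correctly you should get a small chunk of S3-style XML back rather than a connection error; to go further, 'radosgw-admin user create' on one of the gateway boxes will mint real S3 credentials to test with.

<pre>
> curl http://ec2-your-haproxy-address.compute-1.amazonaws.com/
</pre>
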

h3. Troubleshooting

Juju actually makes troubleshooting and iterative development VERY easy (one of my favorite things about it). If you would like to delve deeper into playing with Juju I highly recommend reading their docs, which are quite good. However, one of the most useful tools (beyond the debug-log I mentioned earlier) is the ability to step through the hooks as Juju tries to run them. For example, let's say we tried to deploy Ceph and 'juju status' was telling us there was an 'install-error.' We could use our second terminal window to execute the following:

<pre>
> juju debug-hooks ceph/0
</pre>

This allows us to debug the execution of the hooks on a specific machine (in this case ceph/0). Now in our main window we can type:

<pre>
> juju resolved --retry ceph/0
</pre>

We get a preformatted setup in our 'debug-hooks' window with an indication at the bottom that we're on the "install" hook. From here we can change to the hooks directory and rerun the install hook:

<pre>
> cd hooks
> ./install
</pre>

From here we can troubleshoot errors on this box before going back and pushing a patch to Launchpad.net. I won't try to recreate the expansive documentation on the jujucharms site, but fiddling with Juju has been far less frustrating than some other orchestration frameworks I have poked at recently. Good luck, and happy charming!

h3. Cleaning Up

If you would like to close up shop you can either destroy just the services (if you want to keep the machines running for deploying other Juju tests):

<pre>
> juju destroy-service ceph
> juju destroy-service ceph-osd
> juju destroy-service ceph-radosgw
> juju destroy-service haproxy
</pre>

...or just drop some dynamite on the whole thing (this will kill everything but your client machine, including your bootstrap environment):

<pre>
> juju destroy-environment
</pre>

h3. Wrap Up

You are now a seasoned veteran of Ceph deployment; what more could you want? If you do have questions, comments, or anything for the good of the cause, we would love to hear about it. Currently the best way to get help or give feedback is in our "#Ceph irc channel":irc://irc.oftc.net:6667/ceph, but our "mailing lists":http://ceph.com/resources/mailing-list-irc/ are also pretty active. For Juju-specific feedback you can also hit up the "#Juju irc channel":irc://irc.freenode.net:6667/juju. If you see any egregious errors in this writeup or would like to know more about Ceph community plans, feel free to send email to patrick at inktank dot com.

<pre>
scuttlemonkey out
</pre>