h1. Deploying Ceph with Juju

The last few weeks have been very exciting for Inktank and Ceph. There have been a number of community examples of how people are deploying or using Ceph in the wild, from the "ComodIT orchestration example":http://ceph.com/community/deploying-ceph-with-comodit/ to the unique approach of "Synnefo delivering unified storage":http://ceph.com/community/ceph-comes-to-synnefo-and-ganeti/ with Ceph, and many others that haven't made it to the blog yet. It is a great time to be doing things with Ceph!

We at Inktank have been just as excited as anyone in the community and have been playing with a number of deployment and orchestration tools. Today I wanted to share an experiment of my own for the general consumption of the community: deploying Ceph with Canonical's relatively new deployment tool "Juju":https://jujucharms.com/ that is taking cloud deployments by storm. If you follow this guide to the end you should end up with something that looks like this:

!{width: 10%}2013-02-19_15-50-45.png!

Juju is a "next generation service deployment and orchestration framework":https://jujucharms.com/docs/faq/. The cool part about Juju is that you can use just about anything to build your Juju "charms" (recipes), from bash and your favorite scripting language all the way up to Chef and Puppet. A good portion of the knowledge for the Ceph charms developed by Clint Byrum and James Page actually came from both the Chef cookbooks and the work on ceph-deploy, which we'll cover in later installments.
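
To give a sense of how small a charm can be, here is a minimal, purely illustrative bash-only layout (a hypothetical toy, not the actual Ceph charm): a metadata.yaml describing the service plus an executable install hook.

<pre>
# hypothetical layout of a minimal bash-only charm (illustration only)
mycharm/
  metadata.yaml   # charm name, summary, and the relations it provides
  hooks/install   # executable hook Juju runs when a unit is deployed

# hooks/install could be as simple as:
#!/bin/bash
set -e
apt-get update
apt-get install -y some-package
</pre>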

For the purposes of this experiment I decided to build the environment using Amazon's EC2, but you can also use an OpenStack deployment or your own bare metal in conjunction with Canonical's "MAAS":https://maas.ubuntu.com/ product. The client machine used to spin up the bootstrap environment, and later all of the other servers, will be an Ubuntu Quantal (12.10) image, but it could be any Ubuntu box, including your laptop. The rest of the working machines will be spun up using Quantal as well.

Juju is very generous about spinning up new boxes (typically one per service), so I chose the 't1.micro' machine size for all of my boxes so that anyone playing with this guide wouldn't incur massive EC2 charges. Now, on to the meat!

h3. Getting Started

As I said, start the process by spinning up an Ubuntu 12.10 image as your client; this way you don't have to dump a bunch of software and config on your local machine. This will be the client you use to spin everything else up. Once you have your base Ubuntu install, let's add the PPA and install Juju.

<pre>
> sudo apt-add-repository ppa:juju/pkgs
> sudo apt-get update && sudo apt-get install juju
</pre>

Now that we have Juju installed we need to tell it to generate a config file.

<pre>
> juju bootstrap
</pre>

This first run will throw an error, but it creates ~/.juju/environments.yaml for you to edit. Since we're using EC2 we need to tell Juju about our credentials so it can spin up new machines and deploy new services. You'll notice that I'm using a default-series of 'quantal' for all of my node machines. This is important, since it tells Juju where and how to grab the important bits of each charm.

<pre>
> vi ~/.juju/environments.yaml

default: cephtest
environments:
  cephtest:
    type: ec2
    access-key: YOUR-ACCESS-KEY-GOES-HERE
    secret-key: YOUR-SECRET-KEY-GOES-HERE
    control-bucket: (generated by juju)
    admin-secret: (generated by juju)
    default-series: quantal
    juju-origin: ppa
    ssl-hostname-verification: true
</pre>

h3. Setting up the Bootstrap Environment

Now that Juju can interact with EC2 directly we need to get a bootstrap environment set up that will hold our configs and deploy our services. Since I can’t set the global configs yet, I need to tell it manually that this box needs to be a 't1.micro' instance.

<pre>
> juju bootstrap --constraints "instance-type=t1.micro"
</pre>

This will take a few minutes to spin up the machine and get the environment set up. Once this is completed you should be able to see the machine via the 'juju status' command.

<pre>
> juju status

2012-11-07 13:06:30,645 INFO Connecting to environment...
2012-11-07 13:06:42,313 INFO Connected to environment.
machines:
  0:
    agent-state: running
    dns-name: ec2-23-20-70-201.compute-1.amazonaws.com
    instance-id: i-d79492ab
    instance-state: running
services: {}
2012-11-07 13:06:42,408 INFO 'status' command finished successfully
</pre>

Now we have a bootstrap environment, and we can tell it that all boxes should default to 't1.micro' unless otherwise specified. There are a number of settings you can monkey with; take a look at the constraints doc for more details.

<pre>
> juju set-constraints instance-type=t1.micro
</pre>

h3. Make it Pretty!

For those who like to see a visual representation of what's happening, or just feel like letting someone else watch what's going on, Juju now has a GUI that you can use. While I wouldn't recommend using the GUI as a replacement for the command line to deploy the charms below, you can certainly use it to watch what's happening. For more mature charms (and in the future) this GUI should be more than capable of managing your resources. In any case, it's neat to have pretty pictures as you tapdance on the CLI.

If you would like to install the GUI, feel free to grab my version of the 'juju-gui' charm (at the time of this article the main charm wasn't on quantal yet):

<pre>
> juju deploy cs:~pmcgarry/quantal/juju-gui
</pre>

Once that completes (and it could take a while for everything to download and install) you'll need to 'expose' it so you can get to it:

<pre>
> juju expose juju-gui
</pre>
82
83
This will give you the ability to access the box publicly via a web browser at the ec2 address shown in 'juju status'. The detault user name and password are 'admin' and the 'admin-secret' value from your ~/.juju/environments.yaml file. Feel free to leave that up while you do the rest of this work to watch the magic happen.
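
If you don't have that value handy, it is just a field in the environments file, so a quick grep will pull it out:

<pre>
> grep admin-secret ~/.juju/environments.yaml
</pre>
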
h3. Prep for Ceph Deployment

Our Juju environment is now ready to start spinning up our Ceph cluster; we just need to do a little leg work so Juju has all the important details up front. First we need to grab a few Ceph tools:

<pre>
> sudo apt-get install ceph-common && sudo apt-get install uuid
</pre>

We need to generate a UUID and an auth key for Ceph to use.

<pre>
> uuid
</pre>

Insert this as the fsid value below.

<pre>
> ceph-authtool /dev/stdout --name=$NAME --gen-key
</pre>

Insert this as the monitor-secret value below.
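
If you'd rather not copy and paste by hand, here is a minimal sketch that captures both values into shell variables (it assumes ceph-authtool prints the key in the form 'key = <secret>' and uses 'mon.' as the entity name):

<pre>
> FSID=$(uuid)
> MON_SECRET=$(ceph-authtool /dev/stdout --name=mon. --gen-key | awk '/key/ {print $NF}')
> echo "fsid: $FSID"
> echo "monitor-secret: $MON_SECRET"
</pre>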

Now we need to drop these (and a few other) values into our yaml file:

<pre>
> vi ceph.yaml

ceph:
    source: http://ceph.com/debian-bobtail/ quantal main
    fsid: d78ae656-7476-11e2-a532-1231390a9d4b
    monitor-secret: AQDcNRlR6MMZNRAAWw3iAobsJ1MLoFBLJYo4yg==

ceph-osd:
    source: http://ceph.com/debian-bobtail/ quantal main
    osd-devices: /dev/xvdf

ceph-radosgw:
    source: http://ceph.com/debian-bobtail/ quantal main
</pre>

You'll notice we're also passing a 'source' item to Juju; this tells the charm where to grab the appropriate code for Ceph, in this case the "latest release":http://ceph.com/resources/downloads/ (Bobtail 0.56.3 when this was written) from Ceph.com.

h3. Tail Those Logs!

Since a good portion of this setup is experimental, it's a good idea to tail the logs. Thankfully, Juju makes this extremely easy for you to do. Simply open a second terminal window, ssh to your client machine, and type:

<pre>
> juju debug-log
</pre>

This will aggregate all of the logs from your cluster into a single output for easy browsing in case something goes wrong.


h3. Deploying Ceph Monitors


Time to start deploying our Ceph cluster! In this case we’re going to deploy the first three machines with ceph-mon (Ceph monitors) since we typically recommend at least three in order to reach a quorum. You'll want to wait until all three machines are up before moving on.

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph
</pre>

You'll notice that while these charms are in the charm store (cs:) they are off in my own user space. This is because I had to make a few tweaky changes for these charms to deploy happily on EC2 and use Bobtail and Quantal. These charms are still a bit new, so if you have tweaks or changes feel free to give me a shout, or play with the main Ceph charms on jujucharms.com. In the future you'll be able to deploy using just 'ceph' instead of anyone's user space.

<pre>
EXAMPLE: > juju deploy -n 3 --config ceph.yaml ceph
</pre>

This could take a while, so just keep checking 'juju status' until you have the machines running AND the agents set to 'started.' You should also see the debug-log go through a flurry of activity when it starts getting close to the end.
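
If you'd rather not keep re-running that by hand, a simple watch loop (just a shell convenience, not a Juju feature) works nicely:

<pre>
> watch -n 30 juju status
</pre>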

Once we have the monitors up and running you can take a look at what your deployment looks like. If you want to, you can even ssh into one of the machines using Juju’s built-in ssh tool.

<pre>
> juju status

machines:
  0:
    agent-state: running
    dns-name: ec2-50-16-15-64.compute-1.amazonaws.com
    instance-id: i-2b45f657
    instance-state: running
  1:
    agent-state: running
    dns-name: ec2-50-19-23-167.compute-1.amazonaws.com
    instance-id: i-3b368547
    instance-state: running
  2:
    agent-state: running
    dns-name: ec2-107-22-128-107.compute-1.amazonaws.com
    instance-id: i-1f368563
    instance-state: running
  3:
    agent-state: running
    dns-name: ec2-174-129-51-96.compute-1.amazonaws.com
    instance-id: i-15368569
    instance-state: running
services:
  ceph:
    charm: cs:~pmcgarry/quantal/ceph-0
    relations:
      mon:
      - ceph
    units:
      ceph/0:
        agent-state: started
        machine: 1
        public-address: ec2-50-19-23-167.compute-1.amazonaws.com
      ceph/1:
        agent-state: started
        machine: 2
        public-address: ec2-107-22-128-107.compute-1.amazonaws.com
      ceph/2:
        agent-state: started
        machine: 3
        public-address: ec2-174-129-51-96.compute-1.amazonaws.com
</pre>

<pre>
> juju ssh ceph/0 sudo ceph -s

   health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e1: 0 osds: 0 up, 0 in
    pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
   mdsmap e1: 0/0/1 up
</pre>

From this status we can see that there are three monitors up ("monmap e2: 3 mons at {...}") and no OSDs ("osdmap e1: 0 osds: 0 up, 0 in"). Time to spin up some homes for those bits!

h3. Deploying OSDs

Once our monitors look healthy it’s time to spin up some OSDs. Feel free to drop in as many as you please; for the purposes of this experiment I chose to spin up three.

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-osd
</pre>

That will take a little bit to complete, so you may want to go grab an infusion of caffeine at this point. One thing to keep in mind is that earlier, in our ceph.yaml, we defined the physical device for our OSDs as /dev/xvdf. If you are familiar with EC2 you will know that that device doesn't exist yet, so our OSD deploy command will spin up and configure boxes, but we're not quite there yet.

When you get back, if you take a look with 'juju status' you should now see a bunch of new machines and a new section called ceph-osd:

<pre>
> juju status

  ceph-osd:
    charm: cs:~pmcgarry/quantal/ceph-osd-0
    relations: {}
    units:
      ceph-osd/0:
        agent-state: started
        machine: 4
        public-address: ec2-174-129-82-169.compute-1.amazonaws.com
      ceph-osd/1:
        agent-state: started
        machine: 5
        public-address: ec2-50-16-0-95.compute-1.amazonaws.com
      ceph-osd/2:
        agent-state: started
        machine: 6
        public-address: ec2-75-101-175-213.compute-1.amazonaws.com
</pre>

Now we need to actually give it the disks it needs. Via your EC2 console (or using the EC2 command line tools) you need to spin up three EBS volumes and attach one to each of your OSD machines. If you need help, there is a pretty decent, concise walkthrough at:
http://www.webmastersessions.com/how-to-attach-ebs-volume-to-amazon-ec2-instance
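
If you prefer to stay on the command line, a rough sketch with the classic ec2-api-tools looks something like this (the size, availability zone, and volume/instance IDs are placeholders; a volume attached as /dev/sdf typically shows up inside a recent Ubuntu instance as /dev/xvdf):

<pre>
# create one volume per OSD node, in the same availability zone as the instances
> ec2-create-volume -s 10 -z us-east-1a
# attach a volume to each OSD instance (repeat for each node)
> ec2-attach-volume vol-xxxxxxxx -i i-xxxxxxxx -d /dev/sdf
</pre>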

Once you have the volumes attached we need to tell Juju to go back and use them:

<pre>
> juju set ceph-osd "osd-devices=/dev/xvdf"
</pre>

This will trigger a rescan and get your OSDs functioning. All that’s left now is to connect our monitor cluster with the new pool of OSDs.

<pre>
> juju add-relation ceph-osd ceph
</pre>

We can ssh into one of the Ceph boxes and take a look at our cluster now:

<pre>
> juju ssh ceph/0
> sudo ceph -s

   health HEALTH_OK
   monmap e2: 3 mons at {ceph232118103=10.243.121.227:6789/0,ceph501969115=10.245.210.114:6789/0,ceph5423414494=10.245.89.32:6789/0}, election epoch 6, quorum 0,1,2 ceph232118103,ceph501969115,ceph5423414494
   osdmap e10: 3 osds: 3 up, 3 in
    pgmap v115: 208 pgs: 208 active+clean; 0 bytes data, 3102 MB used, 27584 MB / 30686 MB avail
   mdsmap e1: 0/0/1 up
</pre>

Congratulations, you now have a Ceph cluster! Feel free to write a few apps against it, show it to all of your friends, or just nuke it and start refining your chops for a production deployment.
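
If you want a quick smoke test before showing it off, you can push an object into one of the default pools straight from a monitor node (a minimal sketch; 'data' was one of the default pools in Bobtail):

<pre>
> juju ssh ceph/0
> echo "hello ceph" > hello.txt
> sudo rados -p data put hello-object hello.txt
> sudo rados -p data ls
> sudo rados df
</pre>
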
h3. Extra Credit

Since that Juju GUI screen looked so empty, I decided I wanted to play a bit more with the tools at my disposal. If you would like to take this exercise a bit further, we can also add a few RADOS Gateway machines and load-balance them behind an haproxy machine. Doing this takes only a few more commands with Juju:

<pre>
> juju deploy -n 3 --config ceph.yaml cs:~pmcgarry/quantal/ceph-radosgw
> juju expose ceph-radosgw

> juju deploy cs:~pmcgarry/quantal/haproxy
> juju expose haproxy

> juju add-relation ceph-radosgw haproxy
</pre>

That should be it! You'll notice that I have my own copy of the haproxy charm; this is simply because it isn't technically released for quantal yet, but my (unmodified) version seems to run just fine.
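
To confirm the gateway is answering through the proxy, point curl at the haproxy unit's public address from 'juju status'; you should get an HTTP response with an S3-style XML body back from radosgw (the exact status and body will depend on how the gateway is configured):

<pre>
> curl -i http://ec2-your-haproxy-address.compute-1.amazonaws.com/
</pre>
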
h3. Troubleshooting

Juju actually makes troubleshooting and iterative development VERY easy (one of my favorite things about it). If you would like to delve deeper into playing with Juju, I highly recommend reading their docs, which are quite good. However, one of the most useful tools (beyond the debug-log I mentioned earlier) is the ability to step through the hooks as Juju tries to run them. For example, let's say we tried to deploy Ceph and 'juju status' was telling us there was an 'install-error.' We could use our second terminal window to execute the following:

<pre>
> juju debug-hooks ceph/0
</pre>

This allows us to debug the execution of the hooks on a specific machine (in this case ceph/0). Now in our main window we can type:

<pre>
> juju resolved --retry ceph/0
</pre>

We get a preformatted setup in our 'debug-hooks' window with an indication at the bottom that we're on the "install" hook. From here we can change to the hooks directory and rerun the install hook:

<pre>
> cd hooks
> ./install
</pre>

From here we can troubleshoot errors on this box before going back and pushing a patch to Launchpad.net. I won't try to recreate the expansive documentation on the jujucharms site, but fiddling with Juju has been far less frustrating than some other orchestration frameworks I have poked at recently. Good luck, and happy charming!
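
For reference, getting a fix back up to Launchpad is ordinary bzr; a rough sketch (the branch names here are purely illustrative) looks like this:

<pre>
> bzr branch lp:~pmcgarry/charms/quantal/ceph/trunk ceph-charm
> cd ceph-charm
# edit the offending hook, then:
> bzr commit -m "Fix install hook on EC2"
> bzr push lp:~youruser/charms/quantal/ceph/fixes
</pre>
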
h3. Cleaning Up

If you would like to close up shop you can either destroy just the services (if you want to keep the machines running for deploying other Juju tests):

<pre>
> juju destroy-service ceph
> juju destroy-service ceph-osd
> juju destroy-service ceph-radosgw
> juju destroy-service haproxy
</pre>

...or just drop some dynamite on the whole thing (this will kill everything but your client machine, including your bootstrap environment):

<pre>
> juju destroy-environment
</pre>

h3. Wrap Up

You are now a seasoned veteran of Ceph deployment; what more could you want? If you do have questions, comments, or anything for the good of the cause, we would love to hear about it. Currently the best way to get help or give feedback is in our "#Ceph irc channel":irc://irc.oftc.net:6667/ceph but our "mailing lists":http://ceph.com/resources/mailing-list-irc/ are also pretty active. For Juju-specific feedback you can also hit up the "#Juju irc channel":irc://irc.freenode.net:6667/juju. If you see any egregious errors in this writeup or would like to know more about Ceph community plans, feel free to send email to patrick at inktank dot com.

<pre>
scuttlemonkey out
</pre>