h1. Create Versionable and Fault-Tolerant Storage Devices with Ceph and VirtualBox

{{toc}}

h3. Introducing Ceph

The cloud is becoming ubiquitous and more and more enterprises, applications and end-users are transitioning their data to cloud platforms. This has created a new challenge for cloud providers, both private and public: building fault-tolerant, high-performance infrastructure that supports big data storage and processing needs, yet is easy to scale and flexible enough to support multiple use cases.
That's where "Ceph":http://ceph.com/ comes in. Ceph is a distributed storage system designed to provide excellent performance, reliability and scalability. It's open source, so it's freely downloadable and usable, and it offers a unified storage interface that's versatile enough to support object storage, block device storage and file system storage all in the same Ceph cluster.
Ceph is highly reliable, with self-healing and self-managing characteristics, but setting it up can be a little complex, especially if you're new to scalable storage. While the "quickstart doc":http://ceph.com/docs/master/start/ is great, we also wanted to provide something a bit more procedural to get you started. That's where this tutorial comes in. Over the next few pages, I'll walk you through the process of building a simple Ceph storage cluster and adding data to it. We'll set up the cluster using VirtualBox, so you'll get a chance to see Ceph in action in a "real" environment where you have total control, but which doesn't cost you anything to run or scale out with new nodes.
Intrigued? Keep reading.

h3. Assumptions and Requirements

For this tutorial, I'll be using "VirtualBox":https://www.virtualbox.org/, which provides an easy way to set up independent virtual servers, with "CentOS":http://www.centos.org/ as the operating system for the virtual servers. VirtualBox is available for Windows, Linux, Macintosh, and Solaris hosts. I'll make the following assumptions:
* You have a working knowledge of CentOS, VirtualBox and VirtualBox networking.
* You have downloaded and installed the latest version of VirtualBox.
* You have either already configured 5 virtual CentOS servers, or you have downloaded an ISO installation image for the latest version of CentOS (CentOS 7.0 at the time of writing). These servers must be running kernel version 3.10 or later.
* You're familiar with installing software using yum, the CentOS package manager.
* You're familiar with SSH-based authentication.

If you're not familiar with any of the above topics, look in the “Read More” section at the end of this tutorial, which has links to relevant guides.
To set up a Ceph storage cluster with VirtualBox, here are the steps you'll follow:
# Create cluster nodes
# Install the Ceph deployment toolkit
# Configure authentication between cluster nodes
# Configure and activate a cluster monitor
# Prepare and activate OSDs
# Verify cluster health
# Test the cluster
# Connect a Ceph block device to the cluster

The next sections will walk you through these steps in detail.

h3. Step 1: Create Cluster Nodes

If you already have 5 virtual CentOS servers configured and talking to each other, you can skip this step. If not, you must first create the virtual servers that will make up your Ceph cluster. To do this:
1. Launch VirtualBox and use the _Machine -> New_ menu to create a new virtual server.

!image1.jpg!

2. Keeping in mind that you will need 5 virtual servers running simultaneously, calculate the available RAM on the host system and set the server memory accordingly.

!image2.jpg!

3. Add a virtual hard drive of at least 10 GB.

!image3.jpg!

4. Ensure that you have an IDE controller with a virtual CD/DVD drive (to enable CentOS installation) and at least two network adapters: one NAT adapter (to enable download of required software) and one bridged or internal network adapter (for internal communication between the cluster nodes).
5. Once the server basics are defined, install CentOS on the server using the ISO installation image. Ensure that the installed kernel version is 3.10 or later (you can verify this after configuring the network, as shown below).
6. Once the installation process is complete, log in to the server and configure the second network interface with a static IP address by editing the appropriate template file in the _/etc/sysconfig/network-scripts/_ directory. Here's a sample of what the interface configuration might look like:

p(. @HWADDR=08:00:27:AE:14:41
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp0s8
UUID=5fc74119-1ab2-4c0c-9aa1-284fd484e6c6
ONBOOT=no
IPADDR=192.168.1.25
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=192.168.1.1
DNS2=8.8.8.8@
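
Note that with @ONBOOT=no@, as in the sample above, the interface won't be brought up automatically at boot (set it to @yes@ if you want that). Either way, you can bring the interface up and confirm the address and kernel version (see step 5) with commands along these lines, assuming the interface is named @enp0s8@ as in the sample:
@shell> sudo ifup enp0s8
shell> ip addr show enp0s8
shell> uname -r@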

Should any of the above steps be unfamiliar to you, refer to the "VirtualBox manual":https://www.virtualbox.org/manual/UserManual.html, especially the "VirtualBox networking guide":https://www.virtualbox.org/manual/ch06.html, and to the networking section of the CentOS documentation.

Repeat this process until you have 5 virtual servers. Of these, identify one as the cluster administration node and assign it the hostname _admin-node_. The remaining servers may be identified with hostnames such as _node1_, _node2_, and so on. Here's an example of what the final cluster might look like (note that you should modify the IP addresses to match your local network settings).

|*Server host name*|*IP address*|*Purpose*|
|admin-node|192.168.1.25|Administration node for cluster|
|node1|192.168.1.26|Monitor|
|node2|192.168.1.27|OSD daemon|
|node3|192.168.1.28|OSD daemon|
|node4|192.168.1.29|Client (block device)|

Before proceeding to the next step, ensure that all the servers are accessible by pinging them using their host names. If you don't have a local DNS server, add the host names and IP addresses to each server's _/etc/hosts_ file to ease network access.
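For example, using the addresses from the table above, you might append the following entries to _/etc/hosts_ on every node (adjust them to your own network):
@192.168.1.25    admin-node
192.168.1.26    node1
192.168.1.27    node2
192.168.1.28    node3
192.168.1.29    node4@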

h3. Step 2: Install the Ceph Deployment Toolkit

The next step is to install the Ceph deployment toolkit on the administration node. This toolkit will help install Ceph on the nodes in the cluster, as well as prepare and activate the cluster.
1. Log in to the administration node as the _root_ user.
2. Add the Ceph package repository to yum by creating a new file at _/etc/yum.repos.d/ceph.repo_ with the following content:
@[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm-firefly/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=b...ys/release.asc@
3. Refresh the repository metadata and update installed packages.
@shell> yum update@
4. Install the Ceph deployment toolkit.
@shell> yum install ceph-deploy@
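To confirm the toolkit was installed correctly, check that the @ceph-deploy@ command is available (running @ceph-deploy --help@ works just as well); the version reported will depend on the repository you configured:
@shell> ceph-deploy --version@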

!image4.jpg!

h3. Step 3: Configure Authentication between Cluster Nodes

Now, you need to create a _ceph_ user on each server in the cluster, including the administration node. This account will be used to perform cluster-related operations on each node. Perform the following steps on each of the 5 virtual servers:
1. Log in as the _root_ user.
2. Create a _ceph_ user account.
@shell> useradd ceph
shell> passwd ceph@
3. Give the _ceph_ user account root privileges with _sudo_.
@shell> echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph
shell> chmod 0440 /etc/sudoers.d/ceph@
4. Disable 'requiretty' for the _ceph_ user.
@shell> sudo visudo@
5. In the resulting file, locate the line containing
@Defaults requiretty@
and change it to read
@Defaults:ceph !requiretty@
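To quickly confirm the new account has passwordless _sudo_ rights, log in as _ceph_ and run a harmless command with _sudo_; it should print @root@ without prompting for a password:
@shell> sudo whoami@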

Now, set up passphraseless SSH between the nodes:
1. Log in to the administration node as the _ceph_ user.
2. Generate an SSH key for the administration node.
@shell> ssh-keygen@

!image5.jpg!

3. Copy the generated public key to the _ceph_ user account of all the nodes in the cluster.
@shell> ssh-copy-id ceph@node1
shell> ssh-copy-id ceph@node2
shell> ssh-copy-id ceph@node3
shell> ssh-copy-id ceph@node4
shell> ssh-copy-id ceph@admin-node@

!image6.jpg!

4. Test that the _ceph_ user on the administration node can log in to any other node as _ceph_ using SSH without providing a password.
@shell> ssh ceph@node1@

!image7.jpg!

Modify the administration node's SSH configuration so that you can easily log in to each node as the _ceph_ user. Create the _/home/ceph/.ssh/config_ file with the following lines:
@Host node1
  Hostname node1
  User ceph
Host node2
  Hostname node2
  User ceph
Host node3
  Hostname node3
  User ceph
Host node4
  Hostname node4
  User ceph
Host admin-node
  Hostname admin-node
  User ceph@
Change the permissions of the _/home/ceph/.ssh/config_ file.
@shell> chmod 0400 ~/.ssh/config@
Test that the _ceph_ user on the administration node can now log in to any other node using SSH without providing a password or username.
@shell> ssh node1@

!image8.jpg!

Finally, create a directory on the administration node to store cluster information, such as configuration files and keyrings.
@shell> mkdir my-cluster
shell> cd my-cluster@
You're now ready to begin preparing and activating the cluster!

h3. Step 4: Configure and Activate a Cluster Monitor

A Ceph storage cluster consists of two types of daemons:
* Monitors maintain copies of the cluster map
* Object Storage Daemons (OSDs) store data as objects on storage nodes

Apart from these, other actors in a Ceph storage cluster include metadata servers and clients such as Ceph block devices, Ceph object gateways or Ceph filesystems. "Read more about Ceph's architecture.":http://ceph.com/docs/master/architecture/

All the commands in this and subsequent sections are to be run when logged in as the _ceph_ user on the administration node, from the _my-cluster/_ directory. Ensure that you are directly logged in as _ceph_ and are not using _root_ with _su - ceph_.

A minimal system will have at least one monitor and two OSD daemons for data replication.
1. Begin by setting up a Ceph monitor on _node1_ with the Ceph deployment toolkit.
@shell> ceph-deploy new node1@
This will define the name of the initial monitor node and create a default Ceph configuration file and monitor keyring in the current directory.

!image9.jpg!

2. Change the number of replicas in the Ceph configuration file at _/home/ceph/my-cluster/ceph.conf_ from 3 to 2 so that Ceph can achieve a stable state with just two OSDs. Add the following lines in the [global] section:
@osd pool default size = 2
osd pool default min size = 2@
3. In the same file, set the OSD journal size (the value is specified in MB). A good general setting is 10 GB; however, since this is a simulation, you can use a smaller value such as 4 GB. Add the following line in the [global] section:
@osd journal size = 4000@
4. In the same file, set the default number of placement groups for a pool. Since we'll have fewer than 5 OSDs, 128 placement groups per pool should suffice. Add the following line in the [global] section:
@osd pool default pg num = 128@
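Once these settings have been added, the [global] section of _ceph.conf_ should look roughly like the following. The @fsid@, initial monitor name and monitor address are generated by @ceph-deploy new@, so the exact values in your file will differ (other generated settings, such as authentication options, are omitted here); the address below simply matches this tutorial's addresses.
@[global]
fsid = a7f64266-0894-4f1e-a635-d0aeaca0e993
mon initial members = node1
mon host = 192.168.1.26
osd pool default size = 2
osd pool default min size = 2
osd journal size = 4000
osd pool default pg num = 128@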
5. Install Ceph on each node in the cluster, including the administration node.
@shell> ceph-deploy install admin-node node1 node2 node3 node4@
The Ceph deployment toolkit will now go to work installing Ceph on each node. Here's an example of what you will see during the installation process.

!image10.jpg!

6. Create the Ceph monitor on _node1_ and gather the initial keys.
@shell> ceph-deploy mon create-initial node1@

!image11.jpg!

h3. Step 5: Prepare and Activate OSDs

The next step is to prepare and activate the Ceph OSDs. We'll need a minimum of 2 OSDs, and these should be set up on _node2_ and _node3_, as it's not recommended to mix monitors and OSD daemons on the same host. To begin, set up an OSD on _node2_ as follows:
Log in to _node2_ as the _ceph_ user.
@shell> ssh node2@
Create a directory for the OSD daemon.
@shell> sudo mkdir /var/local/osd@
Log out of _node2_. Then, from the administration node, prepare and activate the OSD.
@shell> ceph-deploy osd prepare node2:/var/local/osd@

!image12.jpg!

@shell> ceph-deploy osd activate node2:/var/local/osd@

!image13.jpg!

Repeat the above steps for _node3_.
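For convenience, the equivalent commands for _node3_ are:
@shell> ssh node3
shell> sudo mkdir /var/local/osd
shell> exit
shell> ceph-deploy osd prepare node3:/var/local/osd
shell> ceph-deploy osd activate node3:/var/local/osd@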
At this point, the OSD daemons have been created and the storage cluster is ready.

h3. Step 6: Verify Cluster Health

Copy the configuration file and admin keyring from the administration node to all the nodes in the cluster.
@shell> ceph-deploy admin admin-node node1 node2 node3 node4@

!image14.jpg!

Log in to each node as the _ceph_ user and change the permissions of the admin keyring.
@shell> ssh node1
shell> sudo chmod +r /etc/ceph/ceph.client.admin.keyring@
You should now be able to check cluster health from any node in the cluster with the _ceph status_ command. Ideally, you want to see the status _active + clean_, as that indicates the cluster is operating normally.
@shell> ceph status@
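A few other commands are also handy for inspecting the cluster at this point, for example checking the overall health summary and the OSD tree:
@shell> ceph health
shell> ceph osd tree@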

!image15.jpg!

h3. Step 7: Test the Cluster

You can now perform a simple test to see the distributed Ceph storage cluster in action, by writing a file on one node and retrieving it on another:
1. Log in to _node1_ as the _ceph_ user.
@shell> ssh node1@
2. Create a new file with some dummy data.
@shell> echo "Hello world" > /tmp/hello.txt@
3. Data is stored in Ceph within storage pools, which are logical groups in which to organize your data. By default, a Ceph storage cluster has 3 pools - _data, metadata_ and _rbd_ - and it's also possible to create your own custom pools. In this case, copy the file to the _data_ pool with the _rados put_ command and assign it a name.
@shell> rados put hello-object /tmp/hello.txt --pool data@
To verify that the Ceph storage cluster stored the object:
1. Log in to _node2_ as the _ceph_ user.
2. Check that the object exists in the cluster's _data_ storage pool with the _rados ls_ command.
@shell> rados ls --pool data@
3. Copy the object out of the storage cluster to a local file with the _rados get_ command and verify its contents.
@shell> rados get hello-object /tmp/hello.txt --pool data
shell> cat /tmp/hello.txt@
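If you'd like to clean up after the test, you can remove the object from the pool again:
@shell> rados rm hello-object --pool data@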

!image16.jpg!

h3. Step 8: Connect a Ceph Block Device to the Cluster

Now that the cluster is operating, it's time to do something with it. Ceph storage clusters can be accessed by three types of clients: Ceph block devices, Ceph object gateways and the Ceph filesystem (CephFS). The simplest to demonstrate is the RADOS Block Device (RBD), so in this step you'll create a virtual block device client on _node4_, associate it with a storage pool and then read and write data to it.
1. Log in to _node4_ as the _ceph_ user.
@shell> ssh node4@
2. Create a pool named _work_. When creating the pool, specify the number of placement groups, which are the shards or fragments that the pool is divided into. Placement groups are mapped to OSDs, and a larger number of placement groups (such as 100 per OSD) leads to better balancing.
@shell> ceph osd pool create work 100 100@
3. Create a RADOS Block Device image with _rbd_ and associate it with the pool.
@shell> rbd create image01 --size 1024 --pool work@
4. Map the image to an actual block device.
@shell> sudo rbd map image01 --pool work --name client.admin@
5. Create a filesystem on the block device and mount it.
@shell> sudo /sbin/mkfs.ext4 -m0 /dev/rbd/work/image01
shell> sudo mkdir /mnt/ceph-block-device
shell> sudo mount /dev/rbd/work/image01 /mnt/ceph-block-device@
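To confirm that the image was mapped and mounted correctly, you can list the mapped RBD devices and check the mounted filesystem (the mapping typically also appears as a device such as _/dev/rbd0_):
@shell> rbd showmapped
shell> df -h /mnt/ceph-block-device@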

!image17.jpg!

At this point, your Ceph block device is mounted and ready for use. You can write data to it as with any other block device, and your data will be automatically stored in the cluster (with all its resiliency and scalability benefits). Plus, you get a bunch of cool features, such as the ability to create device snapshots, so you can easily roll back to a previous image of the device.
To demonstrate this:
1. Navigate to where you mounted the block device and create a text file. You might need to first change the permissions of the mount point so that you can write to it.
@shell> cd /mnt/ceph-block-device
shell> vi today.txt@
2. Add the word 'Monday' to the file.
3. Then, take a snapshot of the image.
@shell> rbd --pool work snap create --snap monday image01@
4. Edit the file again and this time change the contents to 'Friday'.
5. Unmount the block device. Then, roll back to the previous snapshot and mount it again.
@shell> sudo umount /mnt/ceph-block-device
shell> rbd --pool work snap rollback --snap monday image01
shell> sudo mount /dev/rbd/work/image01 /mnt/ceph-block-device@
When you inspect the contents of the file, you will see the original contents restored.
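For example, reading the file again should show 'Monday' rather than 'Friday':
@shell> cat /mnt/ceph-block-device/today.txt@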

You must unmount the block device before doing a rollback. If it's still mounted when you roll back, the client will have a stale cache, which may cause filesystem corruption, since it's like a hard drive being written to by two machines at once.

h3. Conclusion

Just as you can use Ceph block devices, so too can you use the "Ceph Object Gateway":http://ceph.com/docs/master/radosgw/ to create Amazon S3-style storage buckets accessible via REST, or "CephFS":http://ceph.com/docs/master/cephfs/ as a POSIX-compliant, scalable, fault-tolerant network filesystem. Setting these up is beyond the scope of this beginner tutorial, but since they both use the Ceph storage cluster that you've already configured, it won't take long for you to get them running. Review the "instructions for the Ceph Object Gateway":http://ceph.com/docs/master/start/quick-rgw/ and "instructions for CephFS":http://ceph.com/docs/master/start/quick-cephfs/.
As this tutorial has illustrated, Ceph is a powerful solution for creating resilient, highly scalable storage. The simple storage cluster you created here with VirtualBox is just the tip of the iceberg: as you transition the cluster to your network or the cloud and add more nodes, you'll benefit from improved performance and flexibility without any loss in reliability and security. What more could you ask for?

h3. Read More

* "Introduction to Ceph":http://ceph.com/docs/master/start/intro/
* "Ceph Architecture":http://ceph.com/docs/master/architecture/
* "Ceph Storage Cluster Quick Start":http://ceph.com/docs/master/rados/
* "Getting Started With Ceph":http://www.inktank.com/resource/getting-started-with-ceph-miroslav-klivansky/
* "Introduction to Ceph & OpenStack":http://www.inktank.com/resource/introduction-to-ceph-openstack-miroslav-klivansky/
* "Managing A Distributed Storage System At Scale":http://www.inktank.com/resource/managing-a-distributed-storage-system-at-scale-sage-weil/
* "Scaling Storage With Ceph":http://www.inktank.com/resource/scaling-storage-with-ceph-ross-turk/
* "Ceph API Documentation":http://ceph.com/docs/master/api/