Jessica Mack, 07/03/2015 10:31 PM
h1. Create Versionable and Fault-Tolerant Storage Devices with Ceph and VirtualBox

{{toc}}

h3. Introducing Ceph

The cloud is becoming ubiquitous, and more and more enterprises, applications and end-users are transitioning their data to cloud platforms. This has created a new challenge for cloud providers, both private and public: building fault-tolerant, high-performance infrastructure that supports big data storage and processing needs, yet is easy to scale and flexible enough to support multiple use cases.

That's where "Ceph":http://ceph.com/ comes in. Ceph is a distributed storage system designed to provide excellent performance, reliability and scalability. It's open source, so it's freely downloadable and usable, and it offers a unified storage interface that's versatile enough to support object storage, block device storage and file system storage all in the same Ceph cluster.

Ceph is highly reliable, with self-healing and self-managing characteristics, but setting it up can be a little complex, especially if you're new to scalable storage. While the "quickstart doc":http://ceph.com/docs/master/start/ is great, we also wanted to provide something a bit more procedural to get you started. That's where this tutorial comes in. Over the next few pages, I'll walk you through the process of building a simple Ceph storage cluster and adding data to it. We'll set up the cluster using VirtualBox, so you'll get a chance to see Ceph in action in a "real" environment where you have total control, but which doesn't cost you anything to run or scale out with new nodes.

Intrigued? Keep reading.

h3. Assumptions and Requirements

For this tutorial, I'll be using "VirtualBox":https://www.virtualbox.org/, which provides an easy way to set up independent virtual servers, with "CentOS":http://www.centos.org/ as the operating system for the virtual servers. VirtualBox is available for Windows, Linux, Macintosh, and Solaris hosts. I'll make the following assumptions:
* You have a working knowledge of CentOS, VirtualBox and VirtualBox networking.
* You have downloaded and installed the latest version of VirtualBox.
* You have either already configured 5 virtual CentOS servers, or you have downloaded an ISO installation image for the latest version of CentOS (CentOS 7.0 at the time of writing). These servers must be running kernel version 3.10 or later.
* You're familiar with installing software using yum, the CentOS package manager.
* You're familiar with SSH-based authentication.

If you're not familiar with the above topics, look in the "Read More" section at the end of this tutorial, which has links to relevant guides.

To set up a Ceph storage cluster with VirtualBox, here are the steps you'll follow:
# Create cluster nodes
# Install the Ceph deployment toolkit
# Configure authentication between cluster nodes
# Configure and activate a cluster monitor
# Prepare and activate OSDs
# Verify cluster health
# Test the cluster
# Connect a Ceph block device to the cluster

The next sections will walk you through these steps in detail.

h3. Step 1: Create Cluster Nodes

If you already have 5 virtual CentOS servers configured and talking to each other, you can skip this step. If not, you must first create the virtual servers that will make up your Ceph cluster. To do this:
1. Launch VirtualBox and use the _Machine -> New_ menu to create a new virtual server.

!image1.jpg!

2. Keeping in mind that you will need 5 virtual servers running simultaneously, calculate the available RAM on the host system and set the server memory accordingly.

!image2.jpg!

3. Add a virtual hard drive of at least 10 GB.

!image3.jpg!

4. Ensure that you have an IDE controller with a virtual CD/DVD drive (to enable CentOS installation) and at least two network adapters: one NAT adapter (to enable download of required software) and one bridged or internal network adapter (for internal communication between the cluster nodes).
5. Once the server basics are defined, install CentOS on the server using the ISO installation image. Ensure that the kernel version is 3.10 or later.
6. Once the installation process is complete, log in to the server and configure the second network interface with a static IP address by editing the appropriate file in the _/etc/sysconfig/network-scripts/_ directory. Here's a sample of what the interface configuration might look like:

p(. @HWADDR=08:00:27:AE:14:41
TYPE=Ethernet
BOOTPROTO=static
DEFROUTE=yes
PEERDNS=yes
PEERROUTES=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_PEERDNS=yes
IPV6_PEERROUTES=yes
IPV6_FAILURE_FATAL=no
NAME=enp0s8
UUID=5fc74119-1ab2-4c0c-9aa1-284fd484e6c6
ONBOOT=yes
IPADDR=192.168.1.25
NETMASK=255.255.255.0
GATEWAY=192.168.1.1
DNS1=192.168.1.1
DNS2=8.8.8.8@

Should any of the above steps be unfamiliar to you, refer to the "VirtualBox manual":https://www.virtualbox.org/manual/UserManual.html, especially the "VirtualBox networking guide":https://www.virtualbox.org/manual/ch06.html, and to the networking section of the CentOS deployment guide.

Repeat this process until you have 5 virtual servers. Of these, identify one as the cluster administration node and assign it the hostname _admin-node_. The remaining servers may be identified with hostnames such as _node1_, _node2_, and so on. Here's an example of what the final cluster might look like (note that you should modify the IP addresses to match your local network settings).

|*Server host name*|*IP address*|*Purpose*|
|admin-node|192.168.1.25|Administration node for cluster|
|node1|192.168.1.26|Monitor|
|node2|192.168.1.27|OSD daemon|
|node3|192.168.1.28|OSD daemon|
|node4|192.168.1.29|Client (block device)|

Before proceeding to the next step, ensure that all the servers are accessible by pinging them using their host names. If you don't have a local DNS server, add the host names and IP addresses to each server's _/etc/hosts_ file to ease network access.
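As a convenience, the _/etc/hosts_ entries for the sample cluster above can be generated in one go. This is a sketch using the example hostnames and IP addresses from the table; adjust them to your own network, and review the file before appending it to _/etc/hosts_ on each node (for example with @sudo tee -a /etc/hosts < hosts.sample@):

```shell
# Generate sample /etc/hosts entries for the example cluster.
# Hostnames and IPs match the sample table above; adjust to your network.
cat > hosts.sample <<'EOF'
192.168.1.25 admin-node
192.168.1.26 node1
192.168.1.27 node2
192.168.1.28 node3
192.168.1.29 node4
EOF
grep -c . hosts.sample   # 5 entries
```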

h3. Step 2: Install the Ceph Deployment Toolkit

The next step is to install the Ceph deployment toolkit on the administration node. This toolkit will help install Ceph on the nodes in the cluster, as well as prepare and activate the cluster.
1. Log in to the administration node as the root user.
2. Add the Ceph package repository to yum by creating a new file at _/etc/yum.repos.d/ceph.repo_ with the following content:

@[ceph-noarch]
name=Ceph noarch packages
baseurl=http://ceph.com/rpm-firefly/el7/noarch
enabled=1
gpgcheck=1
type=rpm-md
gpgkey=https://ceph.com/git/?p=ceph.git;a=b...ys/release.asc@
3. Update the repository.
@shell> yum update@
4. Install the Ceph deployment toolkit.
@shell> yum install ceph-deploy@

!image4.jpg!

h3. Step 3: Configure Authentication between Cluster Nodes

Now, you need to create a _ceph_ user on each server in the cluster, including the administration node. This user account will perform cluster-related operations on each node. Perform the following steps on each of the 5 virtual servers:
1. Log in as the _root_ user.
2. Create a _ceph_ user account.
@shell> useradd ceph
shell> passwd ceph@
3. Give the _ceph_ user account root privileges with _sudo_.
@shell> echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph
shell> chmod 0440 /etc/sudoers.d/ceph@
4. Disable 'requiretty' for the _ceph_ user.
@shell> sudo visudo@
5. In the resulting file, locate the line containing
@Defaults requiretty@
and change it to read
@Defaults:ceph !requiretty@
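Since the same commands must be repeated on all 5 servers, it can help to bundle the non-interactive ones into a small script. This is a sketch, not part of the original setup: the file name is arbitrary, nothing is executed on the nodes here, and you still set each _ceph_ user's password interactively with @passwd ceph@:

```shell
# Bundle the per-node commands above into a reviewable script.
{
  echo '#!/bin/sh'
  echo 'useradd ceph'
  echo 'echo "ceph ALL = (root) NOPASSWD:ALL" | tee /etc/sudoers.d/ceph'
  echo 'chmod 0440 /etc/sudoers.d/ceph'
} > node-setup.sh
chmod +x node-setup.sh
# Then, on each node (hypothetical invocation; requires root access):
#   ssh root@node1 'sh -s' < node-setup.sh
#   ssh root@node1 passwd ceph
```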

Now, set up passphraseless SSH between the nodes:
1. Log in to the administration node as the _ceph_ user.
2. Generate an SSH key for the administration node.
@shell> ssh-keygen@

!image5.jpg!

3. Copy the generated public key to the _ceph_ user account of all the nodes in the cluster.
@shell> ssh-copy-id ceph@node1
shell> ssh-copy-id ceph@node2
shell> ssh-copy-id ceph@node3
shell> ssh-copy-id ceph@node4
shell> ssh-copy-id ceph@admin-node@

!image6.jpg!

4. Test that the _ceph_ user on the administration node can log in to any other node as _ceph_ using SSH, without providing a password.
@shell> ssh ceph@node1@

!image7.jpg!

Modify the administration node's SSH configuration file so that it can easily log in to each node as the _ceph_ user. Create the _/home/ceph/.ssh/config_ file with the following lines:
@Host node1
Hostname node1
User ceph
Host node2
Hostname node2
User ceph
Host node3
Hostname node3
User ceph
Host node4
Hostname node4
User ceph
Host admin-node
Hostname admin-node
User ceph@
Change the permissions of the _/home/ceph/.ssh/config_ file.
@shell> chmod 0400 ~/.ssh/config@
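The repetitive @Host@ blocks above can also be generated with a short loop. This is a sketch: it writes to a local file for review rather than straight to _/home/ceph/.ssh/config_, so you can inspect the result before moving it into place:

```shell
# Generate the SSH client config entries for all cluster nodes.
for host in node1 node2 node3 node4 admin-node; do
  printf 'Host %s\n  Hostname %s\n  User ceph\n' "$host" "$host"
done > ssh-config.sample
chmod 0400 ssh-config.sample
grep -c '^Host ' ssh-config.sample   # 5
```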
Test that the _ceph_ user on the administration node can log in to any other node using SSH, without providing a password or username.
@shell> ssh node1@

!image8.jpg!

Finally, create a directory on the administration node to store cluster information, such as configuration files and keyrings.
@shell> mkdir my-cluster
shell> cd my-cluster@
You're now ready to begin preparing and activating the cluster!

h3. Step 4: Configure and Activate a Cluster Monitor

A Ceph storage cluster consists of two types of daemons:
* Monitors maintain copies of the cluster map
* Object Storage Daemons (OSDs) store data as objects on storage nodes

Apart from these, other actors in a Ceph storage cluster include metadata servers and clients such as Ceph block devices, Ceph object gateways or Ceph filesystems. "Read more about Ceph's architecture.":http://ceph.com/docs/master/architecture/

All the commands in this and subsequent sections are to be run when logged in as the _ceph_ user on the administration node, from the _my-cluster/_ directory. Ensure that you are directly logged in as _ceph_ and are not using _root_ with _su - ceph_.

A minimal system will have at least one monitor and two OSD daemons for data replication.
1. Begin by setting up a Ceph monitor on _node1_ with the Ceph deployment toolkit.
@shell> ceph-deploy new node1@
This will define the name of the initial monitor node and create a default Ceph configuration file and monitor keyring in the current directory.

!image9.jpg!

2. Change the number of replicas in the Ceph configuration file at _/home/ceph/my-cluster/ceph.conf_ from 3 to 2 so that Ceph can achieve a stable state with just two OSDs. Add the following lines in the [global] section:
@osd pool default size = 2
osd pool default min size = 2@
3. In the same file, set the OSD journal size. A good general setting is 10 GB; however, since this is a simulation, you can use a smaller amount such as 4 GB. Add the following line in the [global] section:
@osd journal size = 4000@
4. In the same file, set the default number of placement groups for a pool. Since we'll have fewer than 5 OSDs, 128 placement groups per pool should suffice. Add the following line in the [global] section:
@osd pool default pg num = 128@
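If you're wondering where 128 comes from: a common rule of thumb is to size the total placement group count at roughly (number of OSDs × 100) / number of replicas, rounded up to the next power of two. The arithmetic for this two-OSD, two-replica cluster can be sketched as:

```shell
# Rule-of-thumb placement group sizing: (OSDs * 100) / replicas,
# rounded up to the next power of two.
osds=2
replicas=2
target=$(( osds * 100 / replicas ))   # 100
pg=1
while [ "$pg" -lt "$target" ]; do pg=$(( pg * 2 )); done
echo "$pg"   # 128
```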
||
5. Install Ceph on each node in the cluster, including the administration node.
@shell> ceph-deploy install admin-node node1 node2 node3 node4@
The Ceph deployment toolkit will now go to work installing Ceph on each node. Here's an example of what you will see during the installation process.

!image10.jpg!

6. Create the Ceph monitor on _node1_ and gather the initial keys.
@shell> ceph-deploy mon create-initial node1@

!image11.jpg!

h3. Step 5: Prepare and Activate OSDs

The next step is to prepare and activate the Ceph OSDs. We'll need a minimum of 2 OSDs, and these should be set up on _node2_ and _node3_, as it's not recommended to mix monitors and OSD daemons on the same host. To begin, set up an OSD on _node2_ as follows:
Log in to _node2_ as the _ceph_ user.
@shell> ssh node2@
Create a directory for the OSD daemon.
@shell> sudo mkdir /var/local/osd@
Log out of _node2_. Then, from the administration node, prepare and activate the OSD.
@shell> ceph-deploy osd prepare node2:/var/local/osd@

!image12.jpg!

@shell> ceph-deploy osd activate node2:/var/local/osd@

!image13.jpg!

Repeat the above steps for _node3_.
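The prepare/activate pair for both OSD nodes can also be expressed as a loop. This sketch only prints the commands so you can review them; to run them for real, execute each printed line from the _my-cluster/_ directory on the administration node:

```shell
# Build a reviewable plan of the ceph-deploy commands for both OSD nodes.
plan=$(
  for node in node2 node3; do
    echo "ceph-deploy osd prepare $node:/var/local/osd"
    echo "ceph-deploy osd activate $node:/var/local/osd"
  done
)
printf '%s\n' "$plan"
```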
At this point, the OSD daemons have been created and the storage cluster is ready.

h3. Step 6: Verify Cluster Health

Copy the configuration file and admin keyring from the administration node to all the nodes in the cluster.
@shell> ceph-deploy admin admin-node node1 node2 node3 node4@

!image14.jpg!

Log in to each node as the _ceph_ user and change the permissions of the admin keyring.
@shell> ssh node1
shell> sudo chmod +r /etc/ceph/ceph.client.admin.keyring@
You should now be able to check cluster health from any node in the cluster with the _ceph status_ command. Ideally, you want to see the status _active + clean_, as that indicates the cluster is operating normally.
@shell> ceph status@
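If you'd rather script the check than eyeball the output, something like the following works. This is a sketch: the status line here is a canned sample so the snippet runs anywhere; on a cluster node, replace the @echo@ with the real @ceph status@ command:

```shell
# Scripted health check against canned sample output.
status=$(echo "health HEALTH_OK ... 192 active+clean")   # stand-in for: ceph status
if echo "$status" | grep -q 'active+clean'; then
  echo "cluster healthy"
else
  echo "cluster needs attention"
fi
```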

!image15.jpg!

h3. Step 7: Test the Cluster

You can now perform a simple test to see the distributed Ceph storage cluster in action, by writing a file on one node and retrieving it on another:
1. Log in to _node1_ as the _ceph_ user.
@shell> ssh node1@
2. Create a new file with some dummy data.
@shell> echo "Hello world" > /tmp/hello.txt@
3. Data is stored in Ceph within storage pools, which are logical groups in which to organize your data. By default, a Ceph storage cluster has 3 pools - _data_, _metadata_ and _rbd_ - and it's also possible to create your own custom pools. In this case, copy the file to the _data_ pool with the _rados put_ command and assign it a name.
@shell> rados put hello-object /tmp/hello.txt --pool data@
To verify that the Ceph storage cluster stored the object:
1. Log in to _node2_ as the _ceph_ user.
2. Check that the file exists in the cluster's _data_ storage pool with the _rados ls_ command.
@shell> rados ls --pool data@
3. Copy the file out of the storage cluster to a local directory with the _rados get_ command and verify its contents.
@shell> rados get hello-object /tmp/hello.txt --pool data
shell> cat /tmp/hello.txt@

!image16.jpg!

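To confirm the object survived the round trip intact, you can compare the retrieved copy against the original byte for byte. This sketch demonstrates the comparison with two local stand-in files; on the real cluster, you would compare _/tmp/hello.txt_ on _node1_ with the copy that _rados get_ wrote on _node2_:

```shell
# Verify round-trip integrity by comparing files byte for byte.
echo "Hello world" > original.txt
echo "Hello world" > retrieved.txt    # stand-in for the rados get output
if cmp -s original.txt retrieved.txt; then
  echo "object intact"
fi
```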
h3. Step 8: Connect a Ceph Block Device to the Cluster

Now that the cluster is operating, it's time to do something with it. Ceph storage clusters can be accessed by three types of clients: Ceph block devices, Ceph object gateways and the Ceph filesystem (CephFS). The simplest to demonstrate is the RADOS Block Device (RBD), so in this step you'll create a virtual block device client on _node4_, associate it with a storage pool and then read and write data to it.
1. Log in to _node4_ as the _ceph_ user.
@shell> ssh node4@
2. Create a pool named _work_. When creating the pool, specify its number of placement groups, which are the shards or fragments that the pool is divided into. Placement groups are mapped to OSDs, and a larger number of placement groups (such as 100 per OSD) leads to better balancing.
@shell> ceph osd pool create work 100 100@
3. Create a RADOS Block Device image with _rbd_ and associate it with the pool.
@shell> rbd create image01 --size 1024 --pool work@
4. Map the block device image to an actual block device.
@shell> sudo rbd map image01 --pool work --name client.admin@
5. Create a filesystem on the block device and mount it.
@shell> sudo /sbin/mkfs.ext4 -m0 /dev/rbd/work/image01
shell> sudo mkdir /mnt/ceph-block-device
shell> sudo mount /dev/rbd/work/image01 /mnt/ceph-block-device@

!image17.jpg!

At this point, your Ceph block device is mounted and ready for use. You can write data to it as with any other block device, and your data will be automatically stored in the cluster (with all its resiliency and scalability benefits). Plus, you get a bunch of cool features, such as the ability to create device snapshots, so you can easily roll back to a previous image of the device.
To demonstrate this:
1. Navigate to where you mounted the block device and create a text file. You might need to first change the permissions of the mount point so that you can write to it.
@shell> cd /mnt/ceph-block-device
shell> vi today.txt@
2. Add the word 'Monday' to the file.
3. Then, take a snapshot of the image.
@shell> rbd --pool work snap create --snap monday image01@
4. Edit the file again and this time change the contents to 'Friday'.
5. Unmount the block device. Then, roll back to the previous snapshot and mount it again.
@shell> sudo umount /mnt/ceph-block-device
shell> rbd --pool work snap rollback --snap monday image01
shell> sudo mount /dev/rbd/work/image01 /mnt/ceph-block-device@
When you inspect the contents of the file, you will see the original contents restored.

You must unmount the block device before doing a rollback. If it's mounted when you roll back, the client will have a stale cache, which may cause filesystem corruption, since it's like a hard drive being written to by two machines at once.
h3. Conclusion

Just as you can use Ceph block devices, so too can you use the "Ceph Object Gateway":http://ceph.com/docs/master/radosgw/ to create Amazon S3-style storage buckets accessible via REST, or "CephFS":http://ceph.com/docs/master/cephfs/ as a POSIX-compliant, scalable, fault-tolerant network filesystem. Setting these up is beyond the scope of this beginner tutorial, but since they both use the Ceph storage cluster that you've already configured, it won't take long for you to get them running. Review the "instructions for the Ceph Object Gateway":http://ceph.com/docs/master/start/quick-rgw/ and the "instructions for CephFS":http://ceph.com/docs/master/start/quick-cephfs/.

As this tutorial has illustrated, Ceph is a powerful solution for creating resilient, highly scalable storage. The simple storage cluster you created here with VirtualBox is just the tip of the iceberg: as you transition the cluster to your network or the cloud and add more nodes, you'll benefit from improved performance and flexibility without any loss in reliability and security. What more could you ask for?
h3. Read More

* "Introduction to Ceph":http://ceph.com/docs/master/start/intro/
* "Ceph Architecture":http://ceph.com/docs/master/architecture/
* "Ceph Storage Cluster Quick Start":http://ceph.com/docs/master/rados/
* "Getting Started With Ceph":http://www.inktank.com/resource/getting-started-with-ceph-miroslav-klivansky/
* "Introduction to Ceph & OpenStack":http://www.inktank.com/resource/introduction-to-ceph-openstack-miroslav-klivansky/
* "Managing A Distributed Storage System At Scale":http://www.inktank.com/resource/managing-a-distributed-storage-system-at-scale-sage-weil/
* "Scaling Storage With Ceph":http://www.inktank.com/resource/scaling-storage-with-ceph-ross-turk/
* "Ceph API Documentation":http://ceph.com/docs/master/api/