Bug #12695
Closed
rbd flatten makes clone's snap crash
Description
Ceph version: 0.94.2
Cluster: 3 nodes, each node has 10 OSDs and 1 mon.
Operation:
(1) rbd create img1 --size 256 --image-format 2
(2) rbd map img1 (mapped to /dev/rbd1)
(3) mkfs.xfs /dev/rbd1
(4) rbd unmap /dev/rbd1
(5) rbd snap create img1@snap_img1
(6) rbd snap protect img1@snap_img1
(7) rbd clone img1@snap_img1 clone1_img1
(8) rbd snap create clone1_img1@snap_clone1
(9) rbd map clone1_img1@snap_clone1 (mapped to /dev/rbd1)
(10) mount -o ro /dev/rbd1 /mnt (success!)
(11) umount /dev/rbd1
(12) rbd unmap /dev/rbd1
(13) rbd flatten clone1_img1
(14) rbd map clone1_img1@snap_clone1 (mapped to /dev/rbd1)
(15) mount -o ro /dev/rbd1 /mnt (failed!) -> mount: unknown filesystem type '(null)'
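The steps above can be collected into one script so the before/after-flatten behaviour is easy to compare across kernels. This is only a sketch, not something from the original report: the EXECUTE switch and the function names are made up for illustration, it assumes the default pool, and it assumes that `rbd map` prints the device path it mapped to. By default it only prints the commands.

```shell
#!/bin/sh
# Sketch of the reproduction steps from the report.
# By default the commands are only printed; set EXECUTE=1 (a switch made up
# for this sketch) to really run them, as root, on a disposable test cluster.

run() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        "$@"
    else
        echo "+ $*"
    fi
}

# Map an image or snapshot and print the block device it was mapped to.
map_image() {
    if [ "${EXECUTE:-0}" = "1" ]; then
        rbd map "$1"              # assumed to print the device, e.g. /dev/rbd1
    else
        echo "+ rbd map $1" >&2
        echo "/dev/rbd1"          # placeholder device, as in the report
    fi
}

# Map the clone's snapshot, try a read-only mount, then clean up.
# Returns the exit status of mount.
check_snapshot_mount() {
    dev=$(map_image clone1_img1@snap_clone1) || return 1
    run mount -o ro "$dev" /mnt
    status=$?
    if [ "$status" -eq 0 ]; then
        run umount /mnt
    fi
    run rbd unmap "$dev"
    return "$status"
}

reproduce() {
    run rbd create img1 --size 256 --image-format 2 || return 1
    dev=$(map_image img1) || return 1
    run mkfs.xfs "$dev" || return 1
    run rbd unmap "$dev" || return 1
    run rbd snap create img1@snap_img1 || return 1
    run rbd snap protect img1@snap_img1 || return 1
    run rbd clone img1@snap_img1 clone1_img1 || return 1
    run rbd snap create clone1_img1@snap_clone1 || return 1

    # Steps (9)-(12): the snapshot should mount before the flatten.
    check_snapshot_mount || { echo "mount failed before flatten" >&2; return 1; }

    # Steps (13)-(15): flatten the clone, then try the snapshot again.
    run rbd flatten clone1_img1 || return 1
    check_snapshot_mount
    after=$?

    if [ "${EXECUTE:-0}" != "1" ]; then
        echo "dry run: no commands were executed"
    elif [ "$after" -eq 0 ]; then
        echo "snapshot still mounts after flatten: bug not reproduced"
    else
        echo "snapshot no longer mounts after flatten: bug reproduced"
    fi
}

reproduce
```

Running it without EXECUTE=1 first is a cheap way to review exactly what would be created before touching a cluster.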
Updated by Jason Dillaman over 8 years ago
- Project changed from Ceph to Linux kernel client
- Category deleted (librbd)
Updated by Ilya Dryomov over 8 years ago
- Status changed from New to Closed
- Assignee set to Ilya Dryomov
This was fixed in kernel 3.17, commit 4d9b67cddd9b ("rbd: take snap_id into account when reading in parent info"). AFAIR backporting it didn't look feasible, so older upstream kernels are stuck with this bug.
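For an upstream kernel, a first check is simply whether the running version is at least 3.17. The snippet below is a rough sketch of that comparison; the function name `kernel_at_least` is made up here, and it only looks at the major and minor numbers of `uname -r`. Distribution kernels can carry fixes as backports under an older version number, so a "too old" answer is not conclusive for those.

```shell
# kernel_at_least RELEASE MAJOR MINOR
# Succeeds if RELEASE (in `uname -r` format, e.g. 3.10.0-123.el7.x86_64)
# is at least MAJOR.MINOR. Hypothetical helper, for illustration only.
kernel_at_least() {
    release=$1
    major=${release%%.*}        # text before the first dot
    rest=${release#*.}          # text after the first dot
    minor=${rest%%[!0-9]*}      # leading digits of the remainder
    [ "$major" -gt "$2" ] || { [ "$major" -eq "$2" ] && [ "$minor" -ge "$3" ]; }
}

if kernel_at_least "$(uname -r)" 3 17; then
    echo "3.17 or newer: upstream kernels of this version include the fix"
else
    echo "older than 3.17: affected unless the fix was backported"
fi
```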
Updated by science luo over 8 years ago
Ilya Dryomov wrote:
This was fixed in kernel 3.17, commit 4d9b67cddd9b ("rbd: take snap_id into account when reading in parent info"). AFAIR backporting it didn't look feasible, so older upstream kernels are stuck with this bug.
But in another cluster this bug can't be reproduced.
Another cluster:
ceph version 0.87.2 (87a7cec9ab11c677de2ab23a7668a77d2f5b955e)
uname -a Linux ceph10 3.10.0-123.el7.x86_64 #1 SMP Thu Apr 30 13:53:41 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
Updated by Ilya Dryomov over 8 years ago
science luo wrote:
Ilya Dryomov wrote:
This was fixed in kernel 3.17, commit 4d9b67cddd9b ("rbd: take snap_id into account when reading in parent info"). AFAIR backporting it didn't look feasible, so older upstream kernels are stuck with this bug.
But in another cluster this bug can't be reproduced.
Another cluster:
ceph version 0.87.2 (87a7cec9ab11c677de2ab23a7668a77d2f5b955e)
This is a kernel client bug, so ceph version is of no importance.
uname -a Linux ceph10 3.10.0-123.el7.x86_64 #1 SMP Thu Apr 30 13:53:41 CST 2015 x86_64 x86_64 x86_64 GNU/Linux
This is a Red Hat kernel, not upstream. It has a bunch of stuff backported, including 4d9b67cddd9b.
Updated by science luo over 8 years ago
OK, I get it, thanks.
Updated by science luo over 8 years ago
Hi, thanks for your reply!
I have merged the rbd code from 3.16 into 3.10, but it still has the same problem.
I changed the code below in the 3.10 rbd.c:
- snapid = cpu_to_le64(CEPH_NOSNAP);
+ snapid = cpu_to_le64(rbd_dev->spec->snap_id);
ret = rbd_obj_method_sync(rbd_dev, rbd_dev->header_name,
I have uploaded the modified rbd.c.
Updated by Ilya Dryomov over 8 years ago
If it was that simple, it would have been backported :-)
At the very least you are going to need e8f59b595d05 ("rbd: do not read in parent info before snap context"), possibly more.
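When attempting a backport like this, it can help to check which of the commits named in this thread are already in the tree. The sketch below searches commit messages by subject instead of by hash, since cherry-picked commits get new hashes. The function names are made up for illustration, and the list covers only the two commits mentioned here; as noted, more may be needed.

```shell
# has_commit_subject REPO SUBJECT
# Succeeds if some commit reachable from HEAD in REPO has SUBJECT in its
# message. Hypothetical helper, for illustration only.
has_commit_subject() {
    [ -n "$(git -C "$1" log --oneline --fixed-strings --grep="$2" -1 HEAD)" ]
}

# Report, for each commit subject named in this thread, whether the kernel
# tree at REPO already contains it.
check_rbd_backports() {
    repo=$1
    for subject in \
        "rbd: do not read in parent info before snap context" \
        "rbd: take snap_id into account when reading in parent info"
    do
        if has_commit_subject "$repo" "$subject"; then
            echo "present: $subject"
        else
            echo "missing: $subject"
        fi
    done
}
```

Call it as `check_rbd_backports <path to the kernel tree>`. A "present" line only means a commit with that subject exists, not that the backport was done correctly.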