Bug #12695

closed

rbd flatten makes clone's snap crash

Added by science luo over 8 years ago. Updated over 8 years ago.

Status: Closed
Priority: High
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Ceph version: 0.94.2

Cluster: 3 nodes, each with 10 OSDs and 1 mon.

Operations:

(1)rbd create img1 --size 256 --image-format 2

(2)rbd map img1 ---to-->/dev/rbd1

(3)mkfs.xfs /dev/rbd1

(4)rbd unmap /dev/rbd1

(5)rbd snap create img1@snap_img1

(6)rbd snap protect img1@snap_img1

(7)rbd clone img1@snap_img1 clone1_img1

(8)rbd snap create clone1_img1@snap_clone1

(9)rbd map clone1_img1@snap_clone1 ---to--->/dev/rbd1

(10)mount -o ro /dev/rbd1 /mnt (success!)

(11)umount /dev/rbd1

(12)rbd unmap /dev/rbd1

(13)rbd flatten clone1_img1

(14)rbd map clone1_img1@snap_clone1 ---to--->/dev/rbd1

(15)mount -o ro /dev/rbd1 /mnt (failed!) -> mount: unknown filesystem type '(null)'
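
The steps above can be collected into one script for re-testing on a different kernel. This is only a sketch: it assumes the default pool, a scratch mount point at /mnt, and root privileges. The `run` and `map_image` helpers and the `DRY_RUN` variable are hypothetical conveniences, not part of the rbd CLI. The device path is taken from the output of `rbd map` instead of assuming /dev/rbd1.

```shell
#!/bin/sh
# Reproduction sketch for steps (1)-(15) of this report.
# DRY_RUN, run() and map_image() are illustrative helpers only.

run() {
    # Print the command; execute it only when DRY_RUN is not 1.
    echo "+ $*"
    [ "${DRY_RUN:-0}" = "1" ] || "$@"
}

map_image() {
    # rbd map prints the mapped device path on success.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "+ rbd map $1" >&2
        echo /dev/rbdX              # placeholder device for dry runs
    else
        rbd map "$1"
    fi
}

repro() {
    # (1)-(4): create the parent image and put a filesystem on it
    run rbd create img1 --size 256 --image-format 2
    dev=$(map_image img1)
    run mkfs.xfs "$dev"
    run rbd unmap "$dev"

    # (5)-(8): snapshot, protect, clone, then snapshot the clone
    run rbd snap create img1@snap_img1
    run rbd snap protect img1@snap_img1
    run rbd clone img1@snap_img1 clone1_img1
    run rbd snap create clone1_img1@snap_clone1

    # (9)-(12): the clone's snapshot mounts fine before flattening
    dev=$(map_image clone1_img1@snap_clone1)
    run mount -o ro "$dev" /mnt
    run umount /mnt
    run rbd unmap "$dev"

    # (13)-(15): flatten the clone, then try the same snapshot again
    run rbd flatten clone1_img1
    dev=$(map_image clone1_img1@snap_clone1)
    if run mount -o ro "$dev" /mnt; then
        echo "mount after flatten: OK"
        run umount /mnt
    else
        echo "mount after flatten: FAILED (bug reproduced)"
    fi
    run rbd unmap "$dev"
}
```

Calling `repro` runs the sequence; `DRY_RUN=1 repro` only prints the commands, which is a safe way to review them first.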


Files

rbd.c (137 KB) science luo, 09/17/2015 08:36 AM
Actions #1

Updated by Jason Dillaman over 8 years ago

  • Project changed from Ceph to Linux kernel client
  • Category deleted (librbd)
Actions #2

Updated by Ilya Dryomov over 8 years ago

  • Status changed from New to Closed
  • Assignee set to Ilya Dryomov

This was fixed in kernel 3.17, commit 4d9b67cddd9b ("rbd: take snap_id into account when reading in parent info"). AFAIR backporting it didn't look feasible, so older upstream kernels are stuck with this bug.

Actions #3

Updated by science luo over 8 years ago

Ilya Dryomov wrote:

This was fixed in kernel 3.17, commit 4d9b67cddd9b ("rbd: take snap_id into account when reading in parent info"). AFAIR backporting it didn't look feasible, so older upstream kernels are stuck with this bug.

But in another cluster this bug can't be reproduced.
Another cluster:
ceph version 0.87.2 (87a7cec9ab11c677de2ab23a7668a77d2f5b955e)
uname -a Linux ceph10 3.10.0-123.el7.x86_64 #1 SMP Thu Apr 30 13:53:41 CST 2015 x86_64 x86_64 x86_64 GNU/Linux

Actions #4

Updated by Ilya Dryomov over 8 years ago

science luo wrote:

Ilya Dryomov wrote:

This was fixed in kernel 3.17, commit 4d9b67cddd9b ("rbd: take snap_id into account when reading in parent info"). AFAIR backporting it didn't look feasible, so older upstream kernels are stuck with this bug.

But in another cluster this bug can't be reproduced.
Another cluster:
ceph version 0.87.2 (87a7cec9ab11c677de2ab23a7668a77d2f5b955e)

This is a kernel client bug, so the Ceph version doesn't matter.

uname -a Linux ceph10 3.10.0-123.el7.x86_64 #1 SMP Thu Apr 30 13:53:41 CST 2015 x86_64 x86_64 x86_64 GNU/Linux

This is a Red Hat kernel, not upstream. It has a bunch of stuff backported, including 4d9b67cddd9b.
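
As a rough illustration of the point above, a `uname -r` string can be classified against the 3.17 fix, with distro kernels flagged separately because their version number says nothing about backports. This is a heuristic sketch, not an authoritative check: the function names are made up, the `*.el[0-9]*` pattern is an assumption about Red Hat style release strings, and `sort -V` assumes GNU coreutils.

```shell
# version_ge A B: succeed when version A >= version B (GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# check_rbd_fix RELEASE: classify a `uname -r` string.
check_rbd_fix() {
    release=$1
    base=${release%%-*}     # drop suffixes such as "-123.el7.x86_64"
    if version_ge "$base" 3.17; then
        echo "fixed-upstream"
    else
        case $release in
            # Distro kernel: old base version, but may carry backports.
            *.el[0-9]*) echo "check-distro-backports" ;;
            *)          echo "likely-affected" ;;
        esac
    fi
}
```

For example, `check_rbd_fix "$(uname -r)"` classifies the running kernel. A "check-distro-backports" result means the vendor changelog or source has to be consulted.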

Actions #5

Updated by science luo over 8 years ago


OK, I get it, thanks.

Actions #6

Updated by science luo over 8 years ago

Hi, thanks for your reply!

I merged the code from 3.16 into 3.10, but the problem is still there.

I changed the code below in the 3.10 rbd.c:

- snapid = cpu_to_le64(CEPH_NOSNAP);
+ snapid = cpu_to_le64(rbd_dev->spec->snap_id);
ret = rbd_obj_method_sync(rbd_dev, rbd_dev->header_name,

I have uploaded the modified rbd.c to this issue.

Actions #7

Updated by Ilya Dryomov over 8 years ago

If it was that simple, it would have been backported :-)
At the very least you are going to need e8f59b595d05 ("rbd: do not read in parent info before snap context"), possibly more.
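
One way to see which of the two commits named in this thread a given kernel git tree already carries is to search its history by commit subject, since cherry-picked backports get new hashes. A minimal sketch, assuming a local git checkout of the kernel; the function names are hypothetical, and given the "possibly more" caveat above, both subjects being present does not prove the backport is complete.

```shell
# has_subject TREE SUBJECT: succeed if some commit reachable from HEAD
# in TREE has SUBJECT in its message (matched as a fixed string).
has_subject() {
    [ -n "$(git -C "$1" log -n 1 --format=%h --fixed-strings --grep="$2" HEAD)" ]
}

# check_backport_prereqs TREE: report the two commits named in this thread.
check_backport_prereqs() {
    tree=$1
    for subject in \
        "rbd: do not read in parent info before snap context" \
        "rbd: take snap_id into account when reading in parent info"
    do
        if has_subject "$tree" "$subject"; then
            echo "present: $subject"
        else
            echo "missing: $subject"
        fi
    done
}
```

Usage would be `check_backport_prereqs /path/to/linux`, where the path is whatever tree the 3.10 backport is being built from.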
