Bug #54970

rbd diff of a clone misses blocks used by the image, referenced to the parent

Added by David Herselman 6 months ago. Updated 5 months ago.

Status: Duplicate
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

Ceph image backups are inconsistent because `rbd diff` reports blocks as unallocated when their data is referenced from the parent image. This only happens when the '--whole-object' parameter is passed.

This area of the source code was last updated in the following two discussions:

We have a write-protected base image snapshot which is cloned for VMs. The structure is as follows:

[admin@kvm5i ~]# rbd ls rbd_ssd -l | grep 401
vm-401-disk-0                            40 GiB  rbd_ssd/base-441-disk-0@__base__    2        excl
vm-401-disk-0@b-2022-03-20T11:12:39      40 GiB  rbd_ssd/base-441-disk-0@__base__    2

Reporting allocated blocks on the read-only parent snapshot shows 4 MiB blocks 0, 1, 2 and 65 as allocated:

[admin@kvm5i ~]# rbd diff rbd_ssd/base-441-disk-0@__base__ | head -n 5
Offset       Length   Type
0            4194304  data
4194304      4194304  data
8388608      2977792  data
272629760    4194304  data
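
The block numbers quoted above follow from dividing each offset by the object size. A minimal Python sketch of that arithmetic, assuming the default 4 MiB object size and the plain-text column layout shown here; `parse_rbd_diff` and `object_indices` are hypothetical helper names, not part of any Ceph tooling:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed: default 4 MiB RBD object size


def parse_rbd_diff(text):
    """Parse plain-text `rbd diff` output into (offset, length, type) tuples."""
    extents = []
    for line in text.splitlines():
        fields = line.split()
        # Skip the header row, shell prompts and anything that is not an extent.
        if len(fields) != 3 or not fields[0].isdigit() or not fields[1].isdigit():
            continue
        extents.append((int(fields[0]), int(fields[1]), fields[2]))
    return extents


def object_indices(extents, object_size=OBJECT_SIZE):
    """Return the sorted indices of all objects touched by the extents."""
    touched = set()
    for offset, length, _kind in extents:
        if length == 0:
            continue
        first = offset // object_size
        last = (offset + length - 1) // object_size
        touched.update(range(first, last + 1))
    return sorted(touched)


sample = """\
Offset       Length   Type
0            4194304  data
4194304      4194304  data
8388608      2977792  data
272629760    4194304  data
"""

print(object_indices(parse_rbd_diff(sample)))  # [0, 1, 2, 65]
```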

The result is the same for the VM image and its backup snapshot:

[admin@kvm5i ~]# rbd diff rbd_ssd/vm-401-disk-0 | head -n 5
Offset       Length   Type
0            4194304  data
4194304      4194304  data
8388608      2977792  data
272629760    4194304  data
[admin@kvm5i ~]# rbd diff rbd_ssd/vm-401-disk-0@b-2022-03-20T11:12:39 | head -n 5
Offset       Length   Type
0            4194304  data
4194304      4194304  data
8388608      2977792  data
272629760    4194304  data

This is all correct. Adding the '--whole-object' switch, however, yields inconsistent results: the output is correct when run against the parent, but blocks 1 and 2 are missing when it is run against the VM clone or its snapshot:

[admin@kvm5i ~]# rbd diff --whole-object rbd_ssd/base-441-disk-0@__base__ | head -n 5
Offset       Length   Type
0            4194304  data
4194304      4194304  data
8388608      4194304  data
272629760    4194304  data
[admin@kvm5i ~]# rbd diff --whole-object rbd_ssd/vm-401-disk-0 | head -n 5
Offset       Length   Type
0            4194304  data
272629760    4194304  data
276824064    4194304  data
281018368    4194304  data
[admin@kvm5i ~]# rbd diff --whole-object rbd_ssd/vm-401-disk-0@b-2022-03-20T11:12:39 | head -n 5
Offset       Length   Type
0            4194304  data
272629760    4194304  data
276824064    4194304  data
281018368    4194304  data
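
The discrepancy between the three listings can be checked mechanically. The following is only a rough sketch in Python, assuming a 4 MiB object size and that both listings cover the same leading range of the image (the outputs here were truncated with `head -n 5`); `diff_objects` and `missing_in_clone` are hypothetical helper names:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed: default 4 MiB RBD object size


def diff_objects(text, object_size=OBJECT_SIZE):
    """Return the set of object indices listed in `rbd diff --whole-object` output."""
    objects = set()
    for line in text.splitlines():
        fields = line.split()
        # Only rows of the form "<offset> <length> <type>" are extents.
        if len(fields) == 3 and fields[0].isdigit() and fields[1].isdigit():
            objects.add(int(fields[0]) // object_size)
    return objects


def missing_in_clone(parent_text, clone_text):
    """Objects reported for the parent snapshot but absent from the clone's diff."""
    return sorted(diff_objects(parent_text) - diff_objects(clone_text))


parent_output = """\
Offset       Length   Type
0            4194304  data
4194304      4194304  data
8388608      4194304  data
272629760    4194304  data
"""

clone_output = """\
Offset       Length   Type
0            4194304  data
272629760    4194304  data
276824064    4194304  data
281018368    4194304  data
"""

print(missing_in_clone(parent_output, clone_output))  # [1, 2]
```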

We can confirm that the snapshot references data that should be included in the rbd diff output by hashing the first 66 4 MiB blocks of the mapped snapshot:

[admin@kvm5i ~]# rbd map rbd_ssd/vm-401-disk-0@b-2022-03-20T11:12:39
/dev/rbd40
[admin@kvm5i ~]# perl -ne 'use Digest::SHA qw(sha1_base64);BEGIN{$/=\4194304};print sha1_base64($_)."\n"' /dev/rbd40 | head -n 66
+qnB9vP9IyPSqfR+ylHxyZHg8uY
ArU8NpEhyVq8OZocpMma5AECdQk
I7Wld9JtQMqPTU8WLGLmu06rJ4w
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
ajRKjn3h/rw8ECEAAV2na/0QQ/s
[admin@kvm5i ~]# rbd unmap /dev/rbd40

PS: The base64-encoded SHA1 sum of 4 MiB of zeros is 'K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M'.
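
The same per-block check can be sketched in Python instead of the Perl one-liner. This is an illustrative equivalent, assuming 4 MiB blocks and unpadded base64 SHA1 digests as printed above; `chunk_digests` and `nonzero_chunks` are hypothetical helper names, and the demo uses a tiny in-memory stream rather than a real device:

```python
import base64
import hashlib
import io


def chunk_digests(stream, chunk_size=4 * 1024 * 1024):
    """Yield (index, digest, is_zero) for each chunk read from a binary stream."""
    index = 0
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Unpadded base64 of the SHA1 digest, matching the 27-character sums above.
        digest = base64.b64encode(hashlib.sha1(chunk).digest()).decode().rstrip("=")
        is_zero = chunk.count(0) == len(chunk)
        yield index, digest, is_zero
        index += 1


def nonzero_chunks(stream, chunk_size=4 * 1024 * 1024, limit=None):
    """Return indices of chunks that contain any non-zero byte."""
    result = []
    for index, _digest, is_zero in chunk_digests(stream, chunk_size):
        if limit is not None and index >= limit:
            break
        if not is_zero:
            result.append(index)
    return result


# Demo on an in-memory stream with 8-byte chunks: data, zeros, zeros, data.
demo = io.BytesIO(b"\x01" * 8 + bytes(8) + bytes(8) + b"\x02" * 8)
print(nonzero_chunks(demo, chunk_size=8))  # [0, 3]
```

Against a mapped snapshot, the call would look like `nonzero_chunks(open("/dev/rbd40", "rb"), limit=66)`; given the sums listed above, blocks 0, 1, 2 and 65 would be expected to come back as non-zero.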


Related issues

Duplicates rbd - Bug #53787: diff-iterate include_parent functionality is broken in fast-diff mode Resolved

History

#1 Updated by Ilya Dryomov 6 months ago

  • Assignee set to Ilya Dryomov

Hi David,

This would be fixed in the upcoming 16.2.8, see https://tracker.ceph.com/issues/53838. Sorry for the mess in this area!

#2 Updated by Ilya Dryomov 6 months ago

While 16.2.8 isn't out yet, the fix is present in the Quincy release candidate (17.1.0). It would be great if you could install it on the client (perhaps in a throwaway container/VM; upgrading the entire cluster isn't necessary!) and verify that the issue is taken care of.

#3 Updated by Ilya Dryomov 5 months ago

  • Status changed from New to Duplicate

Hi David,

16.2.8 was released earlier today. Please give it a try and reopen if the issue persists.

#4 Updated by Ilya Dryomov 5 months ago

  • Duplicates Bug #53787: diff-iterate include_parent functionality is broken in fast-diff mode added
