Bug #54970
rbd diff of a clone misses blocks used by the image, referenced to the parent
Description
Hi,
Ceph image backups are inconsistent because `rbd diff` reports blocks as unallocated when they are referenced by the parent image. This only happens when the '--whole-object' parameter is passed.
This area of the source code was last updated in the following two discussions:
We have a write-protected base image snapshot that is cloned for VMs. The structure is as follows:
[admin@kvm5i ~]# rbd ls rbd_ssd -l | grep 401
vm-401-disk-0                        40 GiB  rbd_ssd/base-441-disk-0@__base__  2  excl
vm-401-disk-0@b-2022-03-20T11:12:39  40 GiB  rbd_ssd/base-441-disk-0@__base__  2
Reporting allocated blocks on the read-only parent snapshot shows the 4 MiB blocks 0, 1, 2 and 65 as allocated:
[admin@kvm5i ~]# rbd diff rbd_ssd/base-441-disk-0@__base__ | head -n 5
Offset     Length   Type
0          4194304  data
4194304    4194304  data
8388608    2977792  data
272629760  4194304  data
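The block numbers quoted above follow from dividing each extent's offset by the object size. A minimal sketch in Python, assuming 4 MiB objects and the three-column text output of `rbd diff` shown above; the helper name `object_indices` is hypothetical and not part of any Ceph tooling:

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed: 4 MiB objects, as in this report


def object_indices(diff_output, object_size=OBJECT_SIZE):
    """Return the sorted object indices touched by extents in `rbd diff` text output."""
    indices = set()
    for line in diff_output.splitlines():
        fields = line.split()
        # Skip the header row and anything that is not "offset length type".
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        offset, length = int(fields[0]), int(fields[1])
        if length == 0:
            continue
        first = offset // object_size
        last = (offset + length - 1) // object_size
        indices.update(range(first, last + 1))
    return sorted(indices)


# First rows of the parent snapshot output quoted in this report.
parent_diff = """\
Offset     Length   Type
0          4194304  data
4194304    4194304  data
8388608    2977792  data
272629760  4194304  data
"""

print(object_indices(parent_diff))  # [0, 1, 2, 65]
```

An extent that spans several objects contributes every index it overlaps, so the same helper also works on output where lengths exceed one object.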
The result is the same for the VM image and its backup snapshot:
[admin@kvm5i ~]# rbd diff rbd_ssd/vm-401-disk-0 | head -n 5
Offset     Length   Type
0          4194304  data
4194304    4194304  data
8388608    2977792  data
272629760  4194304  data
[admin@kvm5i ~]# rbd diff rbd_ssd/vm-401-disk-0@b-2022-03-20T11:12:39 | head -n 5
Offset     Length   Type
0          4194304  data
4194304    4194304  data
8388608    2977792  data
272629760  4194304  data
All of this is correct. Adding the '--whole-object' switch, however, yields inconsistent results. The output is correct when run against the parent, but not against the VM clone or its snapshot, where the objects at offsets 4194304 and 8388608 are missing:
[admin@kvm5i ~]# rbd diff --whole-object rbd_ssd/base-441-disk-0@__base__ | head -n 5
Offset     Length   Type
0          4194304  data
4194304    4194304  data
8388608    4194304  data
272629760  4194304  data
[admin@kvm5i ~]# rbd diff --whole-object rbd_ssd/vm-401-disk-0 | head -n 5
Offset     Length   Type
0          4194304  data
272629760  4194304  data
276824064  4194304  data
281018368  4194304  data
[admin@kvm5i ~]# rbd diff --whole-object rbd_ssd/vm-401-disk-0@b-2022-03-20T11:12:39 | head -n 5
Offset     Length   Type
0          4194304  data
272629760  4194304  data
276824064  4194304  data
281018368  4194304  data
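To make the discrepancy explicit, the object indices reported for the parent and for the clone can be compared directly. The sketch below is Python and only uses the first rows quoted in this report (the output was truncated with `head -n 5`), so it says nothing about objects further into the image. It assumes 4 MiB objects; the variable names are illustrative.

```python
OBJECT_SIZE = 4 * 1024 * 1024  # assumed: 4 MiB objects

# Offsets from the first rows of `rbd diff --whole-object` quoted in this report.
parent_offsets = [0, 4194304, 8388608, 272629760]
clone_offsets = [0, 272629760, 276824064, 281018368]

parent_objects = {offset // OBJECT_SIZE for offset in parent_offsets}
clone_objects = {offset // OBJECT_SIZE for offset in clone_offsets}

# Objects the clone inherits from the parent but that are absent from the clone's diff.
missing_from_clone = sorted(parent_objects - clone_objects)

print(sorted(parent_objects))  # [0, 1, 2, 65]
print(sorted(clone_objects))   # [0, 65, 66, 67]
print(missing_from_clone)      # [1, 2]
```

Objects 1 and 2 hold data in the parent and were never overwritten in the clone, which is exactly the case a backup based on `--whole-object` would skip.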
We can confirm that the snapshot references data that should be included in the `rbd diff` output:
[admin@kvm5i ~]# rbd map rbd_ssd/vm-401-disk-0@b-2022-03-20T11:12:39
/dev/rbd40
[admin@kvm5i ~]# perl -ne 'use Digest::SHA qw(sha1_base64);BEGIN{$/=\4194304};print sha1_base64($_)."\n"' /dev/rbd40 | head -n 66
+qnB9vP9IyPSqfR+ylHxyZHg8uY
ArU8NpEhyVq8OZocpMma5AECdQk
I7Wld9JtQMqPTU8WLGLmu06rJ4w
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M
ajRKjn3h/rw8ECEAAV2na/0QQ/s
[admin@kvm5i ~]# rbd unmap /dev/rbd40
PS: 'K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M' is the base64-encoded SHA-1 digest of 4 MiB of zeros, so every other digest above marks a block that contains data.
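For reference, the same check can be written in Python. This is a rough equivalent of the Perl one-liner above, not a replacement for it: it reads a device or file in 4 MiB chunks and computes an unpadded base64 SHA-1 digest per chunk, which is the format `sha1_base64` produces. The function names are illustrative, and the `/dev/rbd40` path in the usage comment is simply the device from this report.

```python
import base64
import hashlib

CHUNK_SIZE = 4 * 1024 * 1024  # 4 MiB, matching the object size in this report


def sha1_b64(data):
    """Base64 SHA-1 digest without '=' padding, like Perl's sha1_base64."""
    return base64.b64encode(hashlib.sha1(data).digest()).decode("ascii").rstrip("=")


# Digest of one all-zero chunk, used to recognise unallocated-looking blocks.
ZERO_DIGEST = sha1_b64(bytes(CHUNK_SIZE))


def chunk_digests(path, limit):
    """Return the digests of the first `limit` chunks of the file at `path`."""
    digests = []
    with open(path, "rb") as dev:
        for _ in range(limit):
            chunk = dev.read(CHUNK_SIZE)
            if not chunk:
                break
            digests.append(sha1_b64(chunk))
    return digests


def non_zero_chunks(path, limit):
    """Return indices of chunks whose digest differs from the all-zero digest.

    A short final chunk is reported as non-zero because its digest never
    matches the digest of a full 4 MiB zero chunk.
    """
    return [i for i, d in enumerate(chunk_digests(path, limit)) if d != ZERO_DIGEST]


print(ZERO_DIGEST)  # K8y9LzjxXBPrfVqJ/Z2F9ZXiO8M

# Usage against a mapped snapshot (requires root and a mapped device):
# non_zero_chunks("/dev/rbd40", 66)
```

Going by the digests listed in this report, such a call would be expected to return chunks 0, 1, 2 and 65 for the snapshot, matching the parent's allocated blocks.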
History
#1 Updated by Ilya Dryomov over 1 year ago
- Assignee set to Ilya Dryomov
Hi David,
This would be fixed in the upcoming 16.2.8, see https://tracker.ceph.com/issues/53838. Sorry for the mess in this area!
#2 Updated by Ilya Dryomov over 1 year ago
While 16.2.8 isn't out yet, the fix is present in the Quincy release candidate (17.1.0). It would be great if you could install it on the client (perhaps in a throwaway container/VM, upgrading the entire cluster isn't necessary!) and verify that the issue is taken care of.
#3 Updated by Ilya Dryomov over 1 year ago
- Status changed from New to Duplicate
Hi David,
16.2.8 was released earlier today, please give it a try and reopen if the issue persists.
#4 Updated by Ilya Dryomov over 1 year ago
- Duplicates Bug #53787: diff-iterate include_parent functionality is broken in fast-diff mode added