Bug #56181
closed
mirror snapshot syncing progress reporting (last_copied_object_number) is broken for sparse images
% Done: 0%
Source:
Tags:
Backport: octopus,pacific,quincy
Regression: Yes
Severity: 3 - minor
Reviewed:
Description
$ rbd create --size 2G data/testimg
$ rbd mirror image enable data/testimg snapshot
Mirroring enabled
$ sudo rbd device map data/testimg
$ dd if=/dev/urandom of=/dev/rbd0 bs=4M oflag=direct
dd: error writing '/dev/rbd0': No space left on device
513+0 records in
512+0 records out
2147483648 bytes (2.1 GB, 2.0 GiB) copied, 37.0197 s, 58.0 MB/s
$ sudo rbd device unmap data/testimg
$ rbd mirror image snapshot data/testimg
Snapshot ID: 8
Monitor the last_copied_object_number and complete fields of the newest snapshot on the secondary cluster:
$ rbd snap ls --all --format=json data/testimg | jq '.[-1] | .namespace.last_copied_object_number,.namespace.complete'
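The sampled output in this report is separated by "---" lines, which suggests the command was run repeatedly. A minimal sketch of such a polling loop follows. The function name poll_snapshot_progress, its arguments, and the sampling interval are assumptions made for illustration and are not part of the report. Only the rbd and jq invocation is taken from the command above.

```shell
#!/bin/sh
# Hypothetical helper (not from the report): print the progress fields of the
# newest snapshot a fixed number of times, with "---" after each sample.
# Arguments: image spec, number of samples, seconds to wait between samples.
poll_snapshot_progress() {
    image="$1"
    samples="$2"
    interval="$3"
    i=0
    while [ "$i" -lt "$samples" ]; do
        # Same query as in the report: take the last entry of the snapshot list.
        rbd snap ls --all --format=json "$image" |
            jq '.[-1] | .namespace.last_copied_object_number,.namespace.complete'
        echo '---'
        i=$((i + 1))
        sleep "$interval"
    done
}
```

For example, running `poll_snapshot_progress data/testimg 30 2` on the secondary cluster would take 30 samples, two seconds apart.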
For a fully allocated image, last_copied_object_number increases gradually from 1 to 512 (image size / object size). The snapshot is marked complete when last_copied_object_number reaches 512, as expected:
1 false --- 19 false --- 42 false --- 50 false --- 72 false --- 92 false --- 107 false --- 124 false ---
142 false --- 161 false --- 181 false --- 200 false --- 214 false --- 235 false --- 255 false --- 270 false ---
288 false --- 305 false --- 320 false --- 338 false --- 355 false --- 370 false --- 386 false --- 402 false ---
418 false --- 432 false --- 451 false --- 465 false --- 484 false --- 502 false --- 512 true --- 512 true ---
512 true
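The final value of 512 is the image size divided by the object size. As a quick check, the sketch below computes that count for a 2 GiB image, assuming 4 MiB objects (the object size implied by 2 GiB / 512). The helper name object_count is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: number of objects backing an image of the given size.
# Arguments: image size in bytes, object size in bytes.
object_count() {
    size_bytes="$1"
    object_size_bytes="$2"
    # Round up so that a partially filled last object is still counted.
    echo $(( (size_bytes + object_size_bytes - 1) / object_size_bytes ))
}

# 2 GiB image with 4 MiB objects (assumed object size).
object_count $((2 * 1024 * 1024 * 1024)) $((4 * 1024 * 1024))   # prints 512
```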
For an image with holes in it, last_copied_object_number stops advancing at the first hole. The sync continues, but last_copied_object_number is no longer updated. Eventually the snapshot is marked complete with last_copied_object_number still stuck far below the object count:
$ rbd create --size 2G data/testimg
$ rbd mirror image enable data/testimg snapshot
Mirroring enabled
$ sudo rbd device map data/testimg
$ dd if=/dev/urandom of=/dev/rbd0 bs=4M count=100 oflag=direct
100+0 records in
100+0 records out
419430400 bytes (419 MB, 400 MiB) copied, 15.9774 s, 26.3 MB/s
$ dd if=/dev/urandom of=/dev/rbd0 bs=4M seek=110 oflag=direct
dd: error writing '/dev/rbd0': No space left on device
403+0 records in
402+0 records out
1686110208 bytes (1.7 GB, 1.6 GiB) copied, 65.5053 s, 25.7 MB/s
$ sudo rbd device unmap data/testimg
$ rbd mirror image snapshot data/testimg
Snapshot ID: 16
1 false --- 5 false --- 29 false --- 40 false --- 51 false --- 64 false --- 73 false --- 85 false ---
87 false --- 99 false --- 99 false --- 99 false --- [...] --- 99 false --- 99 false --- 99 false ---
100 true --- 100 true --- 100 true
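To relate the stuck value to the image layout, the sketch below works out which blocks the two dd runs leave unwritten. It assumes the dd block size (4M) equals the object size and that blocks align with objects, so block indices can be read as zero-based object indices. The helper name hole_range is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print the first and last zero-based block index left
# unwritten between two dd runs that use the same block size.
hole_range() {
    first_count="$1"   # count= of the first dd run (blocks written from offset 0)
    second_seek="$2"   # seek= of the second dd run (blocks skipped before writing)
    if [ "$second_seek" -le "$first_count" ]; then
        echo "none"
        return 0
    fi
    echo "$first_count $((second_seek - 1))"
}

hole_range 100 110   # prints: 100 109
```

Under these assumptions the hole covers objects 100 through 109, right after the last written object (99) of the first run, which matches the point where the reported counter stops advancing.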
This can also be observed by monitoring the syncing_percent field in the "rbd mirror image status" output.