Bug #61891
Parent data is not copied up when cloned images are mirrored.
Description
Steps:
Set up 2 Ceph clusters for RBD mirroring.
$ bin/rbd --cluster site-a create -s 8M data/src1
$ bin/rbd --cluster site-a map -t nbd data/src1
$ xfs_io -d -c 'pwrite -S 0xadad -b 4M 0 5M' /dev/nbd0
$ bin/rbd --cluster site-a snap create data/src1@snap1
$ bin/rbd --cluster site-a snap protect data/src1@snap1
$ bin/rbd --cluster site-a clone data/src1@snap1 data/dst1
$ bin/rbd --cluster site-a unmap -t nbd data/src1
$ bin/rbd --cluster site-a map -t nbd data/dst1
#Enable mirroring on both parent and child images
$ bin/rbd --cluster site-a mirror image enable data/src1 snapshot
$ bin/rbd --cluster site-a mirror image enable data/dst1 snapshot
#Wait until synced
$ xfs_io -d -c 'pwrite -S 0x11 -b 4M 7M 1M' /dev/nbd0
$ xfs_io -d -c 'pread -v 4M 512' /dev/nbd0
00400000: ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ................
00400010: ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ................
00400020: ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ................
00400030: ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ................
00400040: ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ad ................
...
$ bin/rbd --cluster site-a mirror image snapshot data/dst1
$ bin/rbd --cluster site-a unmap -t nbd data/dst1
$ bin/rbd --cluster site-a mirror image demote data/dst1
$ bin/rbd --cluster site-b mirror image promote data/dst1
$ bin/rbd --cluster site-b map -t nbd data/dst1
/dev/nbd0
#Data written directly to the clone has been copied
$ xfs_io -d -c 'pread -v 7M 512' /dev/nbd0
00700000: 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 ................
00700010: 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 ................
00700020: 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 ................
00700030: 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 ................
00700040: 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 ................
00700050: 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 ................
00700060: 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 11 ................
...
#The parent data has not been copied up to object 1:
$ xfs_io -d -c 'pread -v 4M 512' /dev/nbd0
00400000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00400010: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00400020: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00400030: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00400040: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00400050: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
00400060: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
...
Updated by Ilya Dryomov 10 months ago
- Status changed from New to In Progress
- Assignee set to Nithya Balachandran
- Priority changed from Normal to Urgent
Updated by Nithya Balachandran 10 months ago
RCA:
As RBD mirroring performs incremental deep copies, only a subset of the snap_ids is sent to the ObjectListSnapsRequest<I>::handle_list_snaps() call. The changes to the objects made before the snapshots not in the incremental set are not included in the snapshot_delta. The parent data is therefore not part of the snapshot_delta.
ObjectCopyRequest does not use the librbd function calls which would have handled the copyup from the parent; instead, it writes the data to the object using rados ops. The parent data is therefore not copied to the clone object, causing a mismatch in the data.
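The RCA above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not librbd code: objects are plain byte arrays in Python dicts, and the helper names (read_object, librbd_write, raw_write) are hypothetical. It assumes the default 4M object size, so the 7M write from the steps lands at offset 3M of object 1, and it assumes the snapshot_delta holds only that written extent.

```python
MIB = 1024 * 1024
OBJECT_SIZE = 4 * MIB  # assumed default RBD object size


def read_object(clone_objects, parent_objects, objno):
    """Read a clone object; fall through to the parent only if the clone has no object."""
    if objno in clone_objects:
        return bytes(clone_objects[objno])
    return bytes(parent_objects.get(objno, bytes(OBJECT_SIZE)))


def librbd_write(clone_objects, parent_objects, objno, offset, data):
    """Model of a write through librbd: copy up the parent object first if needed."""
    if objno not in clone_objects:
        clone_objects[objno] = bytearray(parent_objects.get(objno, bytes(OBJECT_SIZE)))
    clone_objects[objno][offset:offset + len(data)] = data


def raw_write(clone_objects, objno, offset, data):
    """Model of a direct object write with no copyup (the behavior described in the RCA)."""
    obj = clone_objects.setdefault(objno, bytearray(OBJECT_SIZE))
    obj[offset:offset + len(data)] = data


# Parent snapshot, identical on both sites: 0xad written to 0..5M.
parent = {0: b"\xad" * OBJECT_SIZE, 1: b"\xad" * MIB + bytes(3 * MIB)}

# Clone objects on each site; the unwritten clone has none after the initial sync.
primary_clone, secondary_clone = {}, {}

# Write 1M of 0x11 at image offset 7M on the primary (object 1, offset 3M).
librbd_write(primary_clone, parent, 1, 3 * MIB, b"\x11" * MIB)

# Assumed incremental delta: only the extent written since the last mirror snapshot.
snapshot_delta = [(1, 3 * MIB, b"\x11" * MIB)]
for objno, offset, data in snapshot_delta:
    raw_write(secondary_clone, objno, offset, data)

primary_obj = read_object(primary_clone, parent, 1)
secondary_obj = read_object(secondary_clone, parent, 1)

print("primary   @4M:", primary_obj[:4].hex(), " @7M:", primary_obj[3 * MIB:3 * MIB + 4].hex())
print("secondary @4M:", secondary_obj[:4].hex(), " @7M:", secondary_obj[3 * MIB:3 * MIB + 4].hex())
```

In this model the secondary object now exists, so reads at 4M no longer fall through to the parent and return zeros, while the data at 7M matches on both sites. This mirrors the pread output in the description.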
Updated by Nithya Balachandran 4 months ago
To reproduce this issue, ensure that the clone image is mirrored before it is written to. The issue is not seen if the clone is mirrored only after the write operation is complete.