Bug #12748
Updated by Loïc Dachary over 8 years ago
From mailing list, courtesy of Ilya:

I think I reproduced this on today's master.

Setup, cache mode is writeback:

<pre>
$ ./ceph osd pool create foo 12 12
pool 'foo' created
$ ./ceph osd pool create foo-hot 12 12
pool 'foo-hot' created
$ ./ceph osd tier add foo foo-hot
pool 'foo-hot' is now (or already was) a tier of 'foo'
$ ./ceph osd tier cache-mode foo-hot writeback
set cache-mode for pool 'foo-hot' to writeback
$ ./ceph osd tier set-overlay foo foo-hot
overlay for 'foo' is now (or already was) 'foo-hot'
</pre>

Create an image:

<pre>
$ ./rbd create --size 10M --image-format 2 foo/bar
$ sudo ./rbd-fuse -p foo -c $PWD/ceph.conf /mnt
$ sudo mkfs.ext4 /mnt/bar
$ sudo umount /mnt
</pre>

Create a snapshot, take md5sum:

<pre>
$ ./rbd snap create foo/bar@snap
$ ./rbd export foo/bar /tmp/foo-1
Exporting image: 100% complete...done.
$ ./rbd export foo/bar@snap /tmp/snap-1
Exporting image: 100% complete...done.
$ md5sum /tmp/foo-1
83f5d244bb65eb19eddce0dc94bf6dda  /tmp/foo-1
$ md5sum /tmp/snap-1
83f5d244bb65eb19eddce0dc94bf6dda  /tmp/snap-1
</pre>

Set the cache mode to forward and do a flush, hashes don't match - the snap is empty - we bang on the hot tier and don't get redirected to the cold tier, I suspect:

<pre>
$ ./ceph osd tier cache-mode foo-hot forward
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
set cache-mode for pool 'foo-hot' to forward
$ ./rados -p foo-hot cache-flush-evict-all
rbd_data.100a6b8b4567.0000000000000002
rbd_id.bar
rbd_directory
rbd_header.100a6b8b4567
bar.rbd
rbd_data.100a6b8b4567.0000000000000001
rbd_data.100a6b8b4567.0000000000000000
$ ./rados -p foo-hot cache-flush-evict-all
$ ./rbd export foo/bar /tmp/foo-2
Exporting image: 100% complete...done.
$ ./rbd export foo/bar@snap /tmp/snap-2
Exporting image: 100% complete...done.
$ md5sum /tmp/foo-2
83f5d244bb65eb19eddce0dc94bf6dda  /tmp/foo-2
$ md5sum /tmp/snap-2
f1c9645dbc14efddc7d8a322685f26eb  /tmp/snap-2
$ od /tmp/snap-2
0000000 000000 000000 000000 000000 000000 000000 000000 000000
*
50000000
</pre>

Disable the cache tier and we are back to normal:

<pre>
$ ./ceph osd tier remove-overlay foo
*** DEVELOPER MODE: setting PATH, PYTHONPATH and LD_LIBRARY_PATH ***
there is now (or already was) no overlay for 'foo'
$ ./rbd export foo/bar /tmp/foo-3
Exporting image: 100% complete...done.
$ ./rbd export foo/bar@snap /tmp/snap-3
Exporting image: 100% complete...done.
$ md5sum /tmp/foo-3
83f5d244bb65eb19eddce0dc94bf6dda  /tmp/foo-3
$ md5sum /tmp/snap-3
83f5d244bb65eb19eddce0dc94bf6dda  /tmp/snap-3
</pre>

I first reproduced it with the kernel client, rbd export was just to take it out of the equation.

Also, Igor sort of raised a question in his second message: if, after setting the cache mode to forward and doing a flush, I open an image (not a snapshot, so may not be related to the above) for write (e.g. with rbd-fuse), I get an rbd header object in the hot pool, even though it's in forward mode:

<pre>
$ sudo ./rbd-fuse -p foo -c $PWD/ceph.conf /mnt
$ sudo mount /mnt/bar /media
$ sudo umount /media
$ sudo umount /mnt
$ ./rados -p foo-hot ls
rbd_header.100a6b8b4567
$ ./rados -p foo ls | grep rbd_header
rbd_header.100a6b8b4567
</pre>

It's been a while since I looked into tiering, is that how it's supposed to work? It looks like it happens because rbd_header op replies don't redirect?

Thanks,

Ilya

osd logs generated from using these instructions suggest that the initial bug is that find_object_context does not fill in pmissing for the snap read since it has a cached snapset and objectcontext. The snapset should be ignored in this case.
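The reproduction steps repeat the same export-and-compare check at each stage (writeback, forward after flush, overlay removed). When re-running them against a candidate fix, that check could be wrapped in a small helper. This is only a sketch: `same_md5` is a hypothetical name, not part of Ceph, and it assumes the image and snapshot have already been exported to local files with `rbd export` as shown in the report.

```shell
#!/bin/sh
# Hypothetical helper (not part of Ceph): compare the md5 checksums of
# two local files, e.g. an exported image and an exported snapshot.
# Returns 0 if they match, 1 if they differ, 2 if a file is unreadable.
same_md5() {
    [ -r "$1" ] && [ -r "$2" ] || return 2
    a=$(md5sum "$1" | cut -d' ' -f1)
    b=$(md5sum "$2" | cut -d' ' -f1)
    [ "$a" = "$b" ]
}

# Example use after the forward-mode flush step from the report:
#   same_md5 /tmp/foo-2 /tmp/snap-2 || echo "snap export differs from image export"
```

With the checksums reported for `/tmp/foo-2` and `/tmp/snap-2`, the helper would return non-zero at the forward-mode stage and zero at the other two.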