Bug #62438
object-map rebuild might cause rbd data corruption
Description
The following data corruption scenario is 100% reproducible on my vstart cluster (built from the main branch). I have also seen reports of the same issue on real 17.2.5 clusters.
1) ceph osd pool create rbd 64 64
   rbd create --image-feature layering,exclusive-lock -s 10G rbd/corruption
   rbd map rbd/corruption
   mkfs.ext4 /dev/rbd0
   rbd feature enable rbd/corruption object-map,fast-diff
2) rbd object-map rebuild rbd/corruption
   mount /dev/rbd0 /mnt/rbd0
   dd if=/dev/urandom of=/mnt/rbd0/random bs=4k count=1000000
   umount /mnt/rbd0
   rbd unmap /dev/rbd0
3) rbd map rbd/corruption
   e2fsck -n -f /dev/rbd0
   rbd unmap /dev/rbd0
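For convenience, the three steps above can be sketched as a single reproducer script. This is an illustrative consolidation, not part of the original report: the pool/image names mirror the commands above, the mapped device is captured from `rbd map` instead of being hardcoded to /dev/rbd0, and the destructive body is gated behind an opt-in environment variable so it only runs when pointed at a disposable test cluster.

```shell
#!/usr/bin/env bash
# Sketch of the reproducer from steps 1-3 above.
# DESTRUCTIVE: creates a pool, formats an image, writes data. Test clusters only.
set -euo pipefail

POOL=rbd          # pool name, as used in the report
IMAGE=corruption  # image name, as used in the report
MNT=/mnt/rbd0

repro() {
    # Step 1: create the image without object-map, format it, then enable
    # object-map and fast-diff after the fact.
    ceph osd pool create "$POOL" 64 64
    rbd create --image-feature layering,exclusive-lock -s 10G "$POOL/$IMAGE"
    DEV=$(rbd map "$POOL/$IMAGE")
    mkfs.ext4 "$DEV"
    rbd feature enable "$POOL/$IMAGE" object-map,fast-diff

    # Step 2: rebuild the object map, then push writes through ext4.
    rbd object-map rebuild "$POOL/$IMAGE"
    mkdir -p "$MNT"
    mount "$DEV" "$MNT"
    dd if=/dev/urandom of="$MNT/random" bs=4k count=1000000
    umount "$MNT"
    rbd unmap "$DEV"

    # Step 3: remap and fsck read-only; reported errors indicate corruption.
    DEV=$(rbd map "$POOL/$IMAGE")
    e2fsck -n -f "$DEV"
    rbd unmap "$DEV"
}

# Opt-in guard: only run when RUN_REPRO=1 and the rbd CLI is available.
if [ "${RUN_REPRO:-0}" = "1" ] && command -v rbd >/dev/null 2>&1; then
    repro
fi
```

With the guard in place the script is safe to source or lint; run it as `RUN_REPRO=1 ./repro.sh` against a throwaway vstart cluster.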
As a result, e2fsck in step 3 reports errors:
e2fsck 1.46.5 (30-Dec-2021)
Pass 1: Checking inodes, blocks, and sizes
Inode 12 has an invalid extent node (blk 33795, lblk 0)
Clear? no
Inode 12 extent tree (at level 1) could be shorter. Optimize? no
Inode 12, i_blocks is 8000008, should be 0. Fix? no
Pass 2: Checking directory structure
Pass 3: Checking directory connectivity
Pass 4: Checking reference counts
Pass 5: Checking group summary information
Block bitmap differences: -(9264--31791) -33795 -(34816--98303) -(100352--163839) -(165888--229375) -(231424--294911) -(296960--524287) -(532512--555039) -(557056--819199) -(821248--884735) -(886784--969279) -(2555904--2621439)
Fix? no
/dev/rbd0: *** WARNING: Filesystem still has errors ***