Support #23455
Closed
osd: large number of inconsistent objects after recover or backfilling
Description
A large number of inconsistent objects appear after recovery or backfilling.
Steps to reproduce:
1) Create an RBD volume and map it to a client. Use fio to write data to the volume with the job file below:
[job]
ioengine=libaio
direct=1
filename=/dev/rbd0
rw=randwrite
norandommap
bs=8k
iodepth=2
random_distribution=zipf:1.2
time_based
runtime=360000
2) Stop several OSDs.
3) Create snapshots of the volume.
4) Start the OSDs again.
5) Finally, run a deep-scrub; the inconsistencies then show up.
Steps 2 to 5 look like this:
host=$(( $i % 2 + 1))
ssh jewel-ceph-${host} "systemctl stop ceph-osd.target"
echo "stop osd on jewel-ceph-${host}"
sleep 60
rbd snap create nyao@${i}
sleep 60
rbd snap create nyao@s${i}
sleep 60
rbd snap create nyao@ss${i}
sleep 60
rbd snap create nyao@sss${i}
echo "create snap nyao@${i}"
sleep 120
ssh jewel-ceph-${host} "systemctl start ceph-osd.target"
echo "start osd on jewel-ceph-${host}"
for j in {0..5}; do
    ceph osd deep-scrub osd.${j}
done
echo "scrub all osd"
sleep 3600
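Once the deep-scrub has finished, the damaged placement groups and objects can be listed with the `rados` tool. The following is only a minimal sketch and not part of the original reproduction: the function names are made up for illustration, and the pool name `rbd` in the example is an assumption (the report does not say which pool holds the `nyao` image).

```shell
#!/bin/sh
# Sketch: print every inconsistent object in a pool after a deep-scrub.
# Assumes the `rados` CLI (Jewel or later) is available on PATH.

# Split the JSON array printed by `rados list-inconsistent-pg`
# (for example ["0.6","0.1f"]) into one PG id per line.
split_pg_list() {
    sed 's/[][" ]//g' | tr ',' '\n' | sed '/^$/d'
}

# list_inconsistent POOL
# For each inconsistent PG in POOL, dump its inconsistent objects.
list_inconsistent() {
    pool=$1
    rados list-inconsistent-pg "$pool" | split_pg_list | while read -r pg; do
        echo "== pg ${pg} =="
        rados list-inconsistent-obj "$pg" --format=json-pretty
    done
}

# Example (the pool name "rbd" is an assumption):
#   list_inconsistent rbd
```

Comparing this output between runs makes it easier to tell whether a configuration change actually removes the inconsistencies or only reduces their number.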
It seems that if I disable the fiemap feature, inconsistent objects no longer appear.
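To check whether FIEMAP is enabled on an OSD, the admin socket can be queried. This is a hedged sketch: it assumes that "fiemap" here refers to the FileStore option `filestore_fiemap`, which the report does not state explicitly, and the function name `show_fiemap` and the OSD ids in the example are placeholders.

```shell
#!/bin/sh
# Sketch: report the FileStore FIEMAP setting of local OSDs.
# ASSUMPTION: the "fiemap feature" in the report is the FileStore
# option filestore_fiemap; the report does not name the option.
# Run this on the OSD host itself, because `ceph daemon` talks to
# the local admin socket.

# show_fiemap OSD_ID...
show_fiemap() {
    for id in "$@"; do
        printf 'osd.%s: ' "$id"
        ceph daemon "osd.${id}" config get filestore_fiemap
    done
}

# Example (OSD ids are placeholders):
#   show_fiemap 0 1 2
#
# To turn the option off persistently, one would typically add
#   filestore fiemap = false
# to the [osd] section of ceph.conf and restart the OSDs.
```

Checking the value on every OSD host before and after the change helps confirm that the test without inconsistencies really ran with FIEMAP disabled everywhere.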