Support #23455

closed

osd: large number of inconsistent objects after recover or backfilling

Added by Yao Ning about 6 years ago. Updated about 6 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Tags:
Reviewed:
Affected Versions:
Component(RADOS):
OSD
Pull request ID:

Description

A large number of inconsistent objects appear after recovery or backfilling.

Steps to reproduce:

1) Create an RBD volume and map it to a client. Use fio to write data to the volume with the job file below:

[job]
ioengine=libaio
direct=1
filename=/dev/rbd0
rw=randwrite
norandommap
bs=8k
iodepth=2
random_distribution=zipf:1.2
time_based
runtime=360000

2) Stop several OSDs.
3) Create a snapshot of the volume.
4) Start the OSDs again.
5) Finally, run a deep-scrub; the inconsistent objects are then reported.
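Step 1) above can be sketched in shell roughly as follows. This is only an illustration under assumptions: the image name `nyao` is taken from the snapshot commands in this report, while the image size, the job file name `job.fio`, and the use of the default pool are hypothetical. The fio job expects the mapped device to be `/dev/rbd0`. By default the sketch only prints the commands (`DRY_RUN=1`).

```shell
#!/bin/sh
# Sketch of step 1: create an RBD image, map it, and start the fio job.
# Assumptions (not stated in the report): image size, job file name,
# and that the image is created in the default pool.
IMAGE="${IMAGE:-nyao}"           # image name as used by the snapshot commands
SIZE_MB="${SIZE_MB:-10240}"      # hypothetical image size in MB
JOB_FILE="${JOB_FILE:-job.fio}"  # hypothetical file holding the [job] section
DRY_RUN="${DRY_RUN:-1}"          # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

prepare_volume() {
    run rbd create "$IMAGE" --size "$SIZE_MB"
    run rbd map "$IMAGE"
    run fio "$JOB_FILE"
}
```

Calling `prepare_volume` prints the three commands; setting `DRY_RUN=0` first would execute them against a real cluster.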

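Steps 2) to 5) were run with a script like the one below, where ${i} is presumably the counter of an outer loop that is not shown here: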
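# alternate between the two hosts jewel-ceph-1 and jewel-ceph-2
host=$(( $i % 2 + 1 ))

# stop all OSDs on the selected host
ssh jewel-ceph-${host} "systemctl stop ceph-osd.target"
echo "stop osd on jewel-ceph-${host}"
sleep 60

# create four snapshots while the OSDs are down
rbd snap create nyao@${i}
sleep 60
rbd snap create nyao@s${i}
sleep 60
rbd snap create nyao@ss${i}
sleep 60
rbd snap create nyao@sss${i}
echo "create snap nyao@${i}"

# bring the OSDs back so that recovery or backfilling starts
sleep 120
ssh jewel-ceph-${host} "systemctl start ceph-osd.target"
echo "start osd on jewel-ceph-${host}"

# deep-scrub every OSD, then wait for the scrubs to finish
for j in {0..5};
do
    ceph osd deep-scrub osd.${j}
done
echo "scrub all osd"
sleep 3600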
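Once the deep-scrub has finished, the inconsistent objects can be listed with the `rados list-inconsistent-pg` and `rados list-inconsistent-obj` commands. The sketch below is one possible way to do that, not the procedure used in this report; the pool name `rbd` and the helper names are assumptions. `pgids_from_json` is a simplistic parser that only handles the flat JSON array of PG ids printed by `rados list-inconsistent-pg`.

```shell
#!/bin/sh
# Sketch: list inconsistent objects per placement group after a deep-scrub.
# Assumption: the image lives in a pool named "rbd".
POOL="${POOL:-rbd}"

# Turn a flat JSON array such as ["0.6","0.1f"] into one PG id per line.
pgids_from_json() {
    printf '%s\n' "$1" | sed 's/[][" ]//g' | tr ',' '\n' | sed '/^$/d'
}

# Print the inconsistency report of every inconsistent PG in $POOL.
show_inconsistent() {
    pgs_json="$(rados list-inconsistent-pg "$POOL")"
    for pg in $(pgids_from_json "$pgs_json"); do
        echo "== ${pg} =="
        rados list-inconsistent-obj "$pg" --format=json-pretty
    done
}
```

Running `show_inconsistent` on a node with client access to the cluster prints one JSON report per inconsistent PG; `ceph health detail` gives a shorter summary.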

It seems that if I disable the fiemap feature, the inconsistent objects no longer appear.
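The report does not say which option was changed. As a sketch only, assuming FileStore OSDs and that the relevant option is `filestore_fiemap`, the current value can be checked through the OSD admin socket and the option turned off in ceph.conf. The helper names are hypothetical, and the commands are only printed by default (`DRY_RUN=1`).

```shell
#!/bin/sh
# Sketch: inspect and disable fiemap on FileStore OSDs.
# Assumptions: FileStore backend, option name filestore_fiemap.
DRY_RUN="${DRY_RUN:-1}"   # 1 = print commands only, 0 = execute them

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Show the current setting of each given OSD id.
# Must run on the host where that OSD's admin socket lives.
show_fiemap() {
    for id in "$@"; do
        run ceph daemon "osd.${id}" config get filestore_fiemap
    done
}

# Emit a ceph.conf fragment that turns the option off for all OSDs.
fiemap_off_conf() {
    printf '[osd]\nfilestore fiemap = false\n'
}
```

For example, `show_fiemap 0 1 2` prints the query for three OSDs, and the output of `fiemap_off_conf` can be merged into ceph.conf by hand. The OSDs most likely need a restart before a changed value takes effect.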
