Support #23455

osd: large number of inconsistent objects after recover or backfilling

Added by Yao Ning over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Tags:
Reviewed:
Affected Versions:
Component(RADOS): OSD
Pull request ID:

Description

A large number of inconsistent objects appear after recovery or backfilling.

Steps to reproduce:

1) Create an rbd volume and map it to a client. Use fio to write data to the volume with the job file below:

[job]
ioengine=libaio
direct=1
filename=/dev/rbd0
rw=randwrite
norandommap
bs=8k
iodepth=2
random_distribution=zipf:1.2
time_based
runtime=360000

2) Stop several OSDs.
3) Create snapshots of the volume.
4) Start the OSDs again.
5) Finally, run a deep-scrub; the inconsistencies then show up.

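The steps were scripted as below ($i is presumably the counter of an outer loop that is not shown):

# alternate between the two OSD hosts
host=$(( $i % 2 + 1))

# stop every OSD on the chosen host
ssh jewel-ceph-${host} "systemctl stop ceph-osd.target"
echo "stop osd on jewel-ceph-${host}"
sleep 60

# take four snapshots of image "nyao" while the OSDs are down
rbd snap create nyao@${i}
sleep 60
rbd snap create nyao@s${i}
sleep 60
rbd snap create nyao@ss${i}
sleep 60
rbd snap create nyao@sss${i}
echo "create snap nyao@${i}"

# bring the OSDs back so that recovery/backfill starts
sleep 120
ssh jewel-ceph-${host} "systemctl start ceph-osd.target"
echo "start osd on jewel-ceph-${host}"

# deep-scrub all six OSDs, then give the scrub time to finish
for j in {0..5};
do
    ceph osd deep-scrub osd.${j}
done
echo "scrub all osd"
sleep 3600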
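
Once the deep-scrub has finished, the inconsistent objects can be listed per placement group. The following is only a minimal sketch and not part of the original report. It assumes the image lives in the default "rbd" pool and that rados list-inconsistent-pg prints a JSON array of PG ids. The function name list_inconsistent is hypothetical; the rados subcommands are the stock ones.

```shell
# Sketch: show scrub inconsistencies for every inconsistent PG of a pool.
# Assumption: "rados list-inconsistent-pg <pool>" prints something like ["0.6","0.1f"].
list_inconsistent() {
    pool="$1"
    # Strip the JSON brackets and quotes, turn commas into spaces.
    for pg in $(rados list-inconsistent-pg "$pool" | tr -d '[]"' | tr ',' ' '); do
        echo "== pg ${pg} =="
        # Objects whose replicas disagree.
        rados list-inconsistent-obj "$pg" --format=json-pretty
        # Snapset problems, relevant here because snapshots are involved.
        rados list-inconsistent-snapset "$pg" --format=json-pretty
    done
}
```

It could be called as "list_inconsistent rbd" from any node that has a client keyring. If the pool name differs, pass that name instead.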

It seems that if I disable the fiemap feature, there are no inconsistent objects any more.

History

#1 Updated by Yao Ning over 3 years ago

This seems quite similar to issue http://tracker.ceph.com/issues/21388

#2 Updated by Yao Ning over 3 years ago

v10.2.5 is also affected. It affects only snap objects; no head object is affected.

#3 Updated by Yao Ning over 3 years ago

Head objects are also affected, but only a very small portion of them.

#4 Updated by Yao Ning over 3 years ago

  • Assignee deleted (Sage Weil)
  • Priority changed from Urgent to Normal

#5 Updated by Patrick Donnelly over 3 years ago

  • Project changed from Ceph to RADOS
  • Subject changed from large number of inconsistent objects after recover or backfilling to osd: large number of inconsistent objects after recover or backfilling
  • Category deleted (OSD)
  • Source set to Community (user)
  • Release deleted (jewel)
  • Component(RADOS) OSD added

#6 Updated by Greg Farnum over 3 years ago

  • Tracker changed from Bug to Support
  • Status changed from New to Resolved

fiemap is disabled by default precisely because there are a number of known bugs in the local filesystems across kernels that we can't do anything about. :(
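
To confirm that an OSD is running with the default, its live setting can be read over the admin socket. This is a hedged sketch, not something from the thread: it assumes FileStore OSDs, the option name filestore_fiemap, and that it runs on the host where the OSD's admin socket lives. The helper name check_osd is hypothetical.

```shell
# Sketch: report whether fiemap is enabled on one OSD of the local host.
# Assumption: "ceph daemon osd.N config get filestore_fiemap" prints
# JSON such as { "filestore_fiemap": "false" }.
check_osd() {
    id="$1"
    if ! out="$(ceph daemon "osd.${id}" config get filestore_fiemap 2>/dev/null)"; then
        echo "osd.${id}: admin socket not reachable from this host"
        return 1
    fi
    if printf '%s\n' "$out" | grep -q '"filestore_fiemap": *"true"'; then
        echo "osd.${id}: fiemap enabled"
    else
        echo "osd.${id}: fiemap disabled"
    fi
}
```

For example, "for j in 0 1 2; do check_osd $j; done" on each OSD host. An OSD that reports "enabled" would need "filestore fiemap = false" in its ceph.conf and a restart to go back to the default.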
