Bug #40620
Explicitly requested repair of an inconsistent PG cannot be scheduled in a timely manner on an OSD with ongoing recovery
Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Because osd_scrub_during_recovery=false is the default, an OSD that has any recovering PG will not schedule any new scrub, including an explicitly requested repair. As a result, inconsistent data cannot be fixed promptly, which is a risk to data safety.
The proposal is to introduce a new config option, osd_repair_during_recovery, whose default value is false:
- When osd_scrub_during_recovery is true, osd_repair_during_recovery is ignored (no behavior change).
- When osd_scrub_during_recovery is false and osd_repair_during_recovery is false, there is no behavior change.
- When osd_scrub_during_recovery is false and osd_repair_during_recovery is true, `OSD::sched_scrub()` is allowed to schedule explicitly requested repairs (scrubber.must_repair=true).
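The three cases above can be sketched as a single predicate. This is only an illustration of the proposed decision table, not the actual Ceph code: `ScrubConfig` and `can_schedule_scrub` are hypothetical names, and the sketch assumes that nothing other than recovery (load limits, scrub time windows, and so on) is blocking the scrub.

```cpp
// Hypothetical sketch of the proposed rule; names are illustrative and
// do not correspond to real Ceph types or functions.
struct ScrubConfig {
  bool osd_scrub_during_recovery = false;   // existing option, default false
  bool osd_repair_during_recovery = false;  // proposed option, default false
};

// Returns true if a scrub may be scheduled as far as recovery is concerned.
//   osd_is_recovering: the OSD currently has at least one recovering PG.
//   must_repair:       the PG has an explicitly requested repair
//                      (scrubber.must_repair in the issue text).
inline bool can_schedule_scrub(const ScrubConfig& conf,
                               bool osd_is_recovering,
                               bool must_repair) {
  if (!osd_is_recovering) {
    return true;  // recovery is not a factor
  }
  if (conf.osd_scrub_during_recovery) {
    return true;  // all scrubs allowed; the repair option is ignored
  }
  // Scrubs are blocked during recovery; only an explicitly requested
  // repair may pass, and only when the proposed option is enabled.
  return conf.osd_repair_during_recovery && must_repair;
}
```

With both options at their defaults the predicate returns false for any PG on a recovering OSD, which matches current behavior. Only the combination of osd_repair_during_recovery=true and must_repair=true changes the outcome.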