Feature #11017
Improve scrubbing throughput
Status:
New
Priority:
Normal
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%
Source:
other
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:
Description
Every OSD has a configurable number of scrubbing slots; ideally each slot should be occupied by an active scrub whenever there are PGs waiting in the scrub queue. In our cluster, we found that around half of the slots are idle even though there are PGs queued for scrubbing. More details:
We have 540 OSDs, and the data pool is EC with 8 + 3 = 11 shards per PG. Ideally we should see around 540/11 ≈ 49 active scrubs, however, our monitoring shows at most 20, even when there are PGs in the queue.
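The expected number above can be derived with a back-of-the-envelope calculation, assuming each scrub of an EC k+m PG occupies one scrub slot on every one of its k+m shard OSDs (the function name and signature here are illustrative, not Ceph code):

```python
def max_concurrent_scrubs(num_osds, ec_k, ec_m, osd_max_scrubs=1):
    """Upper bound on simultaneous PG scrubs for a single EC pool,
    assuming one scrub consumes a slot on each of its k+m shard OSDs."""
    shards_per_pg = ec_k + ec_m                 # OSDs touched by one PG scrub
    total_slots = num_osds * osd_max_scrubs     # cluster-wide scrub slots
    return total_slots // shards_per_pg

# The cluster described above: 540 OSDs, EC 8+3, osd_max_scrubs = 1.
print(max_concurrent_scrubs(540, 8, 3))  # → 49, versus the ~20 observed
```

With osd_max_scrubs raised to 2 the same estimate doubles to 98, which is why the gap between the theoretical bound and the observed 20 points at scheduling rather than slot count.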
Configuration:
"osd_max_scrubs": "1"
-bash-4.1$ sudo ceph -s
    cluster 035b9c00-3fd0-4123-a92f-778ce59a426e
     health HEALTH_OK
     monmap e2: 3 mons at {mon01c003=10.214.146.208:6789/0,mon02c003=10.214.147.130:6789/0,mon03c003=10.214.147.80:6789/0}, election epoch 48, quorum 0,1,2 mon01c003,mon03c003,mon02c003
     osdmap e5568: 540 osds: 540 up, 540 in
      pgmap v10510804: 11424 pgs, 9 pools, 1429 TB data, 853 Mobjects
            2057 TB used, 883 TB / 2941 TB avail
                      20 active+clean+scrubbing+deep
                   11404 active+clean
I think it makes sense to improve the scrub scheduling to maximize throughput (in terms of the number of concurrent active scrubs).