Feature #11017: Improve scrubbing throughput - Ceph - Ceph

Feature #11017

open

Improve scrubbing throughput

Added by Guang Yang about 9 years ago. Updated about 9 years ago.

Status:

New

Priority:

Normal

Assignee:

Category:

OSD

Target version:

% Done:

Source:

other

Tags:

Backport:

Reviewed:

Affected Versions:

Pull request ID:

Description

Every OSD has a configurable number of scrub slots (osd_max_scrubs). Ideally, every slot should be occupied by an active scrub whenever there are PGs queued for scrubbing. In our cluster, we found that around half of the slots are idle even though there are PGs in the queue. More details:

We have 540 OSDs, and the data pool is EC with 8 + 3 = 11 chunks per PG, so each scrub occupies a slot on 11 OSDs. Ideally we should have 540/11 ~ 49 active scrubs; however, we observed at most 20 during a period of monitoring, even though there are PGs in the queue.

Configuration:
"osd_max_scrubs": "1"

-bash-4.1$ sudo ceph -s
    cluster 035b9c00-3fd0-4123-a92f-778ce59a426e
     health HEALTH_OK
     monmap e2: 3 mons at {mon01c003=10.214.146.208:6789/0,mon02c003=10.214.147.130:6789/0,mon03c003=10.214.147.80:6789/0}, election epoch 48, quorum 0,1,2 mon01c003,mon03c003,mon02c003
     osdmap e5568: 540 osds: 540 up, 540 in
      pgmap v10510804: 11424 pgs, 9 pools, 1429 TB data, 853 Mobjects
            2057 TB used, 883 TB / 2941 TB avail
                  20 active+clean+scrubbing+deep
               11404 active+clean
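
The arithmetic behind the numbers above can be sketched as follows. This is only a rough estimate: it assumes all 20 running scrubs belong to the 11-wide EC pool (the cluster has 9 pools, so that may not hold exactly), and the function name is made up for illustration.

```python
def max_concurrent_scrubs(num_osds, pg_width, osd_max_scrubs=1):
    """Upper bound on simultaneous scrubs when every scrub needs one
    slot on each of the pg_width OSDs of its PG."""
    return (num_osds * osd_max_scrubs) // pg_width


NUM_OSDS = 540       # from "osdmap e5568: 540 osds"
PG_WIDTH = 8 + 3     # EC 8 + 3, i.e. 11 OSDs per PG
OBSERVED = 20        # "20 active+clean+scrubbing+deep"

upper_bound = max_concurrent_scrubs(NUM_OSDS, PG_WIDTH)

# Assumption: every observed scrub is on the EC pool and so holds 11 slots.
busy_slots = OBSERVED * PG_WIDTH
busy_fraction = busy_slots / NUM_OSDS

print(f"upper bound on concurrent scrubs: {upper_bound}")
print(f"busy slots: {busy_slots} of {NUM_OSDS} ({busy_fraction:.0%})")
```

Under these assumptions, 220 of the 540 slots are in use, which is consistent with roughly half of the slots sitting idle.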

I think it makes sense to schedule scrubs so as to maximize the throughput (in terms of the number of active scrubs).

Related issues 1 (1 open — 0 closed)

Updated by Xinze Chi about 9 years ago

Could it be that, for some of the PGs, the time since the last scrub is smaller than osd_scrub_min_interval, so the scrub is rejected?

Updated by Guang Yang about 9 years ago

Xinze Chi wrote:

Could it be that, for some of the PGs, the time since the last scrub is smaller than osd_scrub_min_interval, so the scrub is rejected?

No. I think it is because there is no queued PG for which all of its OSDs have a free slot. For example, there is no scrub running on osd.X; however, when it tries to kick off a scrub, all of its peers' slots have already been taken.
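
The effect described above can be illustrated with a toy model. This is a simplified sketch, not the actual OSD reservation code: it assumes a scrub starts only if every OSD of the PG has a free slot, and that PGs are tried in some fixed order. The function name and the random placement at the end are hypothetical and do not reflect the real CRUSH mapping or the real number of PGs in the data pool.

```python
import random


def greedy_scrub_schedule(pgs, order, osd_max_scrubs=1):
    """Try PGs in the given order; start a scrub only if every OSD
    of the PG still has a free scrub slot. Returns the started PG ids."""
    used = {}      # osd id -> number of slots currently taken
    started = []
    for pg_id in order:
        osds = pgs[pg_id]
        if all(used.get(osd, 0) < osd_max_scrubs for osd in osds):
            for osd in osds:
                used[osd] = used.get(osd, 0) + 1
            started.append(pg_id)
    return started


# Tiny example: 4 OSDs, 3 PGs of width 2.
pgs = [[0, 1], [1, 2], [2, 3]]

# If the "middle" PG grabs its slots first, osd.0 and osd.3 stay idle:
# each has a free slot, but its only peer is already busy.
print(greedy_scrub_schedule(pgs, order=[1, 0, 2]))   # [1]

# A better order keeps all four OSDs busy.
print(greedy_scrub_schedule(pgs, order=[0, 1, 2]))   # [0, 2]

# Larger run with made-up random placement (540 OSDs, width 11).
rng = random.Random(0)
big = [rng.sample(range(540), 11) for _ in range(2048)]
order = list(range(len(big)))
rng.shuffle(order)
started = greedy_scrub_schedule(big, order)
print(f"{len(started)} scrubs started, upper bound {540 // 11}")
```

In this model, a first-come-first-served order typically stops well short of the 540/11 upper bound, because the leftover free slots are scattered across OSDs that share no fully free PG. A scheduler that takes slot availability into account when choosing which PG to scrub next could pack more scrubs in.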
