Support #64378

open

Slow / Single backfilling on Reef (18.2.1-pve2)

Added by Pivert Dubuisson 3 months ago. Updated about 1 month ago.

Status: New
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Tags:
Reviewed:
Affected Versions:
Pull request ID:

Description

Hi,

Despite setting:

ceph tell 'osd.*' injectargs '--osd-max-backfills 16'
ceph tell 'osd.*' injectargs '--osd-recovery-max-active 4'

I reweighted several OSDs, so I expect several backfills to run concurrently on at least 2 OSDs.
However, I'm still stuck with at most one backfill at a time.
Why?

The setup is 3 nodes with 3 fast SSDs (WD_Black SN850X) on a full-mesh, full-duplex 1GB network (with frr). Recovery typically runs between 5 and 20 MB/s, probably because it is limited to a single PG backfill at a time.

How can I parallelize backfilling on Reef?
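
One thing that may be worth checking, offered as a sketch rather than a confirmed diagnosis: since Quincy the default OSD op queue is the mClock scheduler. With mClock the OSD can manage the backfill and recovery limits itself, so injected values for osd_max_backfills and osd_recovery_max_active may not take effect. The helper below (show_recovery_limits is a hypothetical name, not part of Ceph) prints the values an OSD is actually running with. The commented-out commands are an assumption about what might help on this cluster and have not been tested against it.

```shell
# Print the recovery-related values a running OSD reports,
# as opposed to the values that were injected.
show_recovery_limits() {
    osd="$1"   # daemon name, e.g. osd.0
    for opt in osd_op_queue osd_mclock_profile \
               osd_max_backfills osd_recovery_max_active; do
        printf '%s %s = %s\n' "$osd" "$opt" "$(ceph config show "$osd" "$opt")"
    done
}

# Assumption: if osd_op_queue reports mclock_scheduler, manual limits may
# only be honoured after enabling the override below. Untested on this setup.
# ceph config set osd osd_mclock_override_recovery_settings true
# ceph config set osd osd_max_backfills 16
# ceph config set osd osd_recovery_max_active 4
```

Usage would be something like `show_recovery_limits osd.0`, repeated for each OSD that should be backfilling.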

root@pve1:~# ceph status
  cluster:
    id:     e7628d51-32b5-4f5c-8eec-1cafb41ead74
    health: HEALTH_WARN
            Degraded data redundancy: 4510616/37577132 objects degraded (12.004%), 39 pgs degraded, 42 pgs undersized
            101 pgs not deep-scrubbed in time
            77 pgs not scrubbed in time

  services:
    mon: 3 daemons, quorum pve3,pve2,pve1 (age 16h)
    mgr: pve1(active, since 19h), standbys: pve3, pve2
    mds: 1/1 daemons up, 2 standby
    osd: 5 osds: 4 up (since 3h), 3 in (since 3h); 64 remapped pgs

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 179 pgs
    objects: 12.55M objects, 1.2 TiB
    usage:   3.1 TiB used, 4.2 TiB / 7.3 TiB avail
    pgs:     4510616/37577132 objects degraded (12.004%)
             6561953/37577132 objects misplaced (17.463%)
             115 active+clean
             38  active+undersized+degraded+remapped+backfill_wait
             21  active+remapped+backfill_wait
             3   active+undersized+remapped+backfill_wait
             1   active+clean+remapped
             1   active+undersized+degraded+remapped+backfilling

  io:
    client:   6.6 KiB/s rd, 2.6 MiB/s wr, 10 op/s rd, 241 op/s wr
    recovery: 14 MiB/s, 3 objects/s
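
To track whether backfills start running in parallel after a settings change, the PG state lines of the status output above can be summed. This is a minimal sketch that assumes the plain-text `ceph status` layout shown in this report; count_backfill_states is a hypothetical helper name.

```shell
# Read `ceph status` text on stdin and sum PG counts by backfill state.
count_backfill_states() {
    awk '
        # PG state lines look like "<count> <state+state+...>".
        $1 ~ /^[0-9]+$/ && $2 ~ /backfill/ {
            if ($2 ~ /backfill_wait/) waiting += $1
            else if ($2 ~ /backfilling/) running += $1
        }
        END { printf "backfilling=%d backfill_wait=%d\n", running, waiting }
    '
}
```

Run as `ceph status | count_backfill_states`. For the output above it reports 1 PG backfilling and 62 waiting.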
