Feature #8195

open

shorten window of highest risk during recovery

Added by Alexandre Oliva about 10 years ago. Updated about 10 years ago.

Status: New
Priority: Normal
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

Say a PG of size 3 lost two OSDs, the second one failing while the replacement for the first was part-way through recovery, like:

1st: 0123456789
2nd: 012_______
3rd: __

AFAICT, backfilling of 3rd will start at 0 and run on its own until it catches up with backfilling of 2nd, and then both will proceed concurrently.

This means objects 3 to 9 remain longer without a second replica, while objects 0 to 2, that already have two replicas in the cluster, are further replicated.

I believe it would be wiser to start backfilling 3rd (along with 2nd) at 3 all the way to 9, and then, once 2nd is done, backfilling of 3rd wraps around and finishes 0 to 2.
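The difference between the two orderings can be illustrated with a small Python sketch. This is not Ceph code: the names `wraparound_order` and `second_replica_step` are hypothetical, and the model assumes that one object is backfilled per step and that every OSD lacking that object receives it in the same step.

```python
def wraparound_order(num_objects, start):
    """Object indices start..num_objects-1, followed by 0..start-1."""
    return [(start + i) % num_objects for i in range(num_objects)]


def second_replica_step(order, already_two):
    """Map each at-risk object to the 1-based step at which it gains
    a second replica, given the order in which objects are visited.

    Assumption (not from Ceph): one object per step, copied to every
    backfill target that lacks it.
    """
    return {
        obj: step
        for step, obj in enumerate(order, start=1)
        if obj not in already_two
    }


NUM_OBJECTS = 10
already_two = {0, 1, 2}  # objects 0 to 2 already exist on 1st and 2nd

# Order as described above: 3rd starts at 0, joint backfill begins at 3.
linear = list(range(NUM_OBJECTS))
# Proposed order: start joint backfill at 3, wrap around to 0..2 at the end.
proposed = wraparound_order(NUM_OBJECTS, start=3)

linear_steps = second_replica_step(linear, already_two)
proposed_steps = second_replica_step(proposed, already_two)

for obj in sorted(linear_steps):
    print(obj, linear_steps[obj], proposed_steps[obj])
```

Under these assumptions, every object from 3 to 9 gets its second replica three steps earlier with the wrap-around order, which is exactly the time otherwise spent copying 0 to 2 to 3rd first.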

OSDs might fail and come back during multi-OSD backfilling. Maintaining per-OSD info on recovery windows to start/skip backfilling might make sense, but a much simpler strategy would amount to resetting the backfill start/end point to the current backfill point every time an OSD needing backfilling joins the PG. This might go over already-backfilled portions of some OSDs more than once, until all OSDs remain up throughout a complete cycle. For example, suppose 2nd fails after the joint backfill of objects 3 and 4, and rejoins when 3rd has already got objects 5 and 6:

1st: 0123456789
2nd: 01234_____
3rd: 3456

In this simplified backfilling proposal, we'd set the begin/end point between objects 6 and 7, so that joint backfilling starts at 7, and both 2nd and 3rd will be regarded as fully-backfilled if both remain up after iterating over 7 to 9 and then 0 to 6. Objects that are already up-to-date (say 0 to 4 in 2nd and 3 to 6 in 3rd) will be quickly skipped, in the same way they are when backfilling an OSD rolled back to an old snapshot.
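A minimal Python sketch of one such cycle for the state above may make the skipping behaviour concrete. The function `plan_cycle` and the `have` mapping are hypothetical and only model the proposal; they do not correspond to any existing Ceph interface.

```python
def plan_cycle(num_objects, restart_at, have):
    """Walk one full cycle starting at `restart_at` and wrapping around.

    `have` maps each backfill target to the set of objects it already
    holds up to date. Returns a list of (object, targets_needing_copy);
    an empty target list means the object is skipped for every target.
    """
    plan = []
    for i in range(num_objects):
        obj = (restart_at + i) % num_objects
        needs = sorted(name for name, objs in have.items() if obj not in objs)
        plan.append((obj, needs))
    return plan


# State when 2nd rejoins: 2nd holds 0 to 4, 3rd holds 3 to 6.
have = {
    "2nd": {0, 1, 2, 3, 4},
    "3rd": {3, 4, 5, 6},
}

# The begin/end point is reset between 6 and 7, so the cycle starts at 7.
plan = plan_cycle(10, restart_at=7, have=have)

for obj, needs in plan:
    print(obj, needs if needs else "skip")
```

In this model, objects 7 to 9 go to both OSDs, 0 to 2 go only to 3rd, 3 and 4 are skipped entirely, and 5 and 6 go only to 2nd. If both OSDs stay up for the whole cycle, both end up fully backfilled.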
