Fix #6262

open

toofull osd prevents backfilling of other pg replicas

Added by Alexandre Oliva over 10 years ago. Updated over 5 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Say a pg is to be 4-way replicated across osds [0,1,2,3].

AFAICT, if any of osds 0, 1 or 2 hits the toofull threshold before its own backfill completes, the pg remains stuck in backfill_toofull. It would be better to set the full osd aside and start backfilling the remaining osds, which would enable further progress and protect data integrity should any of the osds holding the pg fail.

AFAICT, the only way to avoid waiting for the toofull osd to free up space and complete its own backfill before the subsequent osds in the replication set are backfilled is to bring the toofull osd down. That prevents it from participating in recovery of other pgs, and even from freeing up space as it becomes available. Plus, if it stays down long enough to be marked out, it triggers additional recovery, which slows things down and fills osds up further.
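
To make the request concrete, here is a minimal Python sketch contrasting the behaviour as this report describes it with the behaviour being asked for. It is a toy model, not Ceph code: the names `Osd`, `plan_backfill_blocking`, `plan_backfill_skip_full` and the `TOOFULL_RATIO` value are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass

# Illustrative threshold only; it is not read from any Ceph configuration.
TOOFULL_RATIO = 0.85


@dataclass
class Osd:
    osd_id: int
    used_ratio: float  # fraction of capacity in use, 0.0 to 1.0


def is_toofull(osd, toofull_ratio=TOOFULL_RATIO):
    """Return True if the osd is at or above the (hypothetical) threshold."""
    return osd.used_ratio >= toofull_ratio


def plan_backfill_blocking(targets, toofull_ratio=TOOFULL_RATIO):
    """Behaviour as described in the report: one toofull target stalls the pg.

    Returns (ready, deferred) lists of osd ids.
    """
    if any(is_toofull(o, toofull_ratio) for o in targets):
        return [], [o.osd_id for o in targets]
    return [o.osd_id for o in targets], []


def plan_backfill_skip_full(targets, toofull_ratio=TOOFULL_RATIO):
    """Requested behaviour: defer only the toofull targets, backfill the rest.

    Returns (ready, deferred) lists of osd ids.
    """
    ready = [o.osd_id for o in targets if not is_toofull(o, toofull_ratio)]
    deferred = [o.osd_id for o in targets if is_toofull(o, toofull_ratio)]
    return ready, deferred


# Backfill targets for a pg replicated across osds [0,1,2,3], with osd 0
# as the source; osd 1 is over the threshold in this made-up example.
targets = [Osd(1, 0.90), Osd(2, 0.50), Osd(3, 0.40)]
blocking_plan = plan_backfill_blocking(targets)
skip_full_plan = plan_backfill_skip_full(targets)
```

In this model the blocking plan leaves osds 2 and 3 without a copy until osd 1 frees up space, while the skip-full plan lets them catch up immediately and retries osd 1 later, without having to mark it down.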

Actions #1

Updated by Samuel Just over 10 years ago

  • Tracker changed from Bug to Fix

Actions #2

Updated by Patrick Donnelly over 5 years ago

  • Project changed from Ceph to RADOS
  • Component(RADOS) OSD added