Bug #12053


ceph osd out will keep some pgs in degraded status forever.

Added by cory gu almost 9 years ago. Updated almost 9 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Take osd.0 as an example. It serves nine PGs in total, and for three of them it holds the last replica, with acting sets [12,13,0], [8,3,0], and [7,9,0].
Then `ceph osd out 0` triggers PG recovery.
In the end, the PGs whose last replica was on osd.0 stay in degraded status with only two OSDs: [12,13], [8,3], [7,9].

This looks like a CRUSH bug: after several retries CRUSH cannot choose a new OSD for the last replica, so those PGs are never recovered unless osd.0 is removed from the CRUSH map.
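A toy model of the behavior described above (this is not CRUSH's actual algorithm, just a sketch of bounded-retry pseudo-random placement): the mapper repeatedly draws a candidate OSD and rejects it if it is already in the acting set or marked out. If every draw collides within the retry budget, no replacement is found and the PG stays degraded.

```python
import random

def choose_replica(acting, out_osds, candidates, max_tries):
    """Pick a replacement OSD with a bounded number of pseudo-random
    draws, rejecting OSDs already in the acting set or marked out.
    Returns the chosen OSD id, or None if the retry budget runs out.
    (Hypothetical helper for illustration; not Ceph code.)"""
    rng = random.Random(12053)  # fixed seed so the sketch is repeatable
    for _ in range(max_tries):
        osd = rng.choice(candidates)
        if osd not in acting and osd not in out_osds:
            return osd
    return None  # every retry collided: the PG stays degraded

# If the only candidate in the chosen bucket is the out OSD, no retry
# budget helps -- which mirrors the stuck [12,13] acting set above:
choose_replica({12, 13}, {0}, [0], 50)  # -> None
```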

Actions #1

Updated by Samuel Just almost 9 years ago

  • Status changed from New to Rejected

You need to increase the number of retries with only three hosts.
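For reference, the retry count lives in the CRUSH map tunables and can be raised by editing a decompiled map. A rough sketch of the workflow (the value 100 is an example, not a recommendation; legacy maps default `choose_total_tries` to 19, newer tunable profiles to 50):

```
# Export and decompile the current CRUSH map
ceph osd getcrushmap -o crush.bin
crushtool -d crush.bin -o crush.txt

# In crush.txt, raise the retry tunable, e.g.:
#   tunable choose_total_tries 100

# Recompile and inject the edited map
crushtool -c crush.txt -o crush.new
ceph osd setcrushmap -i crush.new
```

With few hosts, most placement retries collide with buckets already chosen, so a larger retry budget gives CRUSH more chances to land on a valid OSD for the last replica.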
