Feature #1637
OSDs running full take down other OSDs
Status:
Duplicate
Priority:
High
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:
Description
This issue is related to #1636.
In my test setup running v0.36, when one OSD runs full it gets taken down.
This starts a chain reaction: its data is replicated to other OSDs, which in turn run full too.
Eventually the whole cluster is down.
My understanding is that this behavior is fine for data safety, but for high availability it's pretty bad.
One dying OSD should not have that kind of impact on other OSDs or on the whole cluster.
If one OSD fails and there isn't enough capacity to replicate all of its data according to the crushmap, a warning that it's necessary to add more capacity would be fine; the cluster should stay available.
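Side note (not part of the original report): on later Ceph releases the thresholds at which OSDs are flagged near-full and the cluster stops accepting writes are configurable; whether these options behave the same on v0.36 is an assumption. A minimal ceph.conf sketch that leaves more headroom, so a filling OSD is flagged early instead of cascading onto its peers:

    # Assumed option names from later Ceph releases; values are illustrative.
    [mon]
        mon osd nearfull ratio = 0.75   ; raise a "ceph health" warning at 75% usage
        mon osd full ratio     = 0.85   ; stop accepting writes at 85% instead of 95%

Operationally, on releases that support it, setting the noout flag (ceph osd set noout) keeps a down OSD from being marked out, which pauses the re-replication cascade described above while capacity is added.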
Updated by Sage Weil over 11 years ago
- Position set to 4
Updated by Sage Weil over 11 years ago
- Position deleted (5)
- Position set to 4
Updated by Sage Weil over 11 years ago
- Story points set to 5
- Position deleted (6)
- Position set to 4
Updated by Sage Weil over 11 years ago
- Position deleted (32)
- Position set to 2
Updated by Sage Weil over 11 years ago
- Status changed from New to Duplicate
- Position deleted (3)
- Position set to 3