Feature #1885
Identify top 10 expected failures and process to diagnose
Status: closed
% Done: 0%
Description
- peering failures
- unfound objects
Updated by Sage Weil over 12 years ago
- Position set to 4
Updated by Sage Weil over 12 years ago
- Position deleted (22)
- Position set to 10
Updated by Sage Weil over 12 years ago
- Target version changed from v0.41 to v0.42
- Position deleted (17)
- Position set to 1
Updated by Anonymous over 12 years ago
OSD:
- cascading failures
- single OSD failure
- failure to complete peering/recovery
- unfound objects after recovery
- full
- slow
- fails to respond to some request
- failure
- failure
- stops forwarding requests
Updated by Anonymous over 12 years ago
Additional issues from Carl's list:
- RGW request timeouts
- OSD file system timeouts
- OSD that is "down" but still "in"
- degraded placement groups
Updated by Greg Farnum over 12 years ago
Mark Kampe wrote:
> Additional issues from Carl's list:
> - RGW request timeouts
That's a symptom, not a cause...
> - OSD file system timeouts
What timeouts? We have a few that cause suicides but I suspect he just means OSDs being slow in the filesystem.
> - OSD that is "down" but still "in"
> - degraded placement groups
I'm not sure what either of these is about. Both are revealed by "ceph -s" (with more detail in "ceph osd dump" and "ceph pg dump"), and neither is a problem in and of itself.
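As a rough illustration of how these two conditions could be picked out of the dump output, here is a minimal Python sketch. It assumes the JSON form of the dumps ("ceph osd dump --format json"), that each OSD entry carries "osd", "up" and "in" fields, and that each placement group entry carries "pgid" and a "+"-separated "state" string. The helper names are hypothetical, and the exact JSON layout may differ between Ceph versions, so treat this as a starting point rather than a definitive tool.

```python
import json
import subprocess


def down_but_in(osd_dump):
    """Return ids of OSDs that are marked down (up == 0) but still in (in == 1).

    Assumes `osd_dump` is the parsed JSON of `ceph osd dump --format json`
    with an "osds" list whose entries have "osd", "up" and "in" fields.
    """
    return [
        osd["osd"]
        for osd in osd_dump.get("osds", [])
        if not osd.get("up") and osd.get("in")
    ]


def degraded_pgs(pg_stats):
    """Return pgids whose state string includes the "degraded" flag.

    Assumes `pg_stats` is a list of per-PG entries with "pgid" and "state"
    fields, where "state" looks like "active+degraded". Where this list sits
    inside the `ceph pg dump` JSON is left to the caller.
    """
    return [
        pg["pgid"]
        for pg in pg_stats
        if "degraded" in pg.get("state", "").split("+")
    ]


def load_osd_dump():
    """Run the ceph CLI and parse its JSON output (needs a reachable cluster)."""
    out = subprocess.check_output(["ceph", "osd", "dump", "--format", "json"])
    return json.loads(out)
```

On a live cluster, something like down_but_in(load_osd_dump()) would list the affected OSD ids. Both helpers only report state; as noted above, neither condition is necessarily a problem on its own.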
Updated by Sage Weil over 12 years ago
- Status changed from New to Resolved
- Position deleted (16)
- Position set to 16