Bug #11798
closed
upstart: configuration is too generous on restarts
Description
See https://bugzilla.redhat.com/show_bug.cgi?id=1210871 for the investigation that prompted this.
Our current upstart scripts are probably too generous about restarting processes. At the moment each daemon is configured to restart as long as it doesn't exceed 5 crashes in 30 seconds. The restart process on some of them can exceed 6 seconds (at least some of the time), so a daemon that crashes on every start may never reach that limit. Any of our daemons that are crashing that frequently are probably stuck on a disk state issue.
We need to run some tests to figure out more reasonable values and change them.
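To illustrate the arithmetic, here is a rough sketch of a respawn limit as a counter that resets once the interval has elapsed. It is a simplified model for reasoning about the numbers, not upstart's actual implementation. The function name, its parameters, and the fixed crash cycle are all assumptions made for illustration.

```python
def respawns_before_stop(cycle_seconds, count, interval, max_crashes=1000):
    """Model a daemon that crashes every `cycle_seconds` seconds.

    Returns how many respawns happen before the limit trips, or None if
    the limit never trips within `max_crashes` crashes.

    Assumed (simplified) semantics: the job is stopped once it would be
    respawned more than `count` times within `interval` seconds, and the
    counter resets when a crash arrives after the interval has elapsed.
    """
    window_start = None
    in_window = 0
    for n in range(1, max_crashes + 1):
        now = n * cycle_seconds
        # Start a fresh window if this crash falls outside the current one.
        if window_start is None or now - window_start > interval:
            window_start = now
            in_window = 0
        in_window += 1
        if in_window > count:
            return n - 1  # respawns granted before the job was stopped
    return None


# Fast crash loop (2 s per cycle) under a 5-in-30-seconds limit: stopped after 5 respawns.
fast_old = respawns_before_stop(2, 5, 30)

# Slow crash loop (7 s per cycle) under the same limit: never stopped in this model.
slow_old = respawns_before_stop(7, 5, 30)

# Same slow loop under a 3-in-1800-seconds limit: stopped after 3 respawns.
slow_new = respawns_before_stop(7, 3, 1800)
```

In this model, a crash cycle longer than 6 seconds never fits more than 5 crashes into a 30-second window, so the daemon restarts indefinitely. A much longer interval catches slow crash loops as well.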
Updated by Sage Weil almost 9 years ago
- Status changed from New to Fix Under Review
Updated by Greg Farnum almost 9 years ago
- Status changed from Fix Under Review to Resolved
Merged in commit:172d3ac8744c876a0f6ed99f4d63d95ea899cf85. We now allow 3 restarts in 30 minutes on OSD, Mon, and MDS.
Updated by Sage Weil over 8 years ago
https://github.com/ceph/ceph/pull/5930 (hammer backport)
Updated by Loïc Dachary over 8 years ago
- Status changed from Resolved to Pending Backport
- Backport set to hammer
Updated by Loïc Dachary over 8 years ago
- Status changed from Pending Backport to Resolved
Updated by Ken Dreyer over 8 years ago
- Backport changed from hammer to hammer, firefly
We're planning to ship this fix downstream in the RHCS 1.2 series - we might as well get it upstream in Firefly too.
Updated by Ken Dreyer over 8 years ago
- Status changed from Resolved to Pending Backport
Updated by Loïc Dachary over 8 years ago
- Status changed from Pending Backport to Resolved