Bug #11798

closed

upstart: configuration is too generous on restarts

Added by Greg Farnum almost 9 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport: hammer, firefly
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

See https://bugzilla.redhat.com/show_bug.cgi?id=1210871 for the investigation that prompted this.

Our current upstart scripts are probably too generous about restarting processes. At the moment each daemon is configured to keep restarting as long as it doesn't exceed 5 crashes in 30 seconds. Restarting some of them can take more than 6 seconds (at least some of the time), so a daemon stuck in a crash loop may never reach that limit. Any daemon that crashes that frequently is probably stuck on a disk state issue.
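To make the arithmetic in the description concrete, here is a rough Python sketch of a respawn limit. It is a simplified model and not upstart's actual code: the windowing logic is an assumption, and the function name `crashes_until_stopped` and its parameters are made up for illustration. It counts crashes inside a window that opens at the first crash and stops the job once the count exceeds the limit.

```python
def crashes_until_stopped(limit, interval, cycle, max_crashes=1000):
    """Return the crash number at which the job is stopped, or None.

    limit    -- respawns allowed inside one window (assumed semantics)
    interval -- window length in seconds
    cycle    -- seconds from one crash to the next (restart time plus uptime)
    """
    window_start = None
    count = 0
    now = 0.0
    for crash in range(1, max_crashes + 1):
        now += cycle
        if window_start is not None and now - window_start < interval:
            # Still inside the current window: count this respawn.
            count += 1
            if count > limit:
                return crash
        else:
            # Window expired (or first crash): open a new window.
            window_start = now
            count = 1
    return None


# 5 crashes in 30 seconds, with a crash every 6.5 seconds: never stopped.
print(crashes_until_stopped(5, 30, 6.5))  # None
# Same limit, but a crash every 2 seconds: stopped at the 6th crash.
print(crashes_until_stopped(5, 30, 2.0))  # 6
```

Under this model, a daemon whose crash cycle is 6 seconds or longer slips past a limit of 5 in 30 seconds indefinitely, which matches the concern about restarts that take more than 6 seconds.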

We need to run some tests to figure out more reasonable values and change them.


Related issues 2 (0 open, 2 closed)

Copied to devops - Backport #13168: upstart: configuration is too generous on restarts (Resolved, Greg Farnum, 05/28/2015)
Copied to devops - Backport #13091: upstart: configuration is too generous on restarts (Resolved, Sage Weil, 05/28/2015)
Actions #1

Updated by Sage Weil almost 9 years ago

How about 5 restarts in 10 minutes?

Actions #2

Updated by Sage Weil almost 9 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Sage Weil almost 9 years ago

  • Assignee set to Sage Weil
Actions #4

Updated by Greg Farnum almost 9 years ago

  • Status changed from Fix Under Review to Resolved

Merged in commit:172d3ac8744c876a0f6ed99f4d63d95ea899cf85. We now allow 3 restarts in 30 minutes on OSD, Mon, and MDS.
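For comparison, the three settings discussed in this ticket can be lined up with a short Python sketch. Upstart's stanza has the form `respawn limit COUNT INTERVAL`, with the interval in seconds. The stanza strings below are derived from the numbers quoted here, not copied from the commit, and the helper names are hypothetical. The "evading cycle" figure assumes a simplified model in which a job is stopped only when more than COUNT respawns fall inside one INTERVAL window.

```python
def upstart_stanza(limit, interval_seconds):
    """Format a respawn limit stanza (illustrative, not quoted from the commit)."""
    return f"respawn limit {limit} {interval_seconds}"


def min_evading_cycle(limit, interval_seconds):
    """Shortest crash-to-crash time, in seconds, that never trips the limit
    under the simplified window model assumed here."""
    return interval_seconds / limit


settings = [
    ("original", 5, 30),             # 5 crashes in 30 seconds
    ("proposed in #1", 5, 10 * 60),  # 5 restarts in 10 minutes
    ("merged", 3, 30 * 60),          # 3 restarts in 30 minutes
]

for label, limit, interval in settings:
    print(f"{label}: {upstart_stanza(limit, interval)} "
          f"(evaded by a crash cycle of {min_evading_cycle(limit, interval)} s or more)")
```

Under that assumption, the merged values would stop any daemon that crashes more often than about once every 10 minutes, compared with once every 6 seconds before.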

Actions #6

Updated by Loïc Dachary over 8 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to hammer
Actions #7

Updated by Loïc Dachary over 8 years ago

  • Status changed from Pending Backport to Resolved
Actions #8

Updated by Ken Dreyer over 8 years ago

  • Backport changed from hammer to hammer, firefly

We're planning to ship this fix downstream in the RHCS 1.2 series - we might as well get it upstream in Firefly too.

Actions #9

Updated by Ken Dreyer over 8 years ago

  • Status changed from Resolved to Pending Backport
Actions #10

Updated by Nathan Cutler over 8 years ago

  • Project changed from Ceph to devops
Actions #11

Updated by Loïc Dachary over 8 years ago

  • Status changed from Pending Backport to Resolved