Bug #21147

Manager daemon x is unresponsive. No standby daemons available

Added by Sage Weil over 5 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/sage-2017-08-26_20:38:41-rados-luminous-distro-basic-smithi/1567938

The last time I looked, this appeared to be the mgr's MonClient (monc) failing to reconnect quickly enough to get its beacon through. We need to review the last few failures and confirm that this is the case.

Note that this error is whitelisted in a few places.


Related issues

Copied to RADOS - Backport #22399: luminous: Manager daemon x is unresponsive. No standby daemons available Resolved

History

#1 Updated by Greg Farnum about 5 years ago

  • Status changed from 12 to In Progress

Sage believes this is due to the high rate of messenger failure injection in some of our tests, which sometimes makes the connection fail several times in a row until the timeout is exceeded. He is adding a log whitelist entry to those yaml fragments.
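
To get a rough feel for why heavy failure injection can trip this warning, here is a minimal sketch. It assumes each beacon attempt fails independently with a fixed probability, which is a simplification of what the messenger actually does. The function names, the beacon period, the grace window and the failure rates are all hypothetical illustration values, not Ceph defaults or Ceph APIs.

```python
import math


def beacons_per_grace(beacon_period: float, grace: float) -> int:
    """Roughly how many consecutive beacons must be lost to fill the grace window."""
    return math.ceil(grace / beacon_period)


def prob_failure_run(p_fail: float, attempts: int, run_len: int) -> float:
    """Probability that `attempts` independent tries contain at least one
    run of `run_len` consecutive failures (each try fails with p_fail)."""
    # state[k]: probability of a trailing run of exactly k failures,
    # without having reached run_len at any earlier point.
    state = [0.0] * run_len
    state[0] = 1.0
    hit = 0.0
    for _ in range(attempts):
        nxt = [0.0] * run_len
        for k, prob in enumerate(state):
            nxt[0] += prob * (1.0 - p_fail)   # success resets the run
            if k + 1 == run_len:
                hit += prob * p_fail          # run reaches the threshold
            else:
                nxt[k + 1] += prob * p_fail   # run grows by one
        state = nxt
    return hit


if __name__ == "__main__":
    # Hypothetical numbers: a beacon every 5 s, a 30 s grace window,
    # and 720 beacon attempts over the course of a test run.
    run_len = beacons_per_grace(beacon_period=5.0, grace=30.0)
    for p_fail in (0.1, 0.5, 0.8):
        print(p_fail, prob_failure_run(p_fail, attempts=720, run_len=run_len))
```

Under this model the chance of a long enough silent stretch grows steeply with the injected failure rate, which is consistent with the explanation above that the warning is expected noise in the high-injection tests rather than a real mgr failure.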

#2 Updated by Sage Weil about 5 years ago

  • Status changed from In Progress to Fix Under Review

#3 Updated by Kefu Chai about 5 years ago

  • Status changed from Fix Under Review to Pending Backport

#4 Updated by Nathan Cutler almost 5 years ago

  • Copied to Backport #22399: luminous: Manager daemon x is unresponsive. No standby daemons available added

#5 Updated by Nathan Cutler almost 5 years ago

  • Status changed from Pending Backport to Resolved
