Manager daemon x is unresponsive. No standby daemons available
When I last looked, this appeared to be the mgr's monc failing to reconnect quickly enough to get its beacon through. We need to review the last few failures and confirm that this is the case.
Note that this error is whitelisted in a few places.
#1 Updated by Greg Farnum over 1 year ago
- Status changed from Verified to In Progress
Sage believes this is due to the high rate of failure injection in the messenger in some of our tests, which sometimes makes the connection fail multiple times in a row until we exceed our timeout. He's adding a log whitelist entry to those yaml fragments.
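
To get a rough feel for why heavy failure injection can trip the timeout, here is a minimal back-of-the-envelope sketch. It assumes each beacon attempt fails independently with a fixed probability, which is a simplification, and the interval, grace period, and failure probabilities are hypothetical illustration values, not settings taken from the failing runs. The function names are made up for this example.

```python
import math


def missed_beacons_to_timeout(beacon_interval_s: float, grace_s: float) -> int:
    """Consecutive lost beacons needed before the grace period runs out."""
    return math.ceil(grace_s / beacon_interval_s)


def timeout_probability(p_fail: float, consecutive: int) -> float:
    """Chance that `consecutive` independent attempts all fail."""
    return p_fail ** consecutive


if __name__ == "__main__":
    # Hypothetical numbers for illustration only.
    needed = missed_beacons_to_timeout(beacon_interval_s=2.0, grace_s=30.0)
    for p_fail in (0.1, 0.5, 0.9):
        chance = timeout_probability(p_fail, needed)
        print(f"p_fail={p_fail}: {needed} misses in a row, chance {chance:.3g}")
```

Under this model the chance of a false "unresponsive" report rises steeply as the injected failure rate goes up, which would be consistent with the error only showing up in the high-injection test fragments.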