Feature #7243

Report something sensible for out-of-quorum clusters

Added by John Spray over 7 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: Backend (services)
Target version:
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:

Description

Currently, if there is quorum but one or more mons are out of it, we report that sanely. However, when a cluster loses quorum it stops responding to our requests for status, and we fall back to reporting that we lost contact with the cluster (which is true only in the technically-correct sense).

Calamari 1.0 partly dealt with this by attempting to open a TCP connection to any mon which appeared to be out of quorum, to indicate that it was 'up' but not 'in'. However, it would still fall over if there was no quorum to talk to at all.
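
For illustration, the TCP probe described above could look roughly like the following. This is a minimal sketch, not the Calamari 1.0 code: the function names `mon_is_reachable` and `classify_mon` and the label strings are hypothetical, and the default port 6789 assumes the mon is listening on the standard Ceph monitor port.

```python
import socket


def mon_is_reachable(host, port=6789, timeout=2.0):
    """Return True if a TCP connection to the mon's address can be opened.

    Assumption: 6789 is the mon's listening port; pass another port if the
    cluster is configured differently.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts.
        return False


def classify_mon(in_quorum, reachable):
    """Map quorum membership and TCP reachability to a display label.

    The label strings are hypothetical, chosen to mirror the 'up'/'in'
    wording used in this ticket.
    """
    if in_quorum:
        return "up, in"
    return "up, not in" if reachable else "down"
```

Note that this only helps while some quorum exists to report which mons are outside it, which is the limitation this ticket describes.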

We should be able to do a better job for 2.0 using ServerMonitor's knowledge of whether a mon service was running at some point in the recent past.

Associated revisions

Revision 260a6757 (diff)
Added by John Spray over 7 years ago

cthulhu/rest/salt: Enhanced mon status

Gather the local mon_status from mons, and use it to
give useful information about the status of the cluster
when the mons are out of quorum.

Fixes: #7243

Conflicts:
rest-api/calamari_rest/views/v2.py
tests/test_rest_api.py
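
As a rough illustration of the approach in the commit message above (gathering each mon's local mon_status and deriving a cluster-level view from it), here is a hedged sketch. It is not the code from the revision: `summarize_mons` is a hypothetical helper, and the input shape (a dict of mon name to parsed mon_status output, or None for a mon that did not respond) is an assumption. It relies only on the `state` and `monmap.mons` fields of the mon_status output.

```python
def summarize_mons(statuses):
    """Summarize quorum health from per-mon local mon_status results.

    statuses: dict mapping mon name -> parsed mon_status dict, or None if
    that mon did not respond (hypothetical input shape).
    """
    responding = {name: s for name, s in statuses.items() if s is not None}

    # Prefer the monmap size reported by a responding mon; otherwise fall
    # back to the number of mons we were asked about.
    total = len(statuses)
    for s in responding.values():
        mons = s.get("monmap", {}).get("mons", [])
        if mons:
            total = max(total, len(mons))
            break

    # A mon that believes it is in quorum reports itself as leader or peon;
    # other states (probing, electing, synchronizing) mean it is outside.
    in_quorum = sorted(
        name for name, s in responding.items()
        if s.get("state") in ("leader", "peon")
    )

    needed = total // 2 + 1  # strict majority of the monmap
    return {
        "total": total,
        "responding": sorted(responding),
        "in_quorum": in_quorum,
        "needed_for_quorum": needed,
        "has_quorum": len(in_quorum) >= needed,
    }
```

With a summary like this, a cluster whose mons are all running but stuck probing or electing can be reported as "mons up, no quorum" rather than as lost contact.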

History

#1 Updated by John Spray over 7 years ago

  • Story points set to 5.0

#2 Updated by John Spray over 7 years ago

  • Target version changed from v1.2 Backlog to v1.2-dev4

#3 Updated by John Spray over 7 years ago

  • Assignee set to John Spray

#4 Updated by John Spray over 7 years ago

  • Status changed from New to In Progress

#6 Updated by John Spray over 7 years ago

  • Status changed from In Progress to Fix Under Review

#7 Updated by John Spray over 7 years ago

  • Status changed from Fix Under Review to Resolved

56dfc8d4080e20f850715c94ffd1408608a80350
