Feature #44107

closed

mon: produce stable election results when netsplits and other errors happen

Added by Greg Farnum about 4 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus
Reviewed:
Affected Versions:
Component(RADOS): Monitor
Pull request ID:

Description

Right now, in netsplits and similar error conditions, the monitors do not produce a stable quorum: whichever monitors are excluded prompt continuous elections by Proposing to whatever peers they can reach.

To produce stable election results instead, add heartbeating between the monitor daemons, use the heartbeats to generate connection liveness and reliability scores, and use those scores as input to the election algorithm.
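
As a rough illustration of the idea (not the code in the linked pull request), the sketch below keeps a per-peer reliability score that is updated on every heartbeat and then picks the monitor that its peers collectively rate highest. The names `ConnectionTracker`, `half_life`, `total_score`, and `pick_leader`, the exponential-decay scoring, and the lowest-rank tie-break are all assumptions made for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ConnectionTracker:
    """One monitor's view of how reliable each peer connection has been.

    Hypothetical helper for illustration; not the Ceph implementation.
    """
    half_life: float = 10.0  # assumed: heartbeats after which old history counts half
    scores: dict = field(default_factory=dict)  # peer rank -> score in [0, 1]

    def report(self, peer: int, alive: bool) -> None:
        """Fold one heartbeat result into the peer's score (exponential moving average)."""
        decay = 0.5 ** (1.0 / self.half_life)
        sample = 1.0 if alive else 0.0
        previous = self.scores.get(peer, sample)
        self.scores[peer] = decay * previous + (1.0 - decay) * sample


def total_score(candidate: int, trackers: dict) -> float:
    """Sum the scores that every other monitor assigns to `candidate`."""
    return sum(t.scores.get(candidate, 0.0)
               for rank, t in trackers.items() if rank != candidate)


def pick_leader(trackers: dict) -> int:
    """Choose the best-connected monitor; break ties by lowest rank (assumed rule)."""
    return max(trackers, key=lambda rank: (total_score(rank, trackers), -rank))


# Example: the link between monitors 0 and 2 is down; monitor 1 reaches both.
trackers = {rank: ConnectionTracker() for rank in range(3)}
for _ in range(20):
    trackers[0].report(1, True)
    trackers[0].report(2, False)
    trackers[1].report(0, True)
    trackers[1].report(2, True)
    trackers[2].report(0, False)
    trackers[2].report(1, True)

leader = pick_leader(trackers)
```

In this example monitor 1 is chosen, because it is the only monitor that both peers can reach. The outcome depends only on the accumulated scores, not on which monitor happens to start the election, which is the property that would stop an excluded monitor from forcing repeated elections.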

https://github.com/ceph/ceph/pull/32336


Related issues 1 (1 open, 0 closed)

Blocks RADOS - Feature #44108: mon: osd: handle 2-(main-)site stretch clusters explicitly, so no admin intervention is needed when a DC dies | In Progress | Greg Farnum
