Bug #58107

mon-stretch: old stretch_marked_down_mons leads to ceph unresponsive

Added by Kamoltat (Junior) Sirivadhna over 1 year ago. Updated over 1 year ago.

Status:
Closed
Priority:
High
Category:
Stretch Clusters
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

How to reproduce the issue

Set up:

mon.a (zone 1) rank=0
mon.b (zone 1) rank=1
mon.c (zone 2) rank=2
mon.d (zone 2) rank=3
mon.e (arbiter) rank=4

A stretch-mode cluster with 2 zones, 4 mons (2 per zone) plus the arbiter mon.e, and 4 OSDs (2 per zone).

Shut down zone 2 and wait until the cluster enters degraded stretch mode.
Start zone 2.
Immediately shut down zone 1.

Result:

ceph becomes unresponsive

Explanation:

e0 quorum = {a, b, c, d, e} stretch_marked_down_mons = {} disallowed_leader {e}
e1 quorum = {a, b, e} stretch_marked_down_mons = {c, d} disallowed_leader {e}

mon.c starts back up, probes mon.b, and gets map e1 (stretch_marked_down_mons = {c, d})
mon.d starts back up, probes mon.b, and gets map e1 (stretch_marked_down_mons = {c, d})

Both then go into Monitor::set_elector_disallowed_leaders(), which results in elector.disallowed_leaders = {c, d, e}.

Still within the same monmap epoch (e1), we shut down zone 1:

e1 quorum = {c, d, e} stretch_marked_down_mons = {c, d} disallowed_leader {e}

During an election, every surviving monitor (c, d, e) is a disallowed leader, so no one can ever win the election. The only way out of this state is to start zone 1 back up.
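
The reasoning above can be sketched as a toy model. This is not Ceph code: the function names are hypothetical, and the model assumes, based only on the values quoted in this report, that the elector's disallowed set is the union of the configured disallowed leader ({e}) and stretch_marked_down_mons ({c, d}).

```python
# Toy model of the leader-eligibility check described in this report.
# Hypothetical helper names; NOT the actual Ceph implementation.

def elector_disallowed_leaders(disallowed_leaders, stretch_marked_down_mons):
    # Assumption: the elector's disallowed set is the union of both sets,
    # which matches the {c, d, e} value quoted in the report.
    return set(disallowed_leaders) | set(stretch_marked_down_mons)


def eligible_leaders(alive, disallowed_leaders, stretch_marked_down_mons):
    # A monitor can win an election only if it is alive and not disallowed.
    blocked = elector_disallowed_leaders(disallowed_leaders,
                                         stretch_marked_down_mons)
    return set(alive) - blocked


# Zone 1 is down, zone 2 came back with the stale e1 map.
stuck = eligible_leaders({"c", "d", "e"}, {"e"}, {"c", "d"})
print(sorted(stuck))      # [] : nobody is allowed to lead

# Zone 1 is started again: mon.a and mon.b are eligible.
recovered = eligible_leaders({"a", "b", "c", "d", "e"}, {"e"}, {"c", "d"})
print(sorted(recovered))  # ['a', 'b']
```

With zone 1 down, the set of eligible leaders is empty, which is the deadlock described in this report.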

The only way to clear monmap::stretch_marked_down_mons is through Monitor::trigger_healthy_stretch_mode(), which only the leader can execute. Since the monitors are stuck in an election when this happens, there is no leader, and trigger_healthy_stretch_mode() is never reached.
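
The circular dependency can be illustrated with a small simulation. This is a sketch under stated assumptions, not Ceph code: run_elections is a hypothetical function, the alphabetically lowest name stands in for the lowest rank, and the "clear" step is only a stand-in for Monitor::trigger_healthy_stretch_mode(), whose real preconditions are not modeled here.

```python
# Toy simulation: the marked-down set can only be cleared by a leader,
# but a leader can only be elected if some monitor is not disallowed.
# Hypothetical names; NOT the actual Ceph implementation.

def run_elections(alive, disallowed_leaders, stretch_marked_down_mons, rounds=5):
    marked_down = set(stretch_marked_down_mons)
    leader = None
    for _ in range(rounds):
        blocked = set(disallowed_leaders) | marked_down
        candidates = sorted(set(alive) - blocked)
        if not candidates:
            # Every live monitor is disallowed: the election never completes,
            # so the clearing step below is never reached.
            continue
        # Assumption: lowest name stands in for lowest rank.
        leader = candidates[0]
        # Stand-in for trigger_healthy_stretch_mode(): only a leader clears.
        marked_down.clear()
        break
    return leader, marked_down


# Zone 1 down: no leader, stale set is never cleared.
print(run_elections({"c", "d", "e"}, {"e"}, {"c", "d"}))

# Zone 1 restarted: a leader is elected and the set is cleared.
print(run_elections({"a", "b", "c", "d", "e"}, {"e"}, {"c", "d"}))
```

In the first call the loop never finds a candidate, so the stale set survives every round; in the second, a zone 1 monitor wins and the set is cleared.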


Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #58049: mon:stretch-cluster: mishandled removed_ranks -> inconsistent peer_tracker leading to unable to form quorum (Resolved, Kamoltat (Junior) Sirivadhna)
