Bug #55695

open

Shutting down a monitor forces Paxos to restart and sometimes disregard subsequent commands

Added by Kamoltat (Junior) Sirivadhna almost 2 years ago. Updated over 1 year ago.

Status: Fix Under Review
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Problem:

mon.a
mon.b
mon.c
mon.d
mon.e

ceph -a stop mon.d
ceph mon remove d

.
.

mon.d is down, but it was not actually removed and the monmap was not updated.

Explanation:

We shut down mon.d. This blocks Paxos::begin on mon.a (assuming Paxos is in the middle of an update): mon.a, the leader, sends the begin message (MMonPaxos::OP_BEGIN) to all peer monitors, including mon.d, because mon.get_quorum() has not been updated yet and still contains mon.d. Paxos does not proceed to the commit() phase because no reply (MMonPaxos::OP_ACCEPT) arrives from mon.d. The lease of one of the monitors eventually expires, and that monitor calls for an election. If the monmap proposal for ("mon remove", "name": "d") arrives before the election happens, it is queued in pending_finisher and is discarded once the election starts. As a result, the remove command does not take effect.

However, if the election finishes before the monmap proposal arrives, everything works: mon.get_quorum() is updated, the leader no longer sends MMonPaxos::OP_BEGIN to mon.d, and Paxos is not blocked and continues to the commit phase. The problem is therefore non-deterministic.
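The ordering dependence described above can be illustrated with a small toy model. This is only a sketch under simplifying assumptions, not Ceph code: ToyLeader, run, and the step names "remove" and "election" are hypothetical, and the model reduces the leader to a quorum set, a queue standing in for pending_finisher, and a list of committed values.

```python
from dataclasses import dataclass, field


@dataclass
class ToyLeader:
    """Hypothetical stand-in for the leader (mon.a); not Ceph code."""
    quorum: set
    alive: set
    pending_finisher: list = field(default_factory=list)
    committed: list = field(default_factory=list)
    blocked: bool = False

    def begin(self, value):
        # "Send" begin to every quorum member; only live peers accept.
        accepted = self.quorum & self.alive
        if accepted == self.quorum:
            self.committed.append(value)
            self.blocked = False
        else:
            # Waiting for an accept that never arrives.
            self.blocked = True

    def propose(self, value):
        if self.blocked:
            # A round is already stuck, so the new proposal is only queued.
            self.pending_finisher.append(value)
        else:
            self.begin(value)

    def election(self):
        # In this model the election rebuilds the quorum from the live
        # monitors and drops whatever was queued behind the stuck round.
        self.quorum = set(self.alive)
        self.pending_finisher.clear()
        self.blocked = False


def run(order):
    """Replay the scenario with the given order of 'remove' and 'election'."""
    mons = {"a", "b", "c", "d", "e"}
    leader = ToyLeader(quorum=set(mons), alive=mons - {"d"})
    leader.begin("in-flight update")  # stalls: mon.d never accepts
    for step in order:
        if step == "remove":
            leader.propose("mon remove d")
        else:
            leader.election()
    return "mon remove d" in leader.committed


print(run(["remove", "election"]))  # proposal arrives first
print(run(["election", "remove"]))  # election finishes first
```

In this model the remove only commits when the election runs first, which mirrors the two outcomes described in the report.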

Shutting down a monitor effectively forces Paxos to restart, because in Paxos::handle_accept (Accept Phase) every monitor in the quorum must reply with an accept, and that cannot happen once a monitor in the quorum has been shut down. Restarting Paxos risks clearing a pending_proposal that was initiated by a client, such as removing a monitor.

Do we actually need accepts from all monitors in the Accept Phase, or just from more than mon.get_quorum().size()/2 of them?
https://github.com/ceph/ceph/blob/main/src/mon/Paxos.cc#L806-L812
By comparison, in the Prepare phase of Paxos we only check for mon.monmap->size()/2 before beginning the process: https://github.com/ceph/ceph/blob/main/src/mon/Paxos.cc#L622-L624
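To make the question concrete, here is a small sketch comparing the two acceptance rules for the five-monitor scenario in this report. The function names all_accepted and majority_accepted are hypothetical and the code is not taken from Paxos.cc; it only restates the two conditions being discussed.

```python
def all_accepted(accepted, quorum):
    # Rule as described in this report: every quorum member must accept.
    return set(accepted) == set(quorum)


def majority_accepted(accepted, quorum):
    # Alternative raised in the question: a strict majority is enough.
    return len(set(accepted) & set(quorum)) > len(quorum) // 2


quorum = ["a", "b", "c", "d", "e"]  # stale quorum that still contains mon.d
accepted = ["a", "b", "c", "e"]     # mon.d is down and never replies

print(all_accepted(accepted, quorum))
print(majority_accepted(accepted, quorum))
```

With mon.d down, four of five accepts fail the first rule but satisfy the second, since 4 > 5/2. Whether relaxing the rule is safe for the rest of the monitor's Paxos implementation is the open question here, and this sketch does not answer it.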


Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #50089: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of monitors in the cluster - Resolved - Kamoltat (Junior) Sirivadhna

Actions #2

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Subject changed from "Removing 2 MONs before monmap update results in incorrect monmap rank" to "Shutdown before remove can cause Paxos to restart and clear pending_proposal for monmap change"
Actions #3

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Description updated (diff)
Actions #4

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Description updated (diff)
Actions #5

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Subject changed from "Shutdown before remove can cause Paxos to restart and clear pending_proposal for monmap change" to "Shutting down a monitor forces Paxos to restart and sometimes disregard subsequent commands"
Actions #6

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Description updated (diff)
Actions #7

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Description updated (diff)
Actions #8

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Description updated (diff)
Actions #9

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Pull request ID set to 46740
Actions #10

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Priority changed from Normal to Urgent
Actions #11

Updated by Neha Ojha almost 2 years ago

  • Status changed from New to Fix Under Review
Actions #12

Updated by Kamoltat (Junior) Sirivadhna over 1 year ago

  • Related to Bug #50089: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of monitors in the cluster added
Actions #13

Updated by Kamoltat (Junior) Sirivadhna over 1 year ago

  • Priority changed from Urgent to Normal