Bug #55695

Shutting down a monitor forces Paxos to restart and sometimes disregard subsequent commands

Added by Kamoltat (Junior) Sirivadhna 7 months ago. Updated 6 months ago.

Status:
Fix Under Review
Priority:
Urgent
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Problem:

mon.a
mon.b
mon.c
mon.d
mon.e

ceph -a stop mon.d
ceph mon remove d

.
.

mon.d is down, but it was not actually removed and the monmap was not updated.

Explanation:

Shutting down mon.d blocks Paxos::begin on mon.a (assuming Paxos is in the middle of an update). mon.a, the leader, sends a begin message (MMonPaxos::OP_BEGIN) to all peer monitors, including mon.d, because mon.get_quorum() has not been updated yet and still contains mon.d. Paxos cannot proceed to the commit() phase because mon.d never replies with an accept message (MMonPaxos::OP_ACCEPT). Eventually the lease on one of the monitors expires and that monitor calls for an election. If the monmap proposal ("mon remove", "name": "d") arrives before the election happens, it is queued in pending_finisher and is discarded once the election starts. As a result, the remove command does not take effect.

However, if the election finishes before the monmap proposal arrives, everything works: mon.get_quorum() is updated, so the leader no longer sends MMonPaxos::OP_BEGIN to mon.d, is not blocked, and continues to the commit phase. The problem is therefore non-deterministic.
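The two orderings described above can be sketched with a toy model. This is not Ceph code: ToyLeader, propose, election, and run are hypothetical names, and the model assumes only what the explanation states (begin waits for every member of the believed quorum, and an election drops queued proposals and refreshes the quorum).

```python
from dataclasses import dataclass, field


@dataclass
class ToyLeader:
    """Toy stand-in for the leader's Paxos state (hypothetical, not Ceph code)."""
    quorum: set                                     # monitors the leader believes are in quorum
    alive: set                                      # monitors that can actually reply
    in_flight: bool = False                         # a begin is still waiting for accepts
    pending: list = field(default_factory=list)     # proposals queued behind the stalled round
    committed: list = field(default_factory=list)   # proposals that took effect

    def propose(self, value):
        # While a round is stalled, new proposals are only queued.
        if self.in_flight:
            self.pending.append(value)
            return
        self._begin(value)

    def _begin(self, value):
        # Assumption from the report: commit needs an accept from
        # every monitor in the believed quorum.
        self.in_flight = True
        if self.quorum <= self.alive:
            self.committed.append(value)
            self.in_flight = False

    def election(self):
        # An election restarts the round: queued proposals are discarded
        # and the quorum is refreshed. The toy also drops the stalled
        # in-flight value; it does not model any recovery of it.
        self.pending.clear()
        self.in_flight = False
        self.quorum = set(self.alive)


def run(remove_before_election: bool) -> list:
    mons = {"a", "b", "c", "d", "e"}
    leader = ToyLeader(quorum=set(mons), alive=mons - {"d"})
    leader.propose("in-progress update")   # stalls: mon.d never accepts
    if remove_before_election:
        leader.propose("mon remove d")     # queued behind the stalled round
        leader.election()                  # queue is discarded
    else:
        leader.election()                  # quorum no longer contains mon.d
        leader.propose("mon remove d")     # commits
    return leader.committed


print(run(remove_before_election=True))
print(run(remove_before_election=False))
```

In this sketch the only difference between the two runs is whether the remove proposal arrives before or after the election, which matches the non-determinism described above.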

Shutting down a monitor effectively forces Paxos to restart, because in Paxos::handle_accept (the accept phase) every monitor in the quorum must reply with an accept, and that cannot happen once a quorum member has been shut down. Restarting Paxos risks clearing a pending_proposal that was initiated by a client, such as removing a monitor.

Do we actually need all monitors to reply in the accept phase, or just more than mon.get_quorum().size()/2?
https://github.com/ceph/ceph/blob/main/src/mon/Paxos.cc#L806-L812
By comparison, in the prepare phase of Paxos we only check against mon.monmap->size()/2 before beginning the process: https://github.com/ceph/ceph/blob/main/src/mon/Paxos.cc#L622-L624
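To make the question concrete, here is the arithmetic for the five-monitor example with mon.d down. The two functions are hypothetical illustrations of the two candidate rules, not the checks in Paxos.cc, and the sketch does not say whether relaxing the rule would be safe.

```python
def all_accepted(accepted: set, quorum: set) -> bool:
    # Rule described in the report: every quorum member must accept.
    return accepted == quorum


def majority_accepted(accepted: set, quorum: set) -> bool:
    # Rule the question proposes: strictly more than half of the quorum.
    return len(accepted) > len(quorum) // 2


quorum = {"a", "b", "c", "d", "e"}    # stale quorum that still contains mon.d
accepted = {"a", "b", "c", "e"}       # mon.d is down and never replies

print(all_accepted(accepted, quorum))
print(majority_accepted(accepted, quorum))
```

With four accepts out of a believed quorum of five, the all-members rule is not satisfied while the majority rule is, which is the case the question is asking about.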


Related issues

Related to RADOS - Bug #50089: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of monitors in the cluster Resolved

History

#2 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Subject changed from Removing 2 MONs before monmap update results in incorrect monmap rank to Shutdown before remove can cause Paxos to restart and clear pending_proposal for monmap change

#3 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Description updated (diff)

#4 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Description updated (diff)

#5 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Subject changed from Shutdown before remove can cause Paxos to restart and clear pending_proposal for monmap change to Shutting down a monitor forces Paxos to restart and sometimes disregard subsequent commands

#6 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Description updated (diff)

#7 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Description updated (diff)

#8 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Description updated (diff)

#9 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Pull request ID set to 46740

#10 Updated by Kamoltat (Junior) Sirivadhna 6 months ago

  • Priority changed from Normal to Urgent

#11 Updated by Neha Ojha 6 months ago

  • Status changed from New to Fix Under Review

#12 Updated by Kamoltat (Junior) Sirivadhna 5 months ago

  • Related to Bug #50089: mon/MonMap.h: FAILED ceph_assert(m < ranks.size()) when reducing number of monitors in the cluster added
