Bug #6789

closed

Cannot remove the leader when there are only two monitors

Added by Loïc Dachary over 10 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Normal
Category: Monitor
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport: firefly,dumpling
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On Ubuntu precise with dumpling 0.67.4, create a new cluster with two monitors. Running ceph mon remove name_of_the_leader succeeds, but the removed monitor is added back immediately afterwards. This was discussed today with joao, who suggests looking into extra_probe_peers.

Actions #1

Updated by Joao Eduardo Luis about 10 years ago

This doesn't currently happen on latest. Haven't tested yet with latest dumpling and latest emperor.

Actions #2

Updated by Joao Eduardo Luis about 10 years ago

  • Status changed from New to In Progress

I was wrong. This does happen on current, and emperor, and dumpling.

The monitor has a feature that allows it to attempt to join an existing quorum even if the monitor itself is not in the monmap. However, this should only be allowed for fresh monitors. A monitor that has been marked (in its store) as having belonged to a quorum in the past should not be allowed to boot if it is not in the monmap, as that means it has been removed from the cluster. I'm currently building a patch for this.
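
As a rough illustration of the rule described above (this is not the actual patch, and every name in it is hypothetical), the boot-time decision could be sketched like this. It assumes the monitor's store exposes a flag recording whether the monitor has ever belonged to a quorum:

```python
from dataclasses import dataclass
from typing import Set


@dataclass
class MonitorState:
    """Hypothetical view of what a monitor knows about itself at boot."""
    name: str
    # Assumed flag persisted in the monitor's store: True once the
    # monitor has belonged to a quorum at some point in the past.
    has_joined_quorum: bool


def may_boot(mon: MonitorState, monmap_names: Set[str]) -> bool:
    """Decide whether the monitor may boot and probe for a quorum."""
    if mon.name in monmap_names:
        # Still a member of the cluster: booting is always fine.
        return True
    # Not in the monmap: only a fresh monitor may try to join.
    # One that was in a quorum before has been removed and must not boot.
    return not mon.has_joined_quorum
```

With a monmap containing a and b, a fresh monitor c would be allowed to probe, while a monitor c whose store says it was once in a quorum would be refused.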

Actions #3

Updated by Joao Eduardo Luis about 10 years ago

Also, it's relevant to mention that this does not happen only with the leader. Any monitor that is removed from the monmap will attempt to join any existing quorum upon boot.

This is also an issue with upstart scripts and any other tool that restarts a mon once it shuts down. Although there is a bug in the monitor that allows a removed monitor to boot and find its way into the quorum again, anything that keeps restarting dead services is an active participant in the daemon being restarted. No patch to the mon can fix that; it can only prevent the mon from becoming an active participant in the cluster.

Actions #5

Updated by Joao Eduardo Luis about 10 years ago

  • Status changed from In Progress to Fix Under Review

Actions #6

Updated by Loïc Dachary about 10 years ago

Cool :-)

Actions #7

Updated by Joao Eduardo Luis about 10 years ago

  • Assignee set to Joao Eduardo Luis

Actions #8

Updated by Sage Weil about 10 years ago

  • Status changed from Fix Under Review to Pending Backport

Actions #9

Updated by Sage Weil over 9 years ago

  • Status changed from Pending Backport to Resolved
  • Backport set to firefly,dumpling