Documentation #12620

note the behaviour that "ceph mon add <mon-id>" takes forever in a one-monitor cluster

Added by Kefu Chai over 8 years ago. Updated over 8 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Monitor
Target version:
-
% Done:

0%

Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

in a single-monitor cluster, suppose a user tries to add another monitor with the following commands (taking the vstart cluster as an example):

$ CEPH_NUM_MDS=0 CEPH_NUM_OSD=0 CEPH_NUM_MON=1 ./vstart.sh -n -l -x
$ ./ceph auth get mon. -o /tmp/keyring
$ ./ceph mon getmap -o /tmp/monmap
$ ./ceph-mon -i b --mkfs --monmap /tmp/monmap --keyring /tmp/keyring
$ ./ceph mon add b 127.0.0.1:6790

the last command does not return. after receiving the command, the existing monitor tries to form a quorum and keeps sending probe messages to the other monitors, in vain. the election can only complete once more than n/2 peers have replied, but the new monitor has not been started yet, so it never replies. the existing monitor therefore keeps probing until the new joiner is up and running and has been found by the existing monitor. only then do they finish the election and accept the proposal.
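the majority arithmetic behind this can be sketched in shell. this is only an illustration: `quorum_size` is a hypothetical helper, not a ceph command, and it assumes a quorum is a strict majority of the monitors listed in the monmap, as described above.

```shell
# quorum_size N: smallest number of monitors that is more than N/2.
# Hypothetical helper for illustration only; not part of ceph.
quorum_size() {
    echo $(( $1 / 2 + 1 ))
}

quorum_size 1   # current monmap, only mon.a: 1 monitor is enough
quorum_size 2   # proposed monmap, mon.a and mon.b: both must reply
```

with two monitors in the proposed monmap, the majority is two, so the existing monitor cannot finish the election on its own while mon.b is down.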

we can probably live with this, but we should add a note to the documentation that the "ceph mon add" command does not return until the monitors form a quorum.

we should also update the documentation to transpose the following two commands:

 ceph mon add <mon-id> <ip>[:<port>]

and

 ceph-mon -i {mon-id} --public-addr {ip:port}

see http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#adding-monitors
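a minimal sketch of the transposed order, reusing the mon id and address from the vstart example. `run` and `add_mon` are hypothetical helpers, and `DRY_RUN` is an assumed variable that defaults to printing the commands rather than executing them. it also assumes ceph-mon goes to the background after starting, so that the second command can run.

```shell
# DRY_RUN defaults to 1, so the sketch only prints the commands.
DRY_RUN=${DRY_RUN:-1}

# run CMD...: print the command in dry-run mode, execute it otherwise.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

# add_mon ID ADDR: hypothetical wrapper showing the proposed order.
add_mon() {
    mon_id=$1
    addr=$2
    # start the new monitor first, so it can answer the probes
    run ceph-mon -i "$mon_id" --public-addr "$addr"
    # only then ask the cluster to add it to the monmap
    run ceph mon add "$mon_id" "$addr"
}

add_mon b 127.0.0.1:6790
```

in this order the new monitor is already running when the existing one starts probing, so "ceph mon add" should not have to wait for a peer that is down.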


Related issues

Copied from Ceph - Bug #12569: "ceph mon add <mon-id>" takes forever in a one-monitor cluster Won't Fix 08/03/2015

Associated revisions

Revision b199ac6c (diff)
Added by Kefu Chai over 8 years ago

doc/rados/operations/add-or-rm-mons: simplify the steps to add a mon

this change removes the "ceph mon add" step that preceded starting a new
monitor. the existing leader starts an election when it sees
the MMonJoin message sent by the new joiner, and after the quorum is
achieved, the monmap is updated with the new monitor.
so, "ceph mon add" is not necessary to add a new monitor.
moreover, this command blocks until a new quorum is formed
and the proposed monmap is accepted. but when adding a monitor
to a single-monitor cluster, the leader waits until at least two
of the monitors reply to it, which does not happen unless
the new monitor is started. so from the user's point of view, this
command hangs until it times out if he/she does not start mon.b
beforehand. this is expected behaviour, though.

so, to avoid this confusion and simplify the steps to add a new
monitor, we'd better simply remove this "ceph mon add" step.

Fixes: #12620
Signed-off-by: Kefu Chai <>

History

#1 Updated by Kefu Chai over 8 years ago

  • Description updated (diff)

#2 Updated by Kefu Chai over 8 years ago

  • Description updated (diff)

#3 Updated by Kefu Chai over 8 years ago

  • Status changed from New to In Progress

#4 Updated by Kefu Chai over 8 years ago

  • Status changed from In Progress to Resolved
