Documentation #12620
note the behaviour that "ceph mon add <mon-id>" takes forever in a one-monitor cluster
Description
In a single-monitor cluster, suppose a user tries to add another monitor as follows (taking a vstart cluster as an example):
$ CEPH_NUM_MDS=0 CEPH_NUM_OSD=0 CEPH_NUM_MON=1 ./vstart.sh -n -l -x
$ ./ceph auth get mon. -o /tmp/keyring
$ ./ceph mon getmap -o /tmp/monmap
$ ./ceph-mon -i b --mkfs --monmap /tmp/monmap --keyring /tmp/keyring
$ ./ceph mon add b 127.0.0.1:6790
The last command does not return. After receiving the command, the existing monitor tries to form a quorum, but its probe messages to the other monitors go unanswered. It can only proceed once more than n/2 of the peers have replied, and it never gets a reply from the new monitor, which has not been started yet. So it keeps probing until the new joiner is up and running and has been found by the existing monitor. Only then do they finish the election and accept the proposal.
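The majority rule described above can be sketched with a little shell arithmetic. This is only a simplified illustration, not Ceph's actual election code: the helper names quorum_needed and has_quorum are hypothetical, and the count of responding monitors is assumed to include the existing monitor itself.

```shell
#!/bin/sh
# Simplified illustration of the "more than n/2" rule; not Ceph code.
# quorum_needed and has_quorum are hypothetical helper names.

# quorum_needed TOTAL: smallest number of monitors that is more than TOTAL/2
quorum_needed() {
    echo $(( $1 / 2 + 1 ))
}

# has_quorum TOTAL RESPONDING: succeeds if RESPONDING monitors form a majority
has_quorum() {
    [ "$2" -ge "$(quorum_needed "$1")" ]
}
```

With two monitors in the monmap, `has_quorum 2 1` fails while `has_quorum 2 2` succeeds, which matches the hang: the single running monitor cannot reach a majority until mon.b is started.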
We can probably live with this behaviour. Instead of changing it, add a note to the documentation that the "ceph mon add" command does not return until the monitors form a quorum.
The documentation should also be updated to transpose the following two commands:
ceph mon add <mon-id> <ip>[:<port>]
and
ceph-mon -i {mon-id} --public-addr {ip:port}
see http://docs.ceph.com/docs/master/rados/operations/add-or-rm-mons/#adding-monitors
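The transposed order can be sketched as a dry run for the vstart example, assuming the one-monitor cluster is already running. The command lines are the ones quoted in this description. The run and add_mon_steps helpers and the MON_ID, MON_ADDR and RUN variables are hypothetical names used only for illustration. By default the script only prints the steps, and it executes them only when RUN=1 is set.

```shell
#!/bin/sh
# Sketch of the reordered steps; prints them unless RUN=1 is set.
# run, add_mon_steps, MON_ID, MON_ADDR and RUN are illustrative names,
# not part of Ceph.
MON_ID="${MON_ID:-b}"
MON_ADDR="${MON_ADDR:-127.0.0.1:6790}"

run() {
    echo "$*"
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    fi
}

add_mon_steps() {
    run ./ceph auth get mon. -o /tmp/keyring
    run ./ceph mon getmap -o /tmp/monmap
    run ./ceph-mon -i "$MON_ID" --mkfs --monmap /tmp/monmap --keyring /tmp/keyring
    # Start the new monitor first, so that it can answer probes ...
    run ./ceph-mon -i "$MON_ID" --public-addr "$MON_ADDR"
    # ... and only then ask the cluster to add it.
    run ./ceph mon add "$MON_ID" "$MON_ADDR"
}
```

Calling `add_mon_steps` prints the five commands in order, with the `ceph-mon ... --public-addr` line ahead of `ceph mon add`.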
Associated revisions
doc/rados/operations/add-or-rm-mons: simplify the steps to add a mon
this change removes the "ceph mon add" step that preceded starting a
new monitor. the existing leader starts an election when it sees
the MMonJoin message sent by the new joiner, and after the quorum is
achieved, the monmap is updated with the new monitor.
so, "ceph mon add" is not necessary to add a new monitor.
moreover, this command blocks until a new quorum is formed
and the proposed monmap is accepted. but when adding a monitor
to a single-monitor cluster, the leader waits until at least two
of the monitors reply to it. this does not happen unless
the new monitor is started. so from the user's point of view, this
command hangs until it times out if they have not started mon.b
beforehand. this is expected behaviour, though.
so, to avoid this confusion and simplify the steps to add a new
monitor, we'd better simply remove this "ceph mon add" step.
Fixes: #12620
Signed-off-by: Kefu Chai <kchai@redhat.com>
History
#1 Updated by Kefu Chai over 8 years ago
- Description updated
#2 Updated by Kefu Chai over 8 years ago
- Description updated
#3 Updated by Kefu Chai over 8 years ago
- Status changed from New to In Progress
#4 Updated by Kefu Chai over 8 years ago
- Status changed from In Progress to Resolved