Bug #14263

closed

mon: a re-added monitor forms a cluster on its own when its rank is 0

Added by huanwen ren over 8 years ago. Updated over 8 years ago.

Status:
Rejected
Priority:
High
Category:
Monitor
Target version:
-
% Done:

0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Monitor environment:
ceph240, ceph243, ceph244

[root@ceph240 ~]# ceph mon_status -f json-pretty

{
    "name": "ceph244",
    "rank": 2,
    "state": "peon",
    "election_epoch": 7402,
    "quorum": [
        0,
        1,
        2
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 106,
        "fsid": "bd9e23c5-da96-4908-a335-455445480e54",
        "modified": "2016-01-05 19:22:31.516818",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "ceph240",
                "addr": "100.100.100.240:6789\/0" 
            },
            {
                "rank": 1,
                "name": "ceph243",
                "addr": "100.100.100.243:6789\/0" 
            },
            {
                "rank": 2,
                "name": "ceph244",
                "addr": "100.100.100.244:6789\/0" 
            }
        ]
    }
}

Step 1:
Remove ceph240, which has rank 0:
ceph mon remove ceph240

[root@ceph240 ~]# ceph mon_status -f json-pretty

{
    "name": "ceph244",
    "rank": 1,
    "state": "peon",
    "election_epoch": 7404,
    "quorum": [
        0,
        1
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 107,
        "fsid": "bd9e23c5-da96-4908-a335-455445480e54",
        "modified": "2016-01-06 16:14:59.740930",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "ceph243",
                "addr": "100.100.100.243:6789\/0" 
            },
            {
                "rank": 1,
                "name": "ceph244",
                "addr": "100.100.100.244:6789\/0" 
            }
        ]
    }
}
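The two outputs above are consistent with ranks being positions in an address-sorted monitor list: once 100.100.100.240 is removed, ceph243 moves up to rank 0 and ceph244 to rank 1. The sketch below illustrates that renumbering under this assumption. `MonEntry`, `names_by_rank` and `rank_of` are hypothetical names for illustration only and are not Ceph's MonMap code.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for one monmap entry; not Ceph's real type.
struct MonEntry {
  std::string name;
  std::array<int, 4> ip;  // IPv4 octets, e.g. {100, 100, 100, 240}
  int port;
};

// Assumption: a monitor's rank is its index after sorting by (ip, port).
std::vector<std::string> names_by_rank(std::vector<MonEntry> mons) {
  std::sort(mons.begin(), mons.end(),
            [](const MonEntry& a, const MonEntry& b) {
              if (a.ip != b.ip) return a.ip < b.ip;
              return a.port < b.port;
            });
  std::vector<std::string> names;
  for (const auto& m : mons) names.push_back(m.name);
  return names;
}

// Rank of `name` in the sorted map, or -1 if it is not a member.
int rank_of(const std::vector<MonEntry>& mons, const std::string& name) {
  const auto names = names_by_rank(mons);
  const auto it = std::find(names.begin(), names.end(), name);
  return it == names.end() ? -1 : static_cast<int>(it - names.begin());
}
```

Under the same assumption, adding 100.100.100.240 back would return ceph240 to rank 0, since it is the lowest address of the three.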

Step 2:
Fetch the cluster's monitor key and monmap, recreate ceph240's data store, and add ceph240 back to the cluster.
Commands:
(1) ceph auth get mon. -o /tmp/key-filename
(2) ceph mon getmap -o /tmp/map-filename
(3) rm -rf /var/lib/ceph/mon/ceph-ceph240/*
(4) ceph-mon -i ceph240 --mkfs --monmap /tmp/map-filename --keyring /tmp/key-filename
(5) ceph mon add ceph240 100.100.100.240:6789
(6) ceph.conf contents:

*Note: mon_initial_members must be set to ceph240; otherwise the problem described here does not occur.*

[root@ceph240 ~]# more /etc/ceph/ceph.conf 
[global]
fsid = bd9e23c5-da96-4908-a335-455445480e54
public_network = 100.100.100.0/24
cluster_network = 111.111.111.0/24
max_core_file_size = unlimited
*mon_initial_members = ceph240*
mon_host = 100.100.100.240:6789,100.100.100.243:6789,100.100.100.244:6789,
osd_journal_size = 10240
osd_pool_default_size = 3
debug_osd=20/20
debug_mon = 20/20
debug_paxos = 20/20

[mon.ceph240]
host = ceph240

[client]
debug_client = 0/0
debug_objecter = 0/0
debug_rbd      = 0/0 
debug_rados    = 0/0
log file=/var/log/ceph/rbd.log

Step 3:
Check whether ceph240 was added to the cluster.

*It was added successfully.*

[root@ceph240 ~]# ceph mon_status -f json-pretty
{
    "name": "ceph244",
    "rank": 2,
    "state": "peon",
    "election_epoch": 7406,
    "quorum": [
        1,
        2
    ],
    "outside_quorum": [],
    "extra_probe_peers": [],
    "sync_provider": [],
    "monmap": {
        "epoch": 108,
        "fsid": "bd9e23c5-da96-4908-a335-455445480e54",
        "modified": "2016-01-06 16:33:21.905559",
        "created": "0.000000",
        "mons": [
            {
                "rank": 0,
                "name": "ceph240",
                "addr": "100.100.100.240:6789\/0" 
            },
            {
                "rank": 1,
                "name": "ceph243",
                "addr": "100.100.100.243:6789\/0" 
            },
            {
                "rank": 2,
                "name": "ceph244",
                "addr": "100.100.100.244:6789\/0" 
            }
        ]
    }
}
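In the output above, the monmap already lists ceph240 at rank 0, but the quorum array holds only ranks 1 and 2: the monitor is in the map but has not been started yet. A small sketch of that comparison follows. `members_missing_from_quorum` and `is_majority` are hypothetical helpers, unrelated to the `outside_quorum` field in the JSON.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Names of monmap members whose rank does not appear in `quorum`.
// mons_by_rank[i] is the name of the monitor with rank i.
std::vector<std::string> members_missing_from_quorum(
    const std::vector<std::string>& mons_by_rank,
    const std::vector<int>& quorum) {
  std::vector<std::string> missing;
  for (int rank = 0; rank < static_cast<int>(mons_by_rank.size()); ++rank) {
    if (std::find(quorum.begin(), quorum.end(), rank) == quorum.end())
      missing.push_back(mons_by_rank[rank]);
  }
  return missing;
}

// A quorum must contain a strict majority of the monmap members.
bool is_majority(std::size_t quorum_size, std::size_t monmap_size) {
  return quorum_size > monmap_size / 2;
}
```

For the Step 3 output, the monmap is {ceph240, ceph243, ceph244} and the quorum is {1, 2}, so the only missing member is ceph240, and two of three monitors still form a majority.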

Step 4:
Start ceph240.
Command:
service ceph start mon.ceph240

Querying the cluster then fails with the following error:

[root@ceph240 ~]# ceph mon_status -f json-pretty
2016-01-06 16:35:18.756279 7fd957313700  0 librados: client.admin authentication error (1) Operation not permitted
Error connecting to cluster: PermissionError

ceph240 log:
2016-01-06 16:34:32.192786 7fb861dbc880  1 mon.ceph240@0(probing) e0 win_standalone_election
2016-01-06 16:34:32.192801 7fb861dbc880  1 mon.ceph240@0(probing).elector(1) init, last seen epoch 1
2016-01-06 16:34:32.192806 7fb861dbc880 10 mon.ceph240@0(probing).elector(1) bump_epoch 1 to 2
2016-01-06 16:34:32.193464 7fb861dbc880 10 mon.ceph240@0(probing) e0 join_election

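Cause analysis:
When ceph240 restarts, it forms a separate single-monitor cluster on its own instead of rejoining the existing one. This happens when mon_initial_members is set to ceph240 and ceph240's rank is 0, because the singleton-monitor check in Monitor::bootstrap() then succeeds (see win_standalone_election in the log above):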

void Monitor::bootstrap()
{
  ......
  // singleton monitor?
  if (monmap->size() == 1 && rank == 0) {  // only mon
    win_standalone_election();
    return;
  }
  ......
}
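To make the condition concrete, here is a minimal standalone model of the check quoted above. It assumes that ceph240's local monmap contains only itself at startup, which the e0 epoch in the log and the mon_initial_members = ceph240 setting suggest, although the report does not show that map. `BootstrapView`, `BootstrapAction` and `decide` are hypothetical names, not part of Ceph.

```cpp
#include <cassert>
#include <cstddef>

// Simplified stand-in for the state that the quoted check inspects.
struct BootstrapView {
  std::size_t monmap_size;  // number of monitors in the local monmap
  int rank;                 // this monitor's rank in that map
};

enum class BootstrapAction {
  WinStandaloneElection,  // the branch taken in the ceph240 log
  ContinueBootstrap       // everything after the check (elided in the report)
};

// Mirrors only the quoted condition: a one-entry monmap in which this
// monitor holds rank 0 is treated as a single-monitor cluster.
BootstrapAction decide(const BootstrapView& v) {
  if (v.monmap_size == 1 && v.rank == 0)
    return BootstrapAction::WinStandaloneElection;
  return BootstrapAction::ContinueBootstrap;
}
```

With a local view of `{1, 0}` the model picks the standalone election, matching the log. With the full three-monitor map, `{3, 0}`, it does not, even though ceph240 still holds rank 0.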
