Bug #9623
Status: Closed
On a cluster with 3 mons, stopping 2 mons made the cluster inaccessible, with I/Os hung/paused
Description
A cluster with "n" monitor nodes becomes inaccessible if "n-1" of the monitors are down.
This has been observed on a cluster with 3 monitor nodes running Ceph version 0.84.
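For context, Ceph monitors maintain quorum only while a strict majority of the monitors in the monmap are up, so with 3 monitors the cluster can tolerate the loss of at most 1. A minimal sketch of the majority rule (the helper names here are illustrative, not Ceph source):

```python
# Sketch of the monitor majority-quorum rule, assuming the standard
# Paxos-style requirement of floor(n/2) + 1 live monitors.

def quorum_size(num_mons: int) -> int:
    """Minimum number of live monitors needed to form a quorum."""
    return num_mons // 2 + 1

def has_quorum(num_mons: int, num_alive: int) -> bool:
    """True if the surviving monitors can still reach a majority."""
    return num_alive >= quorum_size(num_mons)

# The scenario from this report: 3 monitors in the monmap, 2 stopped.
# The lone survivor cannot reach the majority of 2, so it sits in
# "probing"/"electing" and the cluster blocks I/O.
print(has_quorum(3, 1))  # False: no quorum with only 1 of 3 mons up
print(has_quorum(3, 2))  # True: stopping only 1 mon keeps quorum
```

Under this rule, the observed hang with 2 of 3 monitors stopped is the monitors refusing to serve without a majority, rather than a crash of the remaining daemon.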
Steps to reproduce:
Test 1: By stopping the monitor service
1> Create a cluster with 3 monitor nodes
2> Start IO onto the cluster
3> Stop the monitor service on two of the monitor nodes (sudo stop ceph-mon-all)
It is observed that:
a> The cluster became inaccessible
b> I/Os are in a paused state
c> The mon status of the 3rd node stays in the "probing" state forever
test@rack6-client-4:~$ sudo ceph --admin-daemon /var/run/ceph/ceph-mon.rack6-client-5.asok mon_status
{ "name": "rack6-client-5",
"rank": 1,
"state": "probing",
"election_epoch": 60,
"quorum": [],
"outside_quorum": [
"rack6-client-5"],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "fe2afe2d-1096-4c3e-a91c-73ccfad84851",
"modified": "0.000000",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "rack6-client-4",
"addr": "10.242.43.105:6789\/0"},
{ "rank": 1,
"name": "rack6-client-5",
"addr": "10.242.43.106:6789\/0"},
{ "rank": 2,
"name": "rack6-client-6",
"addr": "10.242.43.107:6789\/0"}]}}
test@rack6-client-4:~$
Test 2: By exiting from quorum
1> Create a cluster with 3 monitor nodes
2> Start IO onto the cluster
3> Make two of the monitors leave the quorum via their admin sockets (sudo ceph --admin-daemon /var/run/ceph/ceph-mon.rack6-client-5.asok quorum exit)
It is observed that:
a> The cluster became inaccessible
b> I/Os are in a paused state
c> The mon status of the 3rd node stays in the "electing" state forever
test@rack6-client-4:~$ sudo ceph --admin-daemon /var/run/ceph/ceph-mon.rack6-client-5.asok mon_status
{ "name": "rack6-client-5",
"rank": 1,
"state": "electing",
"election_epoch": 55,
"quorum": [],
"outside_quorum": [],
"extra_probe_peers": [],
"sync_provider": [],
"monmap": { "epoch": 1,
"fsid": "fe2afe2d-1096-4c3e-a91c-73ccfad84851",
"modified": "0.000000",
"created": "0.000000",
"mons": [
{ "rank": 0,
"name": "rack6-client-4",
"addr": "10.242.43.105:6789\/0"},
{ "rank": 1,
"name": "rack6-client-5",
"addr": "10.242.43.106:6789\/0"},
{ "rank": 2,
"name": "rack6-client-6",
"addr": "10.242.43.107:6789\/0"}]}}
test@rack6-client-4:~$