Bug #21986

closed

ceph: the second mon cannot join the quorum

Added by linghucong linghucong over 6 years ago. Updated over 6 years ago.

Status: Closed
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: ceph-disk
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The second mon (node-1151) never manages to join the quorum.

root@node-1151:/home/pkg/tmp# ceph -v
ceph version 13.0.0-2613-gce6ba63 (ce6ba63e143b194dc6f42f0f9620df8673161da7) mimic (dev)
root@node-1151:/home/pkg/tmp# ps -ef|grep ceph-mon
root 22837 1 62 19:20 ? 00:03:11 ceph-mon -i node-1151
root 22975 20949 0 19:25 pts/0 00:00:00 grep --color=auto ceph-mon

root@node-1152:/home/pkg/tmp# ceph -s
  cluster:
    id:     3edc30f3-2157-4251-b94c-2a81db839bc8
    health: HEALTH_WARN
            too many PGs per OSD (320 > max 300)
            1/3 mons down, quorum node-1150,node-1152

  services:
    mon: 3 daemons, quorum node-1150,node-1152, out of quorum: node-1151
    mgr: node-1150(active), standbys: node-1151, node-1152
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   5 pools, 320 pgs
    objects: 16674 objects, 95901 MB
    usage:   287 GB used, 2509 GB / 2797 GB avail
    pgs:     320 active+clean

mon log:

2017-10-31 19:27:41.866 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c1ab000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610232 vs existing csq=610231 existing_state=STATE_STANDBY
2017-10-31 19:27:41.866 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17
2017-10-31 19:27:41.866 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c1a9800 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610234 vs existing csq=610233 existing_state=STATE_STANDBY
2017-10-31 19:27:41.866 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17
2017-10-31 19:27:41.866 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c1a2800 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610236 vs existing csq=610235 existing_state=STATE_STANDBY
2017-10-31 19:27:41.866 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17
2017-10-31 19:27:41.870 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c19e000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610238 vs existing csq=610237 existing_state=STATE_STANDBY
2017-10-31 19:27:41.870 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17
2017-10-31 19:27:41.870 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c192800 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610240 vs existing csq=610239 existing_state=STATE_STANDBY
2017-10-31 19:27:41.870 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17
2017-10-31 19:27:41.870 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c1ab000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610242 vs existing csq=610241 existing_state=STATE_STANDBY
2017-10-31 19:27:41.870 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17
2017-10-31 19:27:41.870 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c1a9800 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610244 vs existing csq=610243 existing_state=STATE_STANDBY
2017-10-31 19:27:41.874 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17
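
The log above alternates between two line shapes: an accepted connection from the peer with an increasing connect_seq, followed by a "can't decode unknown message type" error. For a longer log it may help to summarize these rather than read them line by line. The following is a minimal Python sketch for that; the function name `summarize` and the regular expressions are illustrative assumptions based only on the lines pasted here, not part of any Ceph tooling.

```python
import re
from collections import Counter

# Both patterns are derived from the two line shapes in the pasted mon log;
# other Ceph versions may format these messages differently.
DECODE_RE = re.compile(r"can't decode unknown message type (\d+)")
CONNECT_RE = re.compile(r">> (\S+) conn\(.*accept connect_seq (\d+) vs existing")


def summarize(lines):
    """Count undecodable message types and collect connect_seq values per peer."""
    unknown_types = Counter()
    seqs_by_peer = {}
    for line in lines:
        m = DECODE_RE.search(line)
        if m:
            unknown_types[int(m.group(1))] += 1
            continue
        m = CONNECT_RE.search(line)
        if m:
            peer, seq = m.group(1), int(m.group(2))
            seqs_by_peer.setdefault(peer, []).append(seq)
    return unknown_types, seqs_by_peer


# Four lines copied from the log in this report.
sample = [
    "2017-10-31 19:27:41.866 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c1ab000 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610232 vs existing csq=610231 existing_state=STATE_STANDBY",
    "2017-10-31 19:27:41.866 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17",
    "2017-10-31 19:27:41.866 7fb72a771700 0 -- 10.11.1.151:6789/0 >> 10.11.1.150:6789/0 conn(0x559a7c1a9800 :6789 s=STATE_ACCEPTING_WAIT_CONNECT_MSG_AUTH pgs=0 cs=0 l=0).handle_connect_msg accept connect_seq 610234 vs existing csq=610233 existing_state=STATE_STANDBY",
    "2017-10-31 19:27:41.866 7fb72a771700 0 can't decode unknown message type 1537 MSG_AUTH=17",
]

unknown_types, seqs_by_peer = summarize(sample)
for msg_type, count in unknown_types.items():
    print(f"unknown message type {msg_type}: {count} occurrence(s)")
for peer, seqs in seqs_by_peer.items():
    print(f"peer {peer}: connect_seq {seqs[0]}..{seqs[-1]} over {len(seqs)} attempt(s)")
```

To run it against a full log, pass an open file object instead of `sample`, for example `summarize(open(path))` where `path` is wherever the mon log is stored on the node.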


Related issues 1 (0 open, 1 closed)

Related to Ceph - Bug #21770: ceph mon core dump when use ceph osd perf cmd. (Resolved, Joao Eduardo Luis, 10/12/2017)
