Bug #49810

open

rados/singleton: with msgr-failures/none MON_DOWN due to haven't formed initial quorum, EBUSY

Added by Neha Ojha about 3 years ago. Updated about 1 year ago.

Status: Need More Info
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-03-11T03:49:36.963223+0000 mon.a (mon.0) 39 : cluster [WRN] Health check failed: 1/3 mons down, quorum a,b (MON_DOWN)
2021-03-11T03:49:36.963223+0000 mon.a (mon.0) 39 : cluster [WRN] Health check failed: 1/3 mons down, quorum a,b (MON_DOWN)
2021-03-11T03:49:36.963223+0000 mon.a (mon.0) 39 : cluster [WRN] Health check failed: 1/3 mons down, quorum a,b (MON_DOWN)
2021-03-11T03:49:37.007501+0000 mon.a (mon.0) 48 : cluster [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum a,b)
2021-03-11T03:49:37.007501+0000 mon.a (mon.0) 48 : cluster [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum a,b)
2021-03-11T03:49:37.007501+0000 mon.a (mon.0) 48 : cluster [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum a,b)
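To put the cluster log lines above in perspective, here is a minimal sketch that computes how long the MON_DOWN health check stayed raised. The two timestamps are copied from the log lines above; the helper name `mon_down_duration_ms` is hypothetical and only for illustration.

```python
from datetime import datetime

# Timestamp layout used by the cluster log lines quoted in this report.
TS_FORMAT = "%Y-%m-%dT%H:%M:%S.%f%z"


def mon_down_duration_ms(raised_ts: str, cleared_ts: str) -> float:
    """Return the time between two cluster log timestamps, in milliseconds.

    Hypothetical helper, not part of Ceph or teuthology.
    """
    raised = datetime.strptime(raised_ts, TS_FORMAT)
    cleared = datetime.strptime(cleared_ts, TS_FORMAT)
    return (cleared - raised).total_seconds() * 1000


if __name__ == "__main__":
    # "Health check failed" and "Health check cleared" timestamps from the log.
    ms = mon_down_duration_ms(
        "2021-03-11T03:49:36.963223+0000",
        "2021-03-11T03:49:37.007501+0000",
    )
    print(f"MON_DOWN was raised for about {ms:.1f} ms")
```

With these two timestamps the warning was raised for roughly 44 ms, which suggests a transient condition during the initial election rather than a monitor that stayed down.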

2021-03-11T03:49:36.942+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request con 0x560ff51a7c00 (start) method 2 payload 18
2021-03-11T03:49:36.942+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY
2021-03-11T03:49:36.942+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff51a7c00 0x560ff51ed900 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-03-11T03:49:36.942+0000 7ff144e67700 10 mon.c@2(electing) e1 ms_handle_reset 0x560ff51a7c00 -
2021-03-11T03:49:36.955+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff5243000 0x560ff51ec500 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).accept
2021-03-11T03:49:36.955+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff5243000 0x560ff51ec500 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
2021-03-11T03:49:36.955+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request con 0x560ff5243000 (start) method 2 payload 18
2021-03-11T03:49:36.955+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY
2021-03-11T03:49:36.955+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff5243000 0x560ff51ec500 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-03-11T03:49:36.955+0000 7ff143e65700  1 -- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] reap_dead start
2021-03-11T03:49:36.955+0000 7ff144e67700 10 mon.c@2(electing) e1 ms_handle_reset 0x560ff5243000 -
2021-03-11T03:49:36.957+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff5243000 0x560ff51ec500 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).accept
2021-03-11T03:49:36.957+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff5243000 0x560ff51ec500 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
2021-03-11T03:49:36.957+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request con 0x560ff5243000 (start) method 2 payload 18
2021-03-11T03:49:36.957+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY
2021-03-11T03:49:36.957+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff5243000 0x560ff51ec500 secure :-1 s=AUTH_ACCEPTING pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2021-03-11T03:49:36.957+0000 7ff144e67700 10 mon.c@2(electing) e1 ms_handle_reset 0x560ff5243000 -
2021-03-11T03:49:36.964+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff51a7400 0x560ff51eca00 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).accept
2021-03-11T03:49:36.964+0000 7ff143e65700  1 --2- [v2:172.21.15.17:3302/0,v1:172.21.15.17:6791/0] >>  conn(0x560ff51a7400 0x560ff51eca00 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
2021-03-11T03:49:36.964+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request con 0x560ff51a7400 (start) method 2 payload 18
2021-03-11T03:49:36.964+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY
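When a re-occurrence does produce logs, it may help to count how often a monitor rejected auth requests before quorum formed. The following is a rough sketch that assumes the monitor log line layout shown in the excerpt above; the regex, the `count_ebusy` helper, and the `SAMPLE` list are illustrative only and are not part of any Ceph tooling.

```python
import re
from collections import Counter

# Matches lines such as:
#   <timestamp> <thread id> <level> mon.c@2(electing) e1 handle_auth_request
#   haven't formed initial quorum, EBUSY
# The layout is an assumption based on the excerpt in this report.
EBUSY_RE = re.compile(
    r"^(?P<ts>\S+) \S+\s+\d+ (?P<mon>mon\.\S+?)@\d+\((?P<state>\w+)\) .*"
    r"handle_auth_request haven't formed initial quorum, EBUSY$"
)


def count_ebusy(lines):
    """Count EBUSY auth rejections per (monitor name, monitor state)."""
    counts = Counter()
    for line in lines:
        match = EBUSY_RE.match(line.strip())
        if match:
            counts[(match.group("mon"), match.group("state"))] += 1
    return counts


# A few lines copied from the mon.c excerpt above.
SAMPLE = [
    "2021-03-11T03:49:36.942+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request con 0x560ff51a7c00 (start) method 2 payload 18",
    "2021-03-11T03:49:36.942+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY",
    "2021-03-11T03:49:36.955+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY",
    "2021-03-11T03:49:36.957+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY",
    "2021-03-11T03:49:36.964+0000 7ff143e65700 10 mon.c@2(electing) e1 handle_auth_request haven't formed initial quorum, EBUSY",
]

if __name__ == "__main__":
    for (mon, state), n in count_ebusy(SAMPLE).items():
        print(f"{mon} ({state}): {n} EBUSY rejections")
```

To run it against a full monitor log, pass an open file object instead of `SAMPLE`, since the function only needs an iterable of lines.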

rados/singleton/{all/mon-config mon_election/classic msgr-failures/none msgr/async objectstore/bluestore-hybrid rados supported-random-distro$/{rhel_8}}

/a/sseshasa-2021-03-10_16:16:17-rados-wip-sseshasa-testing-2021-03-10-1933-distro-basic-smithi/5953362/

Actions #1

Updated by jianwei zhang about 1 year ago

Does anybody know why this happens?

Actions #2

Updated by Radoslaw Zarzynski about 1 year ago

  • Status changed from New to Need More Info

No re-occurrence has been recorded in over 2 years, so we would need to wait for a new one to get logs.
