Bug #45441
Status: Closed
rados: "Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log
Description
/a/yuriw-2020-05-05_15:20:13-rados-wip-yuri8-testing-2020-05-04-2117-octopus-distro-basic-smithi/5024888
mon.0 = a  mon.1 = b  mon.2 = c
monmap e1: 3 mons at {a=[v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0],b=[v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0],c=[v2:172.21.15.133:3302/0,v1:172.21.15.133:6791/0]}

From the log of 'a', look for the warning:

2020-05-06T02:34:28.022+0000 7fe456d4d700 0 log_channel(cluster) log [WRN] : Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)

Look for the previous reference to mon.1 (b):

2020-05-06T02:34:23.019+0000 7fe456d4d700 5 mon.a@0(electing).elector(7) handle_propose from mon.1
2020-05-06T02:34:23.019+0000 7fe456d4d700 10 mon.a@0(electing).elector(7) handle_propose required features 2449958747315978244 mon_feature_t([kraken,luminous,mimic,osdmap-prune,nautilus,octopus]), peer features 4540138292836696063 mon_feature_t([kraken,luminous,mimic,osdmap-prune,nautilus,octopus])
2020-05-06T02:34:23.019+0000 7fe45ad55700 1 -- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] conn(0x5605838fcc00 msgr2=0x560582736880 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 34
2020-05-06T02:34:23.019+0000 7fe45ad55700 1 -- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] conn(0x5605838fcc00 msgr2=0x560582736880 secure :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
2020-05-06T02:34:23.019+0000 7fe45ad55700 1 --2- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] conn(0x5605838fcc00 0x560582736880 secure :-1 s=READY pgs=4 cs=0 l=0 rx=0x5605834560f0 tx=0x56058343d920).handle_read_frame_preamble_main read frame length and tag failed r=-1 ((1) Operation not permitted)
2020-05-06T02:34:23.019+0000 7fe45ad55700 1 --2- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] conn(0x5605838fcc00 0x560582736880 unknown :-1 s=READY pgs=4 cs=0 l=0 rx=0 tx=0)._fault with nothing to send, going to standby
2020-05-06T02:34:23.020+0000 7fe455d4b700 1 --2- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> conn(0x5605835e2c00 0x560583472d80 unknown :-1 s=NONE pgs=0 cs=0 l=0 rx=0 tx=0).accept
2020-05-06T02:34:23.020+0000 7fe45ad55700 1 --2- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> conn(0x5605835e2c00 0x560583472d80 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rx=0 tx=0)._handle_peer_banner_payload supported=0 required=0
2020-05-06T02:34:23.020+0000 7fe45ad55700 0 -- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> conn(0x5605835e2c00 msgr2=0x560583472d80 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until injecting socket failure
2020-05-06T02:34:23.020+0000 7fe45ad55700 1 -- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> conn(0x5605835e2c00 msgr2=0x560583472d80 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 34
2020-05-06T02:34:23.020+0000 7fe45ad55700 1 -- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> conn(0x5605835e2c00 msgr2=0x560583472d80 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
2020-05-06T02:34:23.020+0000 7fe45ad55700 1 --2- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> conn(0x5605835e2c00 0x560583472d80 unknown :-1 s=AUTH_ACCEPTING pgs=0 cs=0 l=0 rx=0 tx=0).handle_read_frame_preamble_main read frame length and tag failed r=-1 ((1) Operation not permitted)
2020-05-06T02:34:23.020+0000 7fe45ad55700 1 --2- [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] >> conn(0x5605835e2c00 0x560583472d80 unknown :-1 s=AUTH_ACCEPTING pgs=0 cs=0 l=0 rx=0 tx=0).stop
2020-05-06T02:34:23.020+0000 7fe456d4d700 10 mon.a@0(electing) e1 ms_handle_reset 0x5605835e2c00 -

From the log of 'b':

2020-05-06T02:34:23.018+0000 7fc09da60700 10 paxos.1).electionLogic(6) bump_epoch6 to 7
2020-05-06T02:34:23.018+0000 7fc0a2269700 1 --2- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] conn(0x5567c7ff2c00 0x5567c6e2c880 unknown :-1 s=THROTTLE_DONE pgs=4 cs=0 l=0 rx=0x5567c7b4c300 tx=0x5567c7b33920).handle_read_frame_epilogue_main read frame epilogue bytes=32
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing) e1 join_election
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing) e1 _reset
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).auth v2 _set_mon_num_rank num 0 rank 0
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing) e1 cancel_probe_timeout (none scheduled)
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing) e1 timecheck_finish
2020-05-06T02:34:23.019+0000 7fc09da60700 15 mon.b@1(electing) e1 health_tick_stop
2020-05-06T02:34:23.019+0000 7fc09da60700 15 mon.b@1(electing) e1 health_interval_stop
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing) e1 scrub_event_cancel
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing) e1 scrub_reset
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxos(paxos recovering c 1..8) restart -- canceling timeouts
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(mdsmap 1..1) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(osdmap 1..2) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(logm 1..3) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(monmap 1..1) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(auth 1..2) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(mgr 1..1) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(mgrstat 1..1) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(health 1..1) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 10 mon.b@1(electing).paxosservice(config 1..2) restart
2020-05-06T02:34:23.019+0000 7fc09da60700 1 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] send_to--> mon [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] -- election(a2feb1eb-2f05-4b4f-bae6-96248acb7c3c propose rel 15 e7) v8 -- ?+0 0x5567c6dcdb00
2020-05-06T02:34:23.019+0000 7fc09da60700 1 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] --> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] -- election(a2feb1eb-2f05-4b4f-bae6-96248acb7c3c propose rel 15 e7) v8 -- 0x5567c6dcdb00 con 0x5567c7ff2c00
2020-05-06T02:34:23.019+0000 7fc09da60700 1 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] send_to--> mon [v2:172.21.15.133:3302/0,v1:172.21.15.133:6791/0] -- election(a2feb1eb-2f05-4b4f-bae6-96248acb7c3c propose rel 15 e7) v8 -- ?+0 0x5567c7c2e6c0
2020-05-06T02:34:23.019+0000 7fc09da60700 1 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] --> [v2:172.21.15.133:3302/0,v1:172.21.15.133:6791/0] -- election(a2feb1eb-2f05-4b4f-bae6-96248acb7c3c propose rel 15 e7) v8 -- 0x5567c7c2e6c0 con 0x5567c7ff2000
2020-05-06T02:34:23.019+0000 7fc09ca5e700 1 --2- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3302/0,v1:172.21.15.133:6791/0] conn(0x5567c7ff2000 0x5567c6e2cd00 secure :-1 s=READY pgs=6 cs=0 l=0 rx=0x5567c7b4c0f0 tx=0x5567c7b92000).handle_read_frame_epilogue_main read frame epilogue bytes=32
2020-05-06T02:34:23.019+0000 7fc0a2269700 1 --2- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] conn(0x5567c7ff2c00 0x5567c6e2c880 unknown :-1 s=READY pgs=4 cs=0 l=0 rx=0x5567c7b4c300 tx=0x5567c7b33920).handle_read_frame_epilogue_main read frame epilogue bytes=32
2020-05-06T02:34:23.019+0000 7fc0a2269700 0 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] conn(0x5567c7ff2c00 msgr2=0x5567c6e2c880 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until injecting socket failure
2020-05-06T02:34:23.019+0000 7fc0a2269700 1 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] conn(0x5567c7ff2c00 msgr2=0x5567c6e2c880 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_bulk peer close file descriptor 31
2020-05-06T02:34:23.019+0000 7fc0a2269700 1 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] conn(0x5567c7ff2c00 msgr2=0x5567c6e2c880 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until read failed
2020-05-06T02:34:23.019+0000 7fc0a2269700 1 --2- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] conn(0x5567c7ff2c00 0x5567c6e2c880 unknown :-1 s=READY pgs=4 cs=0 l=0 rx=0x5567c7b4c300 tx=0x5567c7b33920).handle_read_frame_preamble_main read frame length and tag failed r=-1 ((1) Operation not permitted)
2020-05-06T02:34:23.019+0000 7fc0a2269700 1 --2- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] >> [v2:172.21.15.133:3300/0,v1:172.21.15.133:6789/0] conn(0x5567c7ff2c00 0x5567c6e2c880 unknown :-1 s=READY pgs=4 cs=0 l=0 rx=0 tx=0)._fault with nothing to send, going to standby
2020-05-06T02:34:23.019+0000 7fc09da60700 1 -- [v2:172.21.15.133:3301/0,v1:172.21.15.133:6790/0] <== mon.1 v2:172.21.15.133:3301/0 0 ==== log(1 entries from seq 4 at 2020-05-06T02:34:23.018657+0000) v1 ==== 0+0+0 (unknown 0 0 0) 0x5567c7bf8c40 con 0x5567c6ea5400
This seems to be related to the failure injection and we probably need to whitelist 'MON_DOWN' warnings in the cases where we are injecting failures.
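For reference, such an exemption is normally expressed as a teuthology suite override on the ceph task. A minimal sketch of what the proposed change might look like (the fragment name and placement are illustrative, not taken from this tracker):

```yaml
# Hypothetical override fragment for suites that inject failures.
# The ignorelist entry tells teuthology not to fail the run when a
# transient MON_DOWN health warning appears in the cluster log.
overrides:
  ceph:
    log-ignorelist:
      - \(MON_DOWN\)
```

On branches predating the ignorelist rename, the same list is spelled `log-whitelist`.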
Updated by Sridhar Seshasayee almost 4 years ago
Observing the issue during this run:
/a/sseshasa-2020-06-24_17:46:09-rados-wip-sseshasa-testing-2020-06-24-1858-distro-basic-smithi/5176239
Failure Reason:
"2020-06-24T20:49:52.611951+0000 mon.a (mon.0) 40 : cluster [WRN] Health check failed: 1/3 mons down, quorum a,b (MON_DOWN)" in cluster log
job ID: 5176239
Updated by Brad Hubbard over 3 years ago
/a/yuriw-2020-08-06_00:31:28-rados-wip-yuri8-testing-octopus-distro-basic-smithi/5290942
Updated by Deepika Upadhyay over 3 years ago
yuriw-2020-08-20_19:48:15-rados-wip-yuri-testing-2020-08-17-1723-octopus-distro-basic-smithi/5362945/
Updated by Neha Ojha over 3 years ago
- Backport set to octopus
/a/teuthology-2020-09-22_07:01:02-rados-master-distro-basic-smithi/5458722
Updated by Neha Ojha over 3 years ago
Neha Ojha wrote:
/a/teuthology-2020-09-22_07:01:02-rados-master-distro-basic-smithi/5458722
rados/thrash/{0-size-min-size-overrides/3-size-2-min-size 1-pg-log-overrides/short_pg_log 2-recovery-overrides/{more-async-recovery} backoff/peering_and_degraded ceph clusters/{fixed-2 openstack} crc-failures/default d-balancer/on mon_election/connectivity msgr-failures/osd-delay msgr/async objectstore/bluestore-comp-snappy rados supported-random-distro$/{rhel_8} thrashers/none thrashosds-health workloads/cache-pool-snaps}
Updated by Neha Ojha over 3 years ago
rados/singleton/{all/osd-recovery-incomplete mon_election/connectivity msgr-failures/many msgr/async-v1only objectstore/bluestore-comp-lz4 rados supported-random-distro$/{rhel_8}}
/a/teuthology-2020-09-22_07:01:02-rados-master-distro-basic-smithi/5458825
Updated by Neha Ojha over 3 years ago
rados/singleton/{all/osd-recovery-incomplete mon_election/connectivity msgr-failures/many msgr/async-v2only objectstore/bluestore-comp-zlib rados supported-random-distro$/{rhel_8}}
/a/teuthology-2020-09-30_07:01:02-rados-master-distro-basic-smithi/5483697
Updated by Neha Ojha over 3 years ago
rados/thrash-erasure-code-overwrites/{bluestore-bitmap ceph clusters/{fixed-2 openstack} fast/fast mon_election/connectivity msgr-failures/osd-delay rados recovery-overrides/{more-async-recovery} supported-random-distro$/{rhel_8} thrashers/morepggrow thrashosds-health workloads/ec-pool-snaps-few-objects-overwrites}
/a/teuthology-2020-10-21_07:01:02-rados-master-distro-basic-smithi/5545056
Updated by Neha Ojha over 3 years ago
rados/singleton/{all/mon-config mon_election/connectivity msgr-failures/many msgr/async objectstore/bluestore-comp-lz4 rados supported-random-distro$/{centos_8}}
/a/teuthology-2020-10-28_07:01:02-rados-master-distro-basic-smithi/5567279
Updated by Neha Ojha over 3 years ago
- Has duplicate Bug #44694: MON_DOWN during cluster setup added
Updated by Brad Hubbard over 3 years ago
/a/teuthology-2020-10-28_07:01:02-rados-master-distro-basic-smithi/5567279 shows the following at the time the failure was reported.
2020-10-28T08:59:57.904+0000 7fc52cda0700 10 mon.c@2(peon).elector(6) handle_ping mon_ping(ping_reply stamp 2020-10-28T08:59:57.904334+0000) v1
2020-10-28T08:59:57.945+0000 7fc52bd9e700 1 --2- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2400 0x558c45bb7900 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).accept
2020-10-28T08:59:57.945+0000 7fc530da8700 1 --2- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2400 0x558c45bb7900 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
2020-10-28T08:59:57.945+0000 7fc530da8700 0 -- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2400 msgr2=0x558c45bb7900 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read_until injecting socket failure
2020-10-28T08:59:57.945+0000 7fc530da8700 1 -- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2400 msgr2=0x558c45bb7900 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_bulk peer close file descriptor 37
2020-10-28T08:59:57.945+0000 7fc530da8700 1 -- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2400 msgr2=0x558c45bb7900 unknown :-1 s=STATE_CONNECTION_ESTABLISHED l=1).read_until read failed
2020-10-28T08:59:57.945+0000 7fc530da8700 1 --2- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2400 0x558c45bb7900 unknown :-1 s=AUTH_ACCEPTING pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0).handle_read_frame_preamble_main read frame preamble failed r=-1 ((1) Operation not permitted)
2020-10-28T08:59:57.945+0000 7fc530da8700 1 --2- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2400 0x558c45bb7900 unknown :-1 s=AUTH_ACCEPTING pgs=0 cs=0 l=1 rev1=1 rx=0 tx=0).stop
2020-10-28T08:59:57.945+0000 7fc52cda0700 10 mon.c@2(peon) e1 ms_handle_reset 0x558c45bb2400 -
2020-10-28T08:59:58.346+0000 7fc52bd9e700 1 --2- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2000 0x558c45bb7400 unknown :-1 s=NONE pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0).accept
2020-10-28T08:59:58.346+0000 7fc530da8700 1 --2- [v2:172.21.15.188:3302/0,v1:172.21.15.188:6791/0] >> conn(0x558c45bb2000 0x558c45bb7400 unknown :-1 s=BANNER_ACCEPTING pgs=0 cs=0 l=0 rev1=0 rx=0 tx=0)._handle_peer_banner_payload supported=1 required=0
I guess we could try something like https://github.com/ceph/ceph/pull/30714/commits/ded58ef91d6c8a68de49fa2c6b6e01636515c59b with 'MON_DOWN'? The question is whether this is going to mask real issues, and why the 'log-ignorelist' entries haven't been required all along.
Updated by Neha Ojha over 3 years ago
rados/singleton/{all/lost-unfound-delete mon_election/connectivity msgr-failures/many msgr/async-v1only objectstore/bluestore-comp-zstd rados supported-random-distro$/{ubuntu_latest}}
/a/teuthology-2020-12-10_07:01:01-rados-master-distro-basic-smithi/5697910
Updated by Neha Ojha over 3 years ago
- Assignee changed from Brad Hubbard to Neha Ojha
Updated by Neha Ojha over 3 years ago
rados/singleton-nomsgr/{all/recovery-unfound-found mon_election/connectivity rados supported-random-distro$/{centos_8}}
/a/teuthology-2021-01-11_07:01:02-rados-master-distro-basic-smithi/5777338/
Updated by Neha Ojha over 3 years ago
The rados/singleton MON_DOWN failures with msgr-failures/many should be safe to ignore.
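For context, the msgr-failures fragments referenced throughout these runs work by raising the messenger's socket-failure injection rate, which is what produces the "injecting socket failure" lines in the logs above. A sketch of what such a fragment sets (the numeric value here is illustrative, not taken from the actual suite):

```yaml
# Hypothetical msgr-failures fragment: randomly drop established
# sockets roughly once per N messenger operations, forcing frequent
# reconnects and mon elections like the ones seen in this bug.
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000   # illustrative rate
```

With injection this aggressive, a monitor can miss enough election traffic to be briefly marked down, which is why these MON_DOWN warnings are considered ignorable here.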
/a/nojha-2021-01-20_23:46:47-rados-wip-32027-distro-basic-smithi/5806233/
/a/nojha-2021-01-20_23:46:47-rados-wip-32027-distro-basic-smithi/5806155/
Updated by Deepika Upadhyay about 3 years ago
description: rados/singleton/{all/osd-recovery-incomplete msgr-failures/many msgr/async-v1only objectstore/bluestore-comp-zlib rados supported-random-distro$/{centos_latest}}
Updated by Sage Weil about 3 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 39586
Updated by Sage Weil about 3 years ago
- Backport changed from octopus to pacific,octopus
Updated by Kefu Chai about 3 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot about 3 years ago
- Copied to Backport #49401: pacific: rados: "Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log added
Updated by Backport Bot about 3 years ago
- Copied to Backport #49402: octopus: rados: "Health check failed: 1/3 mons down, quorum a,c (MON_DOWN)" in cluster log added
Updated by Loïc Dachary about 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".