Bug #48567
closed
octopus: "cluster [WRN] Health detail: HEALTH_WARN 1/3 mons down, quorum c,a" in upgrade:nautilus-x-octopus
Description
This is for 15.2.8 release
Runs:
https://pulpito.ceph.com/teuthology-2020-12-10_16:13:48-upgrade:nautilus-x-octopus-distro-basic-smithi/
https://pulpito.ceph.com/yuriw-2020-12-09_23:07:56-upgrade:nautilus-x-octopus-distro-basic-smithi/
Jobs:
['5696843', '5696844']
['5698719', '5698727']
Logs:
/a/teuthology-2020-12-10_16:13:48-upgrade:nautilus-x-octopus-distro-basic-smithi/5698719/teuthology.log
/a/yuriw-2020-12-09_23:07:56-upgrade:nautilus-x-octopus-distro-basic-smithi/5696843/teuthology.log
failure_reason: '"2020-12-10T21:56:51.664776+0000 mon.c (mon.1) 115 : cluster [WRN] Health detail: HEALTH_WARN 1/3 mons down, quorum c,a" in cluster log'
Seems to be introduced after 11/27/20
https://pulpito.ceph.com/?suite=upgrade%3Anautilus-x&branch=octopus
Updated by Neha Ojha over 3 years ago
- Subject changed from "cluster [WRN] Health detail: HEALTH_WARN 1/3 mons down, quorum c,a" in upgrade:nautilus-x-octopus to octopus: "cluster [WRN] Health detail: HEALTH_WARN 1/3 mons down, quorum c,a" in upgrade:nautilus-x-octopus
Updated by Neha Ojha over 3 years ago
- Status changed from New to Triaged
- Priority changed from Urgent to Normal
I think the problem is that https://github.com/ceph/ceph/pull/38118 has merged in nautilus while https://github.com/ceph/ceph/pull/38345 hasn't merged in octopus. In an N->O upgrade test, the nautilus mons add the extra health detail to the clog, but nothing sets "mon health detail to clog = false", because the test suite being used is the octopus one, which doesn't have that change yet.
The fact that this warning only appears while the mons are running nautilus supports the above theory.
2020-12-10 21:56:51.662 7f4ec0c3e700 0 log_channel(cluster) log [WRN] : Health detail: HEALTH_WARN 1/3 mons down, quorum c,a
.
.
2020-12-10 21:56:51.806 7f4ec2441700 7 mon.c@1(peon).log v68 update_from_paxos applying incremental log 68 2020-12-10 21:56:51.749969 mon.b (mon.0) 8 : cluster [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum c,a)
.
.
2020-12-10T22:24:33.202+0000 7f776b6b2540 0 ceph version 15.2.7-678-gd0234e1e (d0234e1ede269851a029cd700e16721cae756cc6) octopus (stable), process ceph-mon, pid 16286
https://github.com/ceph/ceph/pull/38118 merged 11 days ago which aligns with "Seems to be introduced after 11/27/20 https://pulpito.ceph.com/?suite=upgrade%3Anautilus-x&branch=octopus"
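For anyone who wants to repeat the timing check on another mon log, here is a rough sketch of the comparison made above: it looks for the "Health detail" warning and for the first octopus ceph-mon startup line, then checks that every warning comes first. The helper names (parse_ts, warning_precedes_upgrade) and the two match strings are assumptions made for illustration, not part of teuthology or Ceph. It also assumes both timestamp layouts are in the same time zone and compares at one-second precision.

```python
import re
from datetime import datetime

# Matches both timestamp layouts seen in the excerpt:
# "2020-12-10 21:56:51.662" and "2020-12-10T22:24:33.202+0000".
_TS = re.compile(r"^(\d{4}-\d{2}-\d{2})[ T](\d{2}:\d{2}:\d{2})")

# The three log lines quoted in this comment.
EXCERPT = [
    "2020-12-10 21:56:51.662 7f4ec0c3e700 0 log_channel(cluster) log [WRN] : "
    "Health detail: HEALTH_WARN 1/3 mons down, quorum c,a",
    "2020-12-10 21:56:51.806 7f4ec2441700 7 mon.c@1(peon).log v68 update_from_paxos "
    "applying incremental log 68 2020-12-10 21:56:51.749969 mon.b (mon.0) 8 : "
    "cluster [INF] Health check cleared: MON_DOWN (was: 1/3 mons down, quorum c,a)",
    "2020-12-10T22:24:33.202+0000 7f776b6b2540 0 ceph version 15.2.7-678-gd0234e1e "
    "(d0234e1ede269851a029cd700e16721cae756cc6) octopus (stable), process ceph-mon, pid 16286",
]


def parse_ts(line):
    """Return the leading timestamp of a log line (second precision), or None."""
    m = _TS.match(line)
    if m is None:
        return None
    return datetime.strptime(f"{m.group(1)} {m.group(2)}", "%Y-%m-%d %H:%M:%S")


def warning_precedes_upgrade(
    lines,
    warning="Health detail: HEALTH_WARN",                 # assumed match string
    version_marker="octopus (stable), process ceph-mon",  # assumed match string
):
    """True if every warning line is older than the first octopus mon start.

    Returns None when either kind of line is missing, so "no evidence"
    is not confused with "warning seen after the upgrade".
    """
    warn_times = [parse_ts(l) for l in lines if warning in l]
    start_times = [parse_ts(l) for l in lines if version_marker in l]
    warn_times = [t for t in warn_times if t is not None]
    start_times = [t for t in start_times if t is not None]
    if not warn_times or not start_times:
        return None
    return max(warn_times) < min(start_times)


if __name__ == "__main__":
    print(warning_precedes_upgrade(EXCERPT))
```

To use it on a full log, read the file into a list of lines and pass that in place of EXCERPT. A None result means the log did not contain both kinds of line.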
Updated by Yuri Weinstein over 3 years ago
Updated by Nathan Cutler over 3 years ago
- Status changed from Triaged to Resolved
- Pull request ID set to 38345
@Neha, so this is fixed by https://github.com/ceph/ceph/pull/38345?