Bug #57969
openmonitor: ceph -s shows all monitors out of quorum for < 1s
Status:
New
Priority:
Low
Assignee:
-
Category:
Monitor
Target version:
-
% Done:
0%
Source:
Tags:
low-hanging-fruit
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
The ceph -s output shows all monitors as out of quorum for a very short time (< 1s).
The issue is likely to have no real effect on the cluster, but it could potentially confuse users
and trigger false alarms.
First observed on:
Stretch cluster, 5 MONs, 4 OSDs, 2 stretch buckets
The problem does not reproduce deterministically; when trying to reproduce it, run ceph -s under a watch command with a short interval.
Steps to reproduce: fail 1 zone, then check the mon line of the ceph -s output:
  cluster:
    id:     385a428b-c9a6-475a-83f7-172fc2e9973a
    health: HEALTH_WARN
            We are missing stretch mode buckets, only requiring 1 of 2 buckets to peer
            2/5 mons down, quorum a,b,e
            2 osds down
            2 hosts (2 osds) down
            1 zone (2 osds) down

  services:
    mon: 5 daemons, quorum  (age 18h), out of quorum: a, b, e, f, g
    mgr: a(active, since 55m), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 4 osds: 2 up (since 0.224087s), 4 in (since 23h)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 177 pgs
    objects: 553 objects, 324 MiB
    usage:   2.7 GiB used, 397 GiB / 400 GiB avail
    pgs:     177 active+clean
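Because the bad state lasts well under a second, it is easy to miss by eye. The following is a minimal Python sketch that might help with reproduction. It polls ceph -s and flags any sample whose mon line has an empty quorum list while every monitor is listed as out of quorum, as in the output above. The names parse_mon_line, is_transient_glitch and watch_status are hypothetical helpers, not part of Ceph, and the regular expression assumes the plain-text mon line format shown in this report. Each ceph -s call takes time of its own, so the loop may still miss some occurrences.

```python
import re
import subprocess
import time

# Assumed format of the "mon:" line in plain-text `ceph -s` output, e.g.
#   mon: 5 daemons, quorum a,b,e (age 18h), out of quorum: f, g
MON_LINE = re.compile(
    r"mon:\s*(?P<total>\d+) daemons?, quorum\s*(?P<quorum>[^()]*?)\s*\(age [^)]*\)"
    r"(?:, out of quorum:\s*(?P<out>.*))?"
)


def parse_mon_line(status_text):
    """Return (total, in_quorum, out_of_quorum), or None if no mon line is found."""
    match = MON_LINE.search(status_text)
    if match is None:
        return None

    def names(raw):
        # Monitor names are comma-separated; tolerate optional spaces.
        return re.findall(r"[\w.-]+", raw or "")

    return int(match["total"]), names(match["quorum"]), names(match["out"])


def is_transient_glitch(total, in_quorum, out_of_quorum):
    """True when no monitor is in quorum and all of them are listed as out."""
    return total > 0 and not in_quorum and len(out_of_quorum) == total


def watch_status(interval=0.2):
    """Poll `ceph -s` forever and print every sample that shows the glitch."""
    while True:
        result = subprocess.run(
            ["ceph", "-s"], capture_output=True, text=True, check=True
        )
        parsed = parse_mon_line(result.stdout)
        if parsed is not None and is_transient_glitch(*parsed):
            print(time.strftime("%H:%M:%S"), "all monitors reported out of quorum")
            print(result.stdout)
        time.sleep(interval)


if __name__ == "__main__":
    watch_status()
```

Run the script on a node with a working ceph client and admin keyring, then fail one zone. Any printed sample can be attached to this report along with its timestamp.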