monitor: ceph -s shows all monitors out of quorum for < 1s
The output of ceph -s shows all monitors as out of quorum for a very short time (< 1s).
The issue is likely to have no real effect on the cluster, but it could potentially confuse the user
and trigger a false alarm.
First observed on:
Stretch cluster with 5 MONs, 4 OSDs, and 2 stretch buckets
The problem is not deterministic; running ceph -s under a watch command is recommended when trying to reproduce it.
Fail 1 zone. Output of ceph -s captured during the transient state:
  cluster:
    id:     385a428b-c9a6-475a-83f7-172fc2e9973a
    health: HEALTH_WARN
            We are missing stretch mode buckets, only requiring 1 of 2 buckets to peer
            2/5 mons down, quorum a,b,e
            2 osds down
            2 hosts (2 osds) down
            1 zone (2 osds) down

  services:
    mon: 5 daemons, quorum  (age 18h), out of quorum: a, b, e, f, g
    mgr: a(active, since 55m), standbys: b
    mds: 1/1 daemons up, 1 hot standby
    osd: 4 osds: 2 up (since 0.224087s), 4 in (since 23h)
    rgw: 2 daemons active (2 hosts, 1 zones)

  data:
    volumes: 1/1 healthy
    pools:   11 pools, 177 pgs
    objects: 553 objects, 324 MiB
    usage:   2.7 GiB used, 397 GiB / 400 GiB avail
    pgs:     177 active+clean
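
Because the state lasts less than a second, it is easy to miss by eye. The following is a minimal sketch of a polling helper that may make reproduction easier. It assumes the plain-text "mon:" line has the shape shown in the output above; the function names quorum_members and watch, the regular expression, and the 0.2 s interval are illustrative choices, not part of Ceph. Each ceph -s call takes some time itself, so the window can still be missed.

```python
import re
import subprocess
import time

# Assumed shape of the "mon:" line in plain-text `ceph -s` output, e.g.
#   mon: 5 daemons, quorum a,b,e (age 18h), out of quorum: f, g
MON_LINE = re.compile(r"mon:\s*(\d+) daemons, quorum\s*([^()]*?)\s*\(age")


def quorum_members(status_text):
    """Return the monitor names listed as in quorum, or None if no mon line is found."""
    match = MON_LINE.search(status_text)
    if match is None:
        return None
    # An empty list means the quorum field was blank, which is the symptom reported here.
    return [name.strip() for name in match.group(2).split(",") if name.strip()]


def watch(interval=0.2):
    """Poll `ceph -s` and print the full output whenever the quorum list is empty."""
    while True:
        result = subprocess.run(["ceph", "-s"], capture_output=True, text=True)
        if quorum_members(result.stdout) == []:
            print(time.strftime("%H:%M:%S"), "ceph -s reported an empty quorum:")
            print(result.stdout)
        time.sleep(interval)


if __name__ == "__main__":
    watch()
```

Run the script on a node with a working ceph CLI before failing the zone, and stop it with Ctrl-C once the output has been captured.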