Bug #22142
Updated by Jan Fajerski over 6 years ago
To reproduce:
<pre><code class="text">
# start a vstart cluster
../src/vstart.sh -n -s -d
# start the prometheus module for health status; the dashboard shows the same info
bin/ceph mgr module enable prometheus
# confirm healthy cluster state
curl 192.168.178.4:9283/metrics | grep "ceph_health_status 0.0"
# kill a mon and wait a bit for status to change
kill `cat out/mon.a.pid`
sleep 1m
# check for health warn
bin/ceph -s
# mgr modules still show healthy
curl 192.168.178.4:9283/metrics | grep "ceph_health_status 1.0" # 1.0 is warn
curl 192.168.178.4:9283/metrics | grep "ceph_health_status 0.0" # 0.0 is healthy
</code></pre>

Alternatively, check the mgr dashboard (see screenshot). It seems that sometimes the status propagates correctly, i.e. the dashboard and prometheus module show the WARN state.
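
Since the status only sometimes propagates, it may help to extract the reported value in one step rather than grepping for each state separately. The following is a minimal sketch, not part of the original report: the helper names `health_value` and `health_name` are hypothetical, and it assumes the metric is exported as a plain `ceph_health_status <value>` line without labels, as the grep patterns above suggest. Only the two values named in this report (0.0 healthy, 1.0 warn) are mapped.

```shell
#!/bin/sh
# Hypothetical helper: read Prometheus text from stdin and print the value
# of the ceph_health_status metric. Returns non-zero if the metric is absent.
health_value() {
    awk '$1 == "ceph_health_status" { print $2; found = 1; exit }
         END { if (!found) exit 1 }'
}

# Hypothetical helper: map the numeric value to the names used in this report.
# Any other value is reported as unknown rather than guessed.
health_name() {
    case "$1" in
        0|0.0) echo healthy ;;
        1|1.0) echo warn ;;
        *)     echo "unknown($1)" ;;
    esac
}

# Example use against the endpoint from the reproduction steps:
#   v=$(curl -s 192.168.178.4:9283/metrics | health_value) && health_name "$v"
```

Running the commented example before and after killing the mon, next to `bin/ceph -s`, makes it easy to see whether the module's view matches the cluster's.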