Bug #49371
open
Misleading alarm if all MDS daemons have failed
% Done:
0%
Source:
Community (user)
Tags:
Backport:
pacific,octopus
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Seen on Ceph v14.2.9 in a containerised cluster with 3 MDS nodes.
Both standby MDS containers are manually stopped. With only 1 MDS remaining, `ceph health` reports a sensible alarm:
health: HEALTH_WARN
insufficient standby MDS daemons available
Then I manually stop the final, active MDS daemon.
Expected:
`ceph health` should report an alarm that there are no active MDS daemons and all filesystems are degraded / inactive.
Actual:
`ceph health` continues to report "insufficient standby MDS daemons available". There are no new alarms about the total lack of active MDS daemons.
health: HEALTH_WARN
insufficient standby MDS daemons available
`ceph status` shows:
mds: cephfs:1 {0=albamons_sc2=up:active(laggy or crashed)}
If I then stop the active (and only remaining) MGR, `ceph health` does report an alarm:
health: HEALTH_WARN
no active mgr
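Until the health check covers this case, the `mds:` line from `ceph status` is the only place the failure shows up. As a rough workaround sketch (not part of Ceph), the Python below scans that line for ranks flagged "laggy or crashed". The line format is assumed from the single example in this report and may differ in other releases; the helper names `laggy_ranks` and `all_ranks_laggy` are hypothetical.

```python
import re

# Matches rank entries such as "0=albamons_sc2=up:active(laggy or crashed)".
# Assumption: the layout follows the `ceph status` line quoted in this report.
RANK_RE = re.compile(
    r"(\d+)=([^=\s{}]+)=([a-z]+:[a-z_-]+)(\(laggy or crashed\))?"
)


def laggy_ranks(mds_line):
    """Return (rank, name, state) for every rank flagged 'laggy or crashed'."""
    return [
        (int(rank), name, state)
        for rank, name, state, laggy in RANK_RE.findall(mds_line)
        if laggy
    ]


def all_ranks_laggy(mds_line):
    """True if the line lists at least one rank and every rank is flagged."""
    ranks = RANK_RE.findall(mds_line)
    return bool(ranks) and all(laggy for *_, laggy in ranks)


if __name__ == "__main__":
    line = "mds: cephfs:1 {0=albamons_sc2=up:active(laggy or crashed)}"
    print(laggy_ranks(line))
    print(all_ranks_laggy(line))
```

A monitoring script could pass the `mds:` line from `ceph status` to `all_ranks_laggy` and raise its own alert when the result is true, since `ceph health` stays at "insufficient standby MDS daemons available" in this scenario.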