Bug #44677
stale scrub status entry from a failed mds shows up in `ceph status`
Status:
Resolved
Priority:
Normal
Assignee:
Category:
Administration/Usability
Target version:
% Done:
0%
Source:
Community (dev)
Tags:
Backport:
nautilus,octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This happens intermittently. When an active MDS (mds.b) is terminated, mds.c transitions to active, but the task status section still shows scrub status for both MDSs:
    ceph -s
    ...
    ...
      services:
        mon: 1 daemons, quorum a (age 24s)
        mgr: x(active, since 18s)
        mds: a:1 {0=b=up:active} 2 up:standby
        osd: 3 osds: 3 up (since 18s), 3 in (since 2h)

      task status:
        scrub status:
            mds.b: idle
            mds.c: idle
    ...
    ...
ceph-mgr should ideally prune older entries after `mgr_service_beacon_grace` seconds, but that doesn't happen. The issue is that ceph-mgr receives an updated fsmap and removes the failed MDS's entries from its tracking index (`daemon_state`). However, `DaemonServer::_prune_pending_service_map()` requires the MDS entry to be present in the tracking index in order to prune stale entries from the service map. So those stale entries remain in the service map until ceph-mgr is restarted (or fails over).
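To illustrate the ordering problem described above, here is a toy model in Python. It is not Ceph's actual code: the function, the dictionaries, and the timestamps are hypothetical stand-ins for the service map and the `daemon_state` tracking index. It assumes the prune step simply skips any entry it cannot find in the index, which matches the behaviour reported here.

```python
# Toy model only: not Ceph's actual code. All names below are hypothetical
# stand-ins for the structures named in this report.

def prune_pending_service_map(service_map, daemon_state, now, grace):
    """Drop service-map entries whose last beacon is older than `grace`.

    Assumption: an entry can only be judged stale if the daemon is still
    present in the tracking index; otherwise it is skipped.
    """
    pruned = []
    for name in list(service_map):
        last_beacon = daemon_state.get(name)
        if last_beacon is None:
            # Daemon already dropped from the tracking index (e.g. after an
            # fsmap update): nothing to compare against, so the entry stays.
            continue
        if now - last_beacon > grace:
            del service_map[name]
            pruned.append(name)
    return pruned


# mds.b has failed; mds.c is the new active. Both still have task status.
service_map = {"mds.b": "idle", "mds.c": "idle"}
daemon_state = {"mds.b": 0.0, "mds.c": 100.0}  # name -> last beacon time

# The updated fsmap removes the failed daemon from the tracking index first.
del daemon_state["mds.b"]

pruned = prune_pending_service_map(service_map, daemon_state, now=100.0, grace=60.0)
print(pruned)               # []
print(sorted(service_map))  # ['mds.b', 'mds.c']
```

In this model, if the index entry for mds.b were still present when the prune step ran, mds.b would be removed once its last beacon was older than the grace period. Because the index entry is removed first, the stale service-map entry is never examined.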