Bug #53448
cephadm: agent failures double reported by two health checks
Status: closed
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Description
When agents are down, they are reported in both the agent down and the failed daemon health checks.
Only one report is really necessary, and the duplication can be confusing: the criteria for agent down differ from those for failed daemon (not reporting in time vs. systemd status), yet being put in the former automatically puts them in the latter.
For example, almost all of the "failed cephadm daemon(s)" reported here are just repeat reports of the agents marked as down:
  cluster:
    id:     f148c330-47c9-11ec-9f19-1dfe2cdc6a6d
    health: HEALTH_ERR
            126 Cephadm Agent(s) are not reporting. Hosts may be offline
            Kernel Security Module (SELinux/AppArmor) is inconsistent for 19 hosts
            131 failed cephadm daemon(s)
            failed to probe daemons or devices
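One possible way to avoid the double report is to leave agents that are already flagged as down out of the failed daemon list. The sketch below only illustrates that idea and is not the actual cephadm change: `DaemonStatus`, `build_health_checks`, and the assumption that down agents are tracked by daemon name are all hypothetical, and the check names are used purely as dictionary labels.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class DaemonStatus:
    # Hypothetical record for illustration; not a real cephadm class.
    name: str          # e.g. "agent.host1" or "osd.3"
    daemon_type: str   # e.g. "agent", "osd", "mon"
    failed: bool       # True when systemd reports the unit as failed


def build_health_checks(daemons: List[DaemonStatus],
                        agents_down: List[str]) -> Dict[str, List[str]]:
    """Map each health check label to the daemons it reports.

    An agent listed in agents_down is reported only by the agent down
    check, never again by the failed daemon check.
    """
    checks: Dict[str, List[str]] = {}
    down = set(agents_down)

    if down:
        checks["CEPHADM_AGENT_DOWN"] = sorted(down)

    # Keep failed daemons, but skip agents already reported as down.
    failed = [
        d.name for d in daemons
        if d.failed and not (d.daemon_type == "agent" and d.name in down)
    ]
    if failed:
        checks["CEPHADM_FAILED_DAEMON"] = sorted(failed)

    return checks
```

With this filtering, a cluster where most failures are down agents would show those agents once, and the failed daemon count would reflect only the remaining daemons.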