Bug #22992
mon: add RAM usage (including avail) to HealthMonitor::check_member_health?
Description
I'm looking into several MON_DOWN failures from
It's been suggested before that this may be due to memory pressure on the machine since so many daemons are hosted alongside the mons.
It would be useful to periodically log the mon's memory usage and the available memory on the system, both to verify this and to detect low memory in deployments. I think the natural place to do this is HealthMonitor::check_member_health, where we already check disk space:
2018-02-13 04:53:40.993 7f982eb8c700 10 mon.c@2(peon).health check_member_health avail 99% total 15250 MB, used 110 MB, avail 15139 MB
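For illustration, here is a rough, Linux-only sketch of how the memory numbers could be gathered. It is not existing Ceph code: `read_kb_field`, `MemSnapshot`, `sample_memory`, and `format_mem_line` are hypothetical names, and the output wording only loosely mirrors the disk-space line above. It assumes `MemTotal` and `MemAvailable` are read from `/proc/meminfo` and the daemon's resident set size from the `VmRSS` field of `/proc/self/status`.

```cpp
#include <fstream>
#include <istream>
#include <sstream>
#include <string>

// Hypothetical helper (not Ceph code): scan "Key:   12345 kB" lines, the
// layout used by /proc/meminfo and /proc/self/status, and return the value
// in kB. Returns -1 if the key is absent or its value cannot be parsed.
long read_kb_field(std::istream& in, const std::string& key) {
  const std::string prefix = key + ":";
  std::string line;
  while (std::getline(in, line)) {
    if (line.compare(0, prefix.size(), prefix) != 0)
      continue;  // not the field we want
    std::istringstream fields(line.substr(prefix.size()));
    long kb = -1;
    if (fields >> kb)
      return kb;
    return -1;
  }
  return -1;
}

// Hypothetical container for one sample; all values are in kB, -1 = unknown.
struct MemSnapshot {
  long total_kb;
  long avail_kb;
  long rss_kb;
};

// Take one sample from procfs. Each field is read from a freshly opened
// stream so the result does not depend on the order of lines in the file.
MemSnapshot sample_memory() {
  MemSnapshot m;
  std::ifstream total_in("/proc/meminfo");
  m.total_kb = read_kb_field(total_in, "MemTotal");
  std::ifstream avail_in("/proc/meminfo");
  m.avail_kb = read_kb_field(avail_in, "MemAvailable");
  std::ifstream status_in("/proc/self/status");
  m.rss_kb = read_kb_field(status_in, "VmRSS");
  return m;
}

// Render the sample in a form similar to the existing disk-space debug line.
std::string format_mem_line(const MemSnapshot& m) {
  const long pct = m.total_kb > 0 ? m.avail_kb * 100 / m.total_kb : 0;
  std::ostringstream out;
  out << "mem avail " << pct << "% total " << m.total_kb / 1024
      << " MB, avail " << m.avail_kb / 1024
      << " MB, rss " << m.rss_kb / 1024 << " MB";
  return out.str();
}
```

The periodic health check could call `sample_memory()` and log `format_mem_line()` next to the disk-space output. Whether low available memory should also raise a health warning, and at what threshold, is left open here.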
Thoughts?
History
#1 Updated by Patrick Donnelly about 6 years ago
Turned out it was just the monitor being thrashed (didn't realize we were doing that in kcephfs!): #22993
Still, memory usage checking may be useful!