Feature #53919
Feature #47072: mgr/dashboard: Usability Improvements
mgr/dashboard: hint users about host resource utilization and service allocation
0%
Description
AKA: "moving a daemon to another host to help balance resources across the cluster"
However, IMHO this kind of manual allocation/pinning should be avoided. Ideally, the cephadm scheduler should be smart enough to allocate services according to resources:
- Either statically at service creation time
- Or dynamically if the resource utilization changes over time.
Since neither of the features described above is implemented yet, we should define a way in the dashboard to manually allocate daemons (service instances) to less loaded nodes.
In the absence of dynamic service scheduling, we would expect the following user workflow:
1. If a host exceeds a resource utilization threshold (85% of RSS memory, CPU or network), measured as a 1-minute average rather than an instantaneous value, an alert should be triggered.
2. Optionally, in the hosts panel we could highlight the affected hosts (however, so far we're not displaying real-time stats but the gather-facts ones, which are static).
3. The user could reallocate some service (ideally the most resource-consuming one) to other nodes, by using direct placement/pinning (e.g.: host7, host8, host9) or labels.
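As a rough illustration of steps 1 and 3, here is a minimal Python sketch. It is not dashboard or cephadm code: HostMonitor, suggest_move, the 10-second sampling interval and the input dictionaries are all hypothetical names and assumptions, and it only shows a rolling-average threshold check plus a naive "heaviest daemon to least loaded host" hint.

```python
from collections import deque
from statistics import mean

THRESHOLD = 0.85  # 85% utilization, as in step 1
WINDOW = 6        # assumption: six samples taken every 10 s, i.e. about 1 minute


class HostMonitor:
    """Tracks a rolling average of one utilization metric for one host."""

    def __init__(self, window=WINDOW, threshold=THRESHOLD):
        self.samples = deque(maxlen=window)
        self.threshold = threshold

    def add_sample(self, utilization):
        # utilization is a fraction between 0.0 and 1.0
        self.samples.append(utilization)

    def is_overloaded(self):
        # Require a full window so that a single spike does not raise an alert.
        if len(self.samples) < self.samples.maxlen:
            return False
        return mean(self.samples) >= self.threshold


def suggest_move(daemon_usage, host_load):
    """Return (daemon, target_host): the heaviest daemon and the least loaded host."""
    daemon = max(daemon_usage, key=daemon_usage.get)
    target = min(host_load, key=host_load.get)
    return daemon, target
```

Even this toy version needs per-daemon and per-host real-time stats as input, and it ignores whether the target host can actually run the daemon.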
The more I read the above, the more convinced I am that this is a wrong approach:
- It's not trivial to implement (since the dashboard will still need to go through the orchestrator to gather the real-time resource stats).
- Providing a decent user experience (e.g.: hinting to the user which host is affected, which service they should move, or to which host) requires almost as much complex logic as implementing a proper cephadm dynamic scheduler.
In any case, if we are ok with just dumb service allocation, we don't really need to implement much code: editing a service and changing its placement spec is enough.
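For reference, a minimal sketch of what that placement change amounts to, assuming the "ceph orch apply <service_type> --placement=..." form of the orchestrator CLI with either an explicit host list or a "label:<name>" placement. The helper names placement_arg and apply_command are hypothetical, and the exact arguments vary by service type.

```python
def placement_arg(hosts=None, label=None):
    """Build a placement string from explicit hosts or from a label."""
    if bool(hosts) == bool(label):
        raise ValueError("give either hosts or a label, but not both")
    if label:
        return f"label:{label}"
    return " ".join(hosts)


def apply_command(service_type, placement):
    """Return the argv list for the orchestrator call (nothing is executed here)."""
    return ["ceph", "orch", "apply", service_type, f"--placement={placement}"]


# Pin the service to host7, host8 and host9, as in the example from step 3.
print(apply_command("mon", placement_arg(hosts=["host7", "host8", "host9"])))
```

The same helper covers the label variant: placement_arg(label="mylabel") yields "label:mylabel".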