Feature #53919

Feature #47072: mgr/dashboard: Usability Improvements

mgr/dashboard: hint users about host resource utilization and service allocation

Added by Ernesto Puerta 10 months ago. Updated 10 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Component - Orchestrator
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

AKA: "moving a daemon to another host to help balance resources across the cluster"

However, IMHO this kind of manual allocation/pinning should be avoided. Ideally, the cephadm scheduler should be smart enough to allocate services according to resources:
  • Either statically at service creation time
  • Or dynamically if the resource utilization changes over time.

Since neither of the features described above is implemented yet, we should define a way in the dashboard to manually allocate daemons (service instances) to less loaded nodes.

In the absence of dynamic service scheduling, we would expect the following user workflow:
1. If a host exceeds some resource utilization threshold (e.g. 85% of RSS memory, CPU or network, measured as a 1-minute average rather than an instantaneous value), an alert should be triggered.
2. Optionally, in the hosts panel we could highlight the affected hosts (however, so far we're not displaying real-time stats there, only the gather facts ones, which are static).
3. The user could then reallocate some service (ideally the most resource-consuming one) to other nodes, by using direct placement/pinning (e.g.: host7, host8, host9) or labels.
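
The threshold check in step 1 could be sketched roughly as follows. This is only an illustration of "alert on a 1-minute average, not on an instantaneous spike"; the `UtilizationMonitor` class and its methods are hypothetical names, not part of the dashboard, cephadm or the orchestrator, and utilization is assumed to arrive as a fraction between 0 and 1.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class UtilizationMonitor:
    """Hypothetical helper: tracks one resource (memory, CPU or network) of one host."""

    def __init__(self, threshold: float = 0.85, window_seconds: float = 60.0) -> None:
        self.threshold = threshold            # 85% by default, as in the workflow above
        self.window_seconds = window_seconds  # 1-minute averaging window
        self._samples: Deque[Tuple[float, float]] = deque()  # (timestamp, utilization)

    def add_sample(self, timestamp: float, utilization: float) -> None:
        """Record a sample and drop the ones that fell out of the window."""
        self._samples.append((timestamp, utilization))
        cutoff = timestamp - self.window_seconds
        while self._samples and self._samples[0][0] <= cutoff:
            self._samples.popleft()

    def average(self) -> Optional[float]:
        """Mean utilization over the window, or None if there are no samples."""
        if not self._samples:
            return None
        return sum(u for _, u in self._samples) / len(self._samples)

    def should_alert(self) -> bool:
        """True only when the windowed average reaches the threshold."""
        avg = self.average()
        return avg is not None and avg >= self.threshold


if __name__ == "__main__":
    monitor = UtilizationMonitor()
    # A single spike among otherwise moderate samples should not alert.
    for t, u in [(0, 0.5), (10, 0.5), (20, 0.5), (30, 0.5), (40, 0.5), (50, 1.0)]:
        monitor.add_sample(t, u)
    print(monitor.should_alert())
```

In a real deployment this kind of rule would more likely live in the monitoring stack than in dashboard code; the sketch only pins down the intended semantics.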

The more I read the above, the more convinced I am that this is a wrong approach:

  • It's not trivial to implement (since the dashboard will still need to go through the orchestrator to gather the real-time resource stats).
  • Providing a decent user experience (e.g.: hinting to the user which host is affected, which service they should move, or to which host) requires almost as much complex logic as a proper cephadm dynamic scheduler.

In any case, if we are ok with just dumb service allocation, we don't really need to implement much code: just editing a service and changing the service placement spec is enough.
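
As a rough sketch of what "changing the service placement spec" amounts to, the snippet below builds a spec with either explicit hosts or a label. The `build_placement_spec` helper is hypothetical, the service type and id are placeholders, and the field names (`service_type`, `service_id`, `placement`, `hosts`, `label`) follow the cephadm service specification format as I understand it, so they should be checked against the cephadm documentation.

```python
import json
from typing import Any, Dict, List, Optional


def build_placement_spec(
    service_type: str,
    service_id: str,
    hosts: Optional[List[str]] = None,
    label: Optional[str] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: pin a service to explicit hosts or to a host label."""
    # Exactly one placement strategy must be given.
    if (hosts is None) == (label is None):
        raise ValueError("specify exactly one of 'hosts' or 'label'")
    placement: Dict[str, Any]
    if hosts is not None:
        placement = {"hosts": list(hosts)}   # direct placement/pinning
    else:
        placement = {"label": label}         # placement by label
    return {
        "service_type": service_type,
        "service_id": service_id,
        "placement": placement,
    }


if __name__ == "__main__":
    # Placeholder service; hosts taken from the example in the workflow above.
    spec = build_placement_spec("rgw", "example", hosts=["host7", "host8", "host9"])
    print(json.dumps(spec, indent=2))
```

The resulting document would then be handed to the orchestrator (for instance with `ceph orch apply -i <file>`), which redeploys the daemons on the requested hosts.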

History

#1 Updated by Ernesto Puerta 10 months ago

  • Description updated (diff)

#2 Updated by Ernesto Puerta 10 months ago

  • Description updated (diff)

#3 Updated by Ernesto Puerta 10 months ago

  • Parent task set to #47072
