Feature #47038

open

cephadm: Automatically deploy failed daemons on other hosts

Added by Sebastian Wagner over 3 years ago. Updated over 2 years ago.

Status: New
Priority: High
Assignee: -
Category: cephadm/scheduler
Target version: -
% Done: 0%


Description

Currently, cephadm does not automatically redistribute the containers of failed daemons to other hosts. Right now, this is a manual step.

There are lots of open questions here:

  • When exactly has a daemon failed?
  • Do we need a timeout?
  • What about stopping a daemon on purpose?
  • This has the potential to badly break a cluster if newly created MONs don't properly form a quorum.
  • What if newly added daemons fail as well?
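
To make the questions above concrete, here is a minimal sketch of one possible decision rule. It is not cephadm code: `DaemonState`, `should_redeploy`, the 300-second timeout, and the retry cap are all hypothetical names and values chosen for illustration. It assumes a daemon counts as failed only after it has been down for longer than a timeout, that daemons stopped on purpose are never moved, that MONs are excluded entirely because of the quorum risk, and that repeated failed redeploys eventually stop further attempts.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DaemonState:
    """Hypothetical snapshot of one daemon; not a cephadm type."""
    name: str
    daemon_type: str                    # e.g. "mon", "mgr", "nfs"
    host: str
    running: bool
    stopped_on_purpose: bool = False    # an operator stopped it deliberately
    down_since: Optional[float] = None  # timestamp (seconds) when it went down
    failed_redeploys: int = 0           # redeploy attempts that failed so far


def should_redeploy(daemon: DaemonState, now: float,
                    timeout: float = 300.0, max_redeploys: int = 2) -> bool:
    """Return True if the daemon is a candidate for redeployment elsewhere."""
    if daemon.running or daemon.stopped_on_purpose:
        return False  # healthy, or intentionally stopped
    if daemon.daemon_type == "mon":
        return False  # assumption: never move MONs automatically (quorum risk)
    if daemon.down_since is None:
        return False  # no known failure time, so the timeout cannot be checked
    if daemon.failed_redeploys >= max_redeploys:
        return False  # replacements keep failing; stop and leave it to an operator
    return now - daemon.down_since >= timeout


nfs = DaemonState(name="nfs.foo.host1", daemon_type="nfs", host="host1",
                  running=False, down_since=600.0)
print(should_redeploy(nfs, now=1000.0))
```

Each early return corresponds to one of the open questions, so changing the answer to a question (for example, allowing MONs once quorum safety can be checked) would mean changing a single branch.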

Related issues 4 (0 open, 4 closed)

Related to Orchestrator - Feature #47782: ceph orch host rm <host> is not stopping the services deployed in the respective removed hosts (Duplicate)
Related to Orchestrator - Feature #48624: ceph orch drain <host> (Resolved, Daniel Pivonka)
Related to Orchestrator - Bug #43838: cephadm: Forcefully Remove Services (unresponsive hosts) (Can't reproduce)
Has duplicate Orchestrator - Feature #53378: cephadm: redeploy nfs-ganesha service that was running in a host that went offline (Duplicate)
