Bug #45465

closed

cephadm: `ceph orch restart osd` has the potential to break your cluster

Added by Sebastian Wagner almost 4 years ago. Updated about 3 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: cephadm
Target version: -
% Done: 0%
Source:
Tags: ux
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Multiple bugs here:

  • The cephadm implementation doesn't check anything before restarting (`ceph osd ok-to-stop ....`, HEALTH_ERR).
  • We're doing O(hosts) network calls, which renders cephadm completely unresponsive (even `ceph orch ps` will not respond).
  • There is no choice between parallel and sequential restarts. Parallel might be dangerous; sequential is safe.
  • `--force` should be required if the cluster is in HEALTH_ERR, because users might actually need to restart something in order to recover from it.
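The safer behaviour described in these points could look roughly like the sketch below. It is not the cephadm implementation: the function name `restart_osds_sequentially`, the `run` parameter, and the choice to let `force` skip both checks are assumptions made for illustration. It also assumes that `ceph osd ok-to-stop <id>` exits non-zero when stopping the OSD is unsafe and that `ceph health` prints a status starting with `HEALTH_ERR` when the cluster is in error.

```python
import subprocess
from typing import Callable, Iterable, List

# Hypothetical type for a command runner, so the ceph CLI can be swapped out in tests.
Runner = Callable[[List[str]], subprocess.CompletedProcess]


def run_ceph(args: List[str]) -> subprocess.CompletedProcess:
    """Run a ceph CLI command and capture its output."""
    return subprocess.run(["ceph", *args], capture_output=True, text=True)


def restart_osds_sequentially(
    osd_ids: Iterable[int],
    force: bool = False,
    run: Runner = run_ceph,
) -> List[int]:
    """Restart OSDs one at a time and return the IDs that were restarted.

    Assumption: force=True skips both the health gate and the
    per-OSD ok-to-stop check.
    """
    health = run(["health"]).stdout
    if health.startswith("HEALTH_ERR") and not force:
        raise RuntimeError("cluster is in HEALTH_ERR; use force to restart anyway")

    restarted: List[int] = []
    for osd_id in osd_ids:
        # Stop at the first OSD that is not safe to take down.
        if not force and run(["osd", "ok-to-stop", str(osd_id)]).returncode != 0:
            break
        run(["orch", "daemon", "restart", f"osd.{osd_id}"])
        restarted.append(osd_id)
    return restarted
```

Restarting one OSD at a time and re-checking `ok-to-stop` before each step means the loop stops as soon as a further restart would be unsafe, rather than taking every OSD down at once. A real implementation would probably also wait for each OSD to come back up before moving on.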

Related issues 2 (0 open, 2 closed)

Related to Orchestrator - Bug #46813: `ceph orch * --refresh` is broken (Resolved)

Related to Orchestrator - Bug #47332: repo_digest: Follow up's (Resolved)