Support #47233

closed

cephadm: orch apply mon "label:osd" crashes cluster

Added by Gunther Heinrich over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
cephadm
% Done:
0%

Description

I have a virtual Ceph cluster with 6 VMs/Hosts running on Ubuntu Server 20.04. The cluster is running on Podman.
Three hosts are Mons and three hosts are OSDs with a corresponding label each:

HOST    ADDR    LABEL
mon-01  mon-01  mon
mon-02  mon-02  mon
mon-03  mon-03  mon
osd-01  osd-01  osd
osd-02  osd-02  osd
osd-03  osd-03  osd

The virtual cluster is running and everything is OK/Healthy.

I then entered an orchestrator command that mistakenly used the wrong label:

sudo ceph orch apply mon "label:osd" 

A few seconds later, the cluster was basically offline. Administrative commands and all reporting stopped working.
A check of the running Podman containers on each VM showed that the Mon containers were gone from all Mon hosts while they were present on all OSD hosts. Other containers on the Mon hosts, such as MDS, Grafana and MGR, were still running.
A reboot didn't resolve the problem. I also tried installing cephadm and ceph-common via cephadm on all OSD hosts (including copying all necessary files, such as the Ceph admin keyring) in the hope of resolving the issue, but that didn't help either.
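
For illustration, the label mix-up described above could be spotted before applying a placement by listing which hosts a label actually selects. The following is only a minimal sketch: `hosts_with_label` is a hypothetical helper (not part of ceph or cephadm), and it assumes the three-column HOST/ADDR/LABEL layout shown in the host table above, with any further columns treated as additional labels.

```shell
#!/bin/sh
# Sketch only: hosts_with_label is a hypothetical helper, not a ceph command.
# It reads a host table (HOST ADDR LABEL...) on stdin and prints the name of
# every host that carries the label given as the first argument.
hosts_with_label() {
    awk -v want="$1" '
        NR == 1 && $1 == "HOST" { next }   # skip the header row
        {
            # Assumption: columns 3..NF are labels (a host may have several).
            for (i = 3; i <= NF; i++) {
                if ($i == want) { print $1; break }
            }
        }'
}

# Sample data copied from the host table in this report. On a live cluster
# the table would presumably come from the orchestrator host listing instead.
host_table='HOST    ADDR    LABEL
mon-01  mon-01  mon
mon-02  mon-02  mon
mon-03  mon-03  mon
osd-01  osd-01  osd
osd-02  osd-02  osd
osd-03  osd-03  osd'

# Show which hosts a placement of "label:osd" would select.
printf '%s\n' "$host_table" | hosts_with_label osd
```

With the sample table, the helper prints the three OSD hosts for the label `osd`, which makes it visible that applying the mon service with that label targets the OSD hosts rather than the Mon hosts.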
