Support #47233
cephadm: orch apply mon "label:osd" crashes cluster
Status: closed
% Done: 0%
Description
I have a virtual Ceph cluster with 6 VMs/Hosts running on Ubuntu Server 20.04. The cluster is running on Podman.
Three hosts are Mons and three hosts are OSDs with a corresponding label each:
HOST    ADDR    LABEL
mon-01  mon-01  mon
mon-02  mon-02  mon
mon-03  mon-03  mon
osd-01  osd-01  osd
osd-02  osd-02  osd
osd-03  osd-03  osd
The virtual cluster is running and everything is OK/Healthy.
I then ran an orchestrator command with the wrong label (osd instead of mon):
sudo ceph orch apply mon "label:osd"
A few seconds later, the cluster was basically offline. Administrative commands and all reporting stopped working.
A check of the running Podman containers on each VM shows that the Mon-Containers are gone from all Mon-Hosts, while they are now present on all OSD-Hosts. Other Monitor-related containers like MDS, Grafana and MGR are still running on the Mon-Hosts.
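The container state described above is what a label-based placement would produce if the orchestrator treats the spec as declarative: hosts matching the label get the daemon, and hosts that no longer match lose it. The following is only a minimal sketch of that selection logic, not cephadm code. The `HOSTS` table is copied from this report, and the helper names `select_hosts` and `unselected_hosts` are hypothetical.

```shell
#!/bin/sh
# Hypothetical model of the host/label table from this report (not cephadm code).
HOSTS='mon-01 mon
mon-02 mon
mon-03 mon
osd-01 osd
osd-02 osd
osd-03 osd'

# Print every host whose label equals $1, mimicking a "label:<name>" placement.
select_hosts() {
    printf '%s\n' "$HOSTS" | awk -v want="$1" '$2 == want { print $1 }'
}

# Print every host whose label differs from $1, i.e. hosts the placement excludes.
unselected_hosts() {
    printf '%s\n' "$HOSTS" | awk -v want="$1" '$2 != want { print $1 }'
}

echo "mon daemons would be placed on:"
select_hosts osd
echo "mon daemons would be removed from:"
unselected_hosts osd
```

Under this model, "label:osd" selects osd-01 to osd-03 and excludes mon-01 to mon-03, which matches the containers I see on each VM. By analogy, the placement I presumably meant to use is "label:mon".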
A reboot didn't resolve the problem. I also tried installing cephadm and ceph-common via cephadm on all OSD hosts (including copying all necessary files, such as the ceph admin keyring) in the hope of resolving the issue this way, but that didn't help either.