Cluster monitor troubleshooting documentation outdated?
I tried to restore an unhealthy virtual cluster by using the following documentation:
When I tried to follow the steps, I already stumbled on the first one ("Stop all ceph-mon daemons on all monitor hosts") because there are no ceph services to be found on the host (probably since Octopus).
I then tried to at least extract the monmap. But ceph-mon needs to be installed beforehand (via cephadm?), which isn't mentioned. After I installed ceph-mon, the extraction still didn't work: it gave no warning or error, but no monmap was written. Since I was only experimenting on a virtual cluster, I started all nodes, extracted the monmap with the command "ceph mon getmap -o /tmp/monmap", and shut the other two monitors down again.
Editing the monmap then also required installing monmaptool beforehand (via cephadm, which again isn't mentioned?). I did that and then removed the two missing monitor nodes from the monmap.
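For anyone who wants to retrace the editing step, here is a minimal sketch of it. It only prints the monmaptool invocations for review instead of running them. The helper name print_monmap_edit_cmds and the monitor names in the usage note are hypothetical; the --print and --rm options are the ones described in the monmaptool man page.

```shell
#!/bin/sh
# Print (do not run) the monmaptool commands that would remove the given
# monitors from a monmap file, with a listing before and after the edit.
# Usage: print_monmap_edit_cmds MONMAP_PATH MON_NAME...
# (hypothetical helper, not part of Ceph)
print_monmap_edit_cmds() {
    monmap=$1
    shift
    # Show the monitors currently in the map.
    echo "monmaptool --print $monmap"
    # One removal per monitor that is gone for good.
    for name in "$@"; do
        echo "monmaptool $monmap --rm $name"
    done
    # Show the map again to confirm only the survivor is left.
    echo "monmaptool --print $monmap"
}
```

Called as print_monmap_edit_cmds /tmp/monmap mon-02 mon-03 (placeholder names), it lists one --rm line per monitor, which can be checked before pasting the commands into a shell where monmaptool is available.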
Afterwards I tried to inject the monmap into the surviving monitor, but this didn't work either. The first time I ran the command as described in the doc, it resulted in this message (which never appeared again on later attempts):
7f3e43545580 -1 monitor data directory at '/var/lib/ceph/mon/ceph-iz-ceph-v1-mon-01' does not exist: have you run 'mkfs'?
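The path in that message is the classic package-based layout. A minimal sketch for checking where the monitor data actually lives on the host follows. It assumes the cluster name is "ceph" and that a cephadm deployment keeps monitor data under /var/lib/ceph/<fsid>/mon.<id>; the helper name find_mon_data_dir is hypothetical.

```shell
#!/bin/sh
# Print the data directory of monitor MON_ID, checking the legacy layout
# first and then the per-cluster layout assumed for cephadm deployments.
# Usage: find_mon_data_dir MON_ID [CEPH_ROOT]
# (hypothetical helper, not part of Ceph)
find_mon_data_dir() {
    mon_id=$1
    root=${2:-/var/lib/ceph}
    # Legacy layout: <root>/mon/ceph-<id> (cluster name "ceph" assumed)
    if [ -d "$root/mon/ceph-$mon_id" ]; then
        echo "$root/mon/ceph-$mon_id"
        return 0
    fi
    # Assumed cephadm layout: <root>/<fsid>/mon.<id>
    # If the glob matches nothing it stays literal and the -d test fails.
    for dir in "$root"/*/mon."$mon_id"; do
        if [ -d "$dir" ]; then
            echo "$dir"
            return 0
        fi
    done
    echo "no data directory found for mon.$mon_id under $root" >&2
    return 1
}
```

Running find_mon_data_dir iz-ceph-v1-mon-01 on the surviving host would show whether the directory exists only under the fsid-based path. That might explain why a ceph-mon binary started directly on the host cannot find its data directory.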
Am I doing everything wrong, or is the documentation in need of an update? Could some of these problems also be the result of one or more bugs? I don't want to spam the bug tracker, so I am posting here first.
Thanks a lot!