Bug #15554
systemd does not respect the ceph config file when using non-standard directories
Status: Closed
Description
This is a 2-node test cluster, single mon, running the latest jewel release candidate on Debian Jessie.
In my config, I mount the OSDs and store the mon data and mds data under /srv/ceph/osd/osd.$id, /srv/ceph/mon/mon.$id, and /srv/ceph/mds/mds.$id, respectively.
systemd fails to start any OSDs on my second node (the one running only OSDs). Looking at the output of systemctl status ceph-osd@7, etc., I see that it is trying to use the default directories instead of the ones specified in /etc/ceph/ceph.conf.
I have another 3-node production firefly cluster that works perfectly with this config on Debian Wheezy.
To start the OSD daemons, I have to run ceph-osd -i 7 --setuser ceph --setgroup ceph (for each OSD).
I also had this issue on infernalis, but I used some dirty hacks in the init scripts to temporarily work around it.
I do not use ceph-deploy anywhere in my setup.
Updated by Heath Jepson about 8 years ago
The exact change I made to fix this was to edit /usr/lib/ceph/ceph-osd-prestart.sh
changing
data="/var/lib/ceph/osd/${cluster:-ceph}-$id"
journal="$data/journal"
to
data="/srv/ceph/osd/osd.$id"
journal="/dev/disk/by-partlabel/Ceph_SSD_journal.$id"
It's not a proper fix; really, the script should read these paths from the config instead of using hard-coded values.
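A minimal sketch of what a config-driven prestart could look like, assuming the stock ceph-conf utility is available and that the osd_data / osd_journal keys are set per-daemon in ceph.conf (the fallback paths below are the packaged defaults, not anything from this report):

```shell
#!/bin/sh
# Sketch: resolve the OSD data and journal paths from ceph.conf
# instead of hard-coding them in the prestart script.
cluster="${cluster:-ceph}"
id="${1:-7}"

# ceph-conf reads /etc/ceph/$cluster.conf and honours per-daemon
# sections like [osd.7]; fall back to the packaged defaults when the
# lookup fails (key unset, config missing, or ceph-conf unavailable).
data="$(ceph-conf --cluster "$cluster" --name "osd.$id" --lookup osd_data 2>/dev/null \
    || echo "/var/lib/ceph/osd/${cluster}-${id}")"
journal="$(ceph-conf --cluster "$cluster" --name "osd.$id" --lookup osd_journal 2>/dev/null \
    || echo "$data/journal")"

echo "data=$data"
echo "journal=$journal"
```

With an osd_journal entry pointing at /dev/disk/by-partlabel/Ceph_SSD_journal.$id, this would pick up the same paths as the hand-edited script without touching packaged files.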
Updated by Heath Jepson about 8 years ago
I can't be sure, because systemd is royally messed up after upgrading from 10.1.2 to 10.2.0, but I noticed that my ceph config had DOS line endings because I edited it in WordPad once. I don't know whether this could have caused the trouble.
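If DOS line endings are the suspect, they are easy to check for and strip; a quick sketch using a temp file for the demo (on a real node, point conf at /etc/ceph/ceph.conf; dos2unix does the same job as the sed line):

```shell
# Demo on a temp copy; use conf=/etc/ceph/ceph.conf on a real node.
conf="$(mktemp)"
printf 'osd data = /srv/ceph/osd/osd.$id\r\n' > "$conf"  # simulate a WordPad-saved line

# Detect a carriage return anywhere in the file.
if grep -q "$(printf '\r')" "$conf"; then
    # Strip trailing CRs in place, converting CRLF to plain LF.
    sed -i 's/\r$//' "$conf"
fi

cat "$conf"
```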
You can probably close this report. I have no way to test anything now, because systemd is not having any of it, requiring me to manually mount my data drives and start the ceph daemons by running ceph-mon, ceph-osd, etc. directly.