This can't just be affecting me. After running apt-get update and then apt-get dist-upgrade, I'm still in the same boat.
A quick look at Debian's repository reveals that the latest version of systemd for jessie is 215; there is nothing useful in the backports.
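For the record, this is how I double-checked what jessie actually gives me, in case anyone wants to compare on their end:

# Show the running systemd version and what apt considers installed/candidate
systemctl --version
apt-cache policy systemd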
What is your recommended OS for jewel? What do you guys use internally? I've been using Debian and Ceph together for the last 3.5 years, and it's been perfect until now.
I have yet to get anything to start without major intervention since upgrading from infernalis to jewel 10.2.0. I commented out all occurrences of TasksMax (roughly as sketched below); now my mon starts, but everything else is still dead as a doornail.
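For reference, this is more or less what I ran to comment out the TasksMax lines; the unit file paths are just where the stock Debian packages put them, so adjust if yours are elsewhere:

# Comment out every TasksMax= directive in the Ceph unit files, since the
# systemd 215 in jessie does not understand it, then reload systemd
sed -i 's/^TasksMax=/#TasksMax=/' /lib/systemd/system/ceph-*.service
systemctl daemon-reload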
Even so, ceph-disk gives me the following, and the data drives are not getting mounted:
root@elara:/etc/ceph# systemctl status ceph-osd@0
● ceph-osd@0.service - Ceph object storage daemon
Loaded: loaded (/lib/systemd/system/ceph-osd@.service; disabled)
Active: failed (Result: start-limit) since Sat 2016-04-23 19:10:00 MDT; 44min ago
Process: 4141 ExecStart=/usr/bin/ceph-osd -f --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=1/FAILURE)
Process: 4089 ExecStartPre=/usr/lib/ceph/ceph-osd-prestart.sh --cluster ${CLUSTER} --id %i --setuser ceph --setgroup ceph (code=exited, status=0/SUCCESS)
Main PID: 4141 (code=exited, status=1/FAILURE)
Apr 23 19:10:00 elara systemd[1]: Unit ceph-osd@0.service entered failed state.
Apr 23 19:10:00 elara systemd[1]: ceph-osd@0.service start request repeated too quickly, refusing to start.
Apr 23 19:10:00 elara systemd[1]: Failed to start Ceph object storage daemon.
Apr 23 19:10:00 elara systemd[1]: Unit ceph-osd@0.service entered failed state.
Apr 23 19:25:07 elara systemd[1]: ceph-osd@0.service start request repeated too quickly, refusing to start.
Apr 23 19:25:07 elara systemd[1]: Failed to start Ceph object storage daemon.
root@elara:/etc/ceph# systemctl status ceph-disk@0
● ceph-disk@0.service - Ceph disk activation: /0
Loaded: loaded (/lib/systemd/system/ceph-disk@.service; static)
Active: inactive (dead)
root@elara:/etc/ceph# systemctl start ceph-disk@0
Job for ceph-disk@0.service failed. See 'systemctl status ceph-disk@0.service' and 'journalctl -xn' for details.
root@elara:/etc/ceph# systemctl status ceph-disk@0
● ceph-disk@0.service - Ceph disk activation: /0
Loaded: loaded (/lib/systemd/system/ceph-disk@.service; static)
Active: failed (Result: exit-code) since Sat 2016-04-23 19:55:08 MDT; 4s ago
Process: 10705 ExecStart=/bin/sh -c flock /var/lock/ceph-disk /usr/sbin/ceph-disk --verbose --log-stdout trigger --sync %f (code=exited, status=1/FAILURE)
Main PID: 10705 (code=exited, status=1/FAILURE)
Apr 23 19:55:08 elara sh[10705]: File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4964, in run
Apr 23 19:55:08 elara sh[10705]: main(sys.argv[1:])
Apr 23 19:55:08 elara sh[10705]: File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4915, in main
Apr 23 19:55:08 elara sh[10705]: args.func(args)
Apr 23 19:55:08 elara sh[10705]: File "/usr/lib/python2.7/dist-packages/ceph_disk/main.py", line 4347, in main_trigger
Apr 23 19:55:08 elara sh[10705]: raise Error('unrecognized partition type %s' % parttype)
Apr 23 19:55:08 elara sh[10705]: ceph_disk.main.Error: Error: unrecognized partition type None
Apr 23 19:55:08 elara systemd[1]: ceph-disk@0.service: main process exited, code=exited, status=1/FAILURE
Apr 23 19:55:08 elara systemd[1]: Failed to start Ceph disk activation: /0.
Apr 23 19:55:08 elara systemd[1]: Unit ceph-disk@0.service entered failed state.
root@elara:/etc/ceph#
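In case it helps anyone else, ceph-disk trigger appears to key off the GPT partition type GUID, which my hand-partitioned data disks (the Ceph_OSD.* partlabels) never had, and I suspect that's where the "unrecognized partition type None" comes from. Something like this shows what type GUID a partition actually carries; /dev/sdb and partition 1 are just stand-ins for one of my data drives:

# Print the GPT partition type GUID that ceph-disk looks at; a ceph-disk
# prepared data partition should show 4fbd7e29-9d25-41b8-afd0-062c0ceff05d,
# if I'm reading the ceph-disk source right
sgdisk --info=1 /dev/sdb
# The same information as udev sees it (ID_PART_ENTRY_TYPE)
blkid -p -o udev /dev/sdb1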
The only way it works is for me to mount the OSDs manually, which all previous versions of Ceph did automatically, saving me the trouble of mucking around in /etc/fstab. By running the following two commands, I finally end up with a running OSD:
mount -t xfs /dev/disk/by-partlabel/Ceph_OSD.0.XFSdata /srv/ceph/osd/osd.0 -o noatime,nodiratime,logbsize=256k,logbufs=8,allocsize=4M
systemctl start ceph-osd@0
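If I do end up giving in and putting the mounts in /etc/fstab, the entry would presumably just mirror the mount command above, something like:

# /etc/fstab entry equivalent to the manual mount above
/dev/disk/by-partlabel/Ceph_OSD.0.XFSdata /srv/ceph/osd/osd.0 xfs noatime,nodiratime,logbsize=256k,logbufs=8,allocsize=4M 0 0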
I could not find anyone else reporting this, but I can't be going crazy. My config has worked perfectly all the way through hammer, apart from the pidfile issue, which was an amazingly simple fix. I'd like to know why Ceph is misbehaving so much, when previously it was the most rock-solid software of this level of complexity I've ever used.