Bug #15874
open
Upon hammer->jewel upgrade, OSD cannot access journal device until after reboot
Description
Scenario: hammer cluster being upgraded to jewel. Before upgrade, the permissions/ownership of the journal device are "660/root:disk". On each node, the packages are upgraded, the daemons are stopped, /var/lib/ceph is chowned, and the daemons are started again.
Problem: when the jewel OSDs start, they are running as ceph:ceph and fail to open their journal devices (permission denied).
The problem only lasts until the node is rebooted.
Solution: the only solution that occurs to me is to do a "udevadm trigger" in the postinst scripts. I tested this manually and it works like a charm. In this example, node2 was just upgraded from hammer to jewel and /dev/sdb2 is a journal device:
node2:~ # ls -l /dev/sdb2
brw-rw---- 1 root disk 8, 18 May 12 15:49 /dev/sdb2
node2:~ # udevadm trigger
node2:~ # ls -l /dev/sdb2
brw-rw---- 1 ceph ceph 8, 18 May 12 22:26 /dev/sdb2
Updated by Nathan Cutler almost 8 years ago
- Related to Feature #15733: ceph-osd should chown OSD data when --setuser is specified added
Updated by Nathan Cutler almost 8 years ago
- Related to deleted (Feature #15733: ceph-osd should chown OSD data when --setuser is specified)
Updated by Nathan Cutler almost 8 years ago
There is a concern that running "udevadm trigger" unconditionally could cause "churn" on a node with many devices.
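One way to limit that churn might be to restrict the trigger to block devices. The sketch below is an assumption, not something tested in this report: trigger_block_devices is a hypothetical helper name, and it assumes the jewel udev rules that set the ceph:ceph ownership match events replayed for the block subsystem. It prints the command by default and only executes it when RUN=1 is set.

```shell
#!/bin/sh
# Hypothetical helper (not part of the ceph packages).
trigger_block_devices() {
    # Limit the replayed events to the block subsystem instead of all devices.
    set -- udevadm trigger --subsystem-match=block
    if [ "${RUN:-0}" = "1" ]; then
        # Replay the events, then wait for udev to finish processing them.
        "$@" && udevadm settle
    else
        # Dry run: only show what would be executed.
        echo "$*"
    fi
}

trigger_block_devices
```

Running it with RUN=1 as root would perform the trigger; the journal ownership can then be checked with "ls -l" as in the description.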
Updated by Daniel Kraft almost 8 years ago
Workaround:
# sudo systemctl edit ceph-osd@.service
[Service]
ExecStartPre=/bin/chown ceph:ceph /var/lib/ceph/osd/ceph-%i/journal
This adds a pre-start step that runs each time an OSD is started, without touching the original unit file.
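To see which OSDs are affected, or whether the pre-start step took effect, the owner of the device behind each journal symlink can be listed. This is a minimal sketch: journal_owner is a hypothetical helper name, the /var/lib/ceph/osd/ceph-*/journal layout is taken from the workaround above, and the "-c" format option assumes GNU stat.

```shell
#!/bin/sh
# Hypothetical helper: print owner:group of the file a journal path resolves to.
journal_owner() {
    # -L follows the journal symlink to the underlying device node.
    stat -L -c '%U:%G' "$1"
}

# Report the journal owner for every OSD directory found on this node.
for j in /var/lib/ceph/osd/ceph-*/journal; do
    # Skip the unexpanded pattern when no OSD directories exist.
    [ -e "$j" ] || continue
    echo "$j $(journal_owner "$j")"
done
```

A journal still reported as root:disk after the upgrade would be expected to produce the "permission denied" failure described in this report.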
Updated by alexander walker over 7 years ago
I also have an issue with the permissions of the journal partition. I use an M.2 SSD for the journal, with the partitions /dev/nvme0n1p4 and /dev/nvme0n1p5.
Workaround:
Set the ownership with "chown -R ceph:ceph /dev/nvme0n1p4" and "chown -R ceph:ceph /dev/nvme0n1p5"