Bug #63319
Different UID of ceph user on system and in containers is causing issues
Description
Hello,
A different UID for the ceph user on the system and in the containers causes problems when, for example, ceph-bluestore-tool is used to extend an OSD.
I installed a new cluster, added nodes and OSDs, and then enlarged one OSD's disk and wanted to grow the OSD accordingly. So I rescanned the PV (its size had increased), then extended the logical volume. Then I stopped the OSD service managing this disk and used ceph-bluestore-tool to expand the OSD. This all worked.
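For reference, the sequence above can be sketched as a dry-run shell script. The device name, VG/LV names, and OSD id are placeholders (not taken verbatim from this report), and run() only prints each command, so nothing is executed:

```shell
#!/bin/sh
# Dry-run sketch of the expansion sequence described above.
# /dev/sdb and ceph-vg/osd-block-1 are placeholder names.
FSID=cbd96647-70de-11ee-a34a-df5a876dc2c5
run() { echo "+ $*"; }   # print instead of execute; drop for a real run

run pvresize /dev/sdb                                # rescan the grown PV
run lvextend -l +100%FREE /dev/ceph-vg/osd-block-1   # extend the LV
run systemctl stop "ceph-$FSID@osd.1"                # stop the OSD first
run ceph-bluestore-tool bluefs-bdev-expand --path "/var/lib/ceph/$FSID/osd.1"
run systemctl start "ceph-$FSID@osd.1"               # this is the step that failed
```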
The problem appeared when I tried to start the OSD service again: it failed with "permission denied". After some investigation I found that the cause was a different UID for the ceph user on the system and in the containers. When I set the UID and GID of the ceph user to the same values as in the containers, the OSD service started and the problem was solved.
root@ceph01:~# ceph version
ceph version 17.2.6 (d7ff0d10654d2280e08f1ab989c7cdf3064446a5) quincy (stable)
root@ceph03:~# cephadm install ceph-common ceph-osd
Installing packages ['ceph-common', 'ceph-osd']...
root@ceph03:~#
root@ceph03:~# systemctl stop ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1
root@ceph03:~# ceph-bluestore-tool bluefs-bdev-expand --path /var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1
inferring bluefs devices from bluestore path
1 : device size 0x257fc00000 : using 0x5c142f000(23 GiB)
Expanding DB/WAL...
1 : expanding from 0x1fbfc00000 to 0x257fc00000
2023-10-22T14:32:41.955+0000 7ff7418e6a80 -1 bluestore(/var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1) _read_bdev_label failed to read from /var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1: (21) Is a directory
2023-10-22T14:32:41.955+0000 7ff7418e6a80 -1 bluestore(/var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1) unable to read label for /var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1: (21) Is a directory
root@ceph03:~# systemctl start ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1
root@ceph03:~# systemctl list-units --type=service | grep ceph
ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@crash.ceph03.service loaded active running Ceph crash.ceph03 for cbd96647-70de-11ee-a34a-df5a876dc2c5
ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@mon.ceph03.service loaded active running Ceph mon.ceph03 for cbd96647-70de-11ee-a34a-df5a876dc2c5
ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1.service loaded active running Ceph osd.1 for cbd96647-70de-11ee-a34a-df5a876dc2c5
ceph-crash.service loaded active running Ceph crash dump collector
root@ceph03:~#
root@ceph03:~# systemctl status ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1
× ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1.service - Ceph osd.1 for cbd96647-70de-11ee-a34a-df5a876dc2c5
     Loaded: loaded (/etc/systemd/system/ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@.service; enabled; vendor preset: enabled)
     Active: failed (Result: exit-code) since Sun 2023-10-22 14:33:57 UTC; 59s ago
    Process: 10878 ExecStart=/bin/bash /var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1/unit.run (code=exited, status=1/FAILURE)
    Process: 11198 ExecStopPost=/bin/bash /var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1/unit.poststop (code=exited, status=0/SUCCESS)
   Main PID: 10878 (code=exited, status=1/FAILURE)
        CPU: 198ms

Oct 22 14:33:57 ceph03 systemd[1]: ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1.service: Scheduled restart job, restart counter is at 5.
Oct 22 14:33:57 ceph03 systemd[1]: Stopped Ceph osd.1 for cbd96647-70de-11ee-a34a-df5a876dc2c5.
Oct 22 14:33:57 ceph03 systemd[1]: ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1.service: Start request repeated too quickly.
Oct 22 14:33:57 ceph03 systemd[1]: ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1.service: Failed with result 'exit-code'.
Oct 22 14:33:57 ceph03 systemd[1]: Failed to start Ceph osd.1 for cbd96647-70de-11ee-a34a-df5a876dc2c5.
root@ceph03:~# systemctl reset-failed ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1
root@ceph03:~# systemctl start ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1
root@ceph03:~# systemctl status ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1
● ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@osd.1.service - Ceph osd.1 for cbd96647-70de-11ee-a34a-df5a876dc2c5
     Loaded: loaded (/etc/systemd/system/ceph-cbd96647-70de-11ee-a34a-df5a876dc2c5@.service; enabled; vendor preset: enabled)
     Active: activating (auto-restart) (Result: exit-code) since Sun 2023-10-22 14:35:50 UTC; 8s ago
    Process: 11365 ExecStart=/bin/bash /var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1/unit.run (code=exited, status=1/FAILURE)
    Process: 11686 ExecStopPost=/bin/bash /var/lib/ceph/cbd96647-70de-11ee-a34a-df5a876dc2c5/osd.1/unit.poststop (code=exited, status=0/SUCCESS)
   Main PID: 11365 (code=exited, status=1/FAILURE)
        CPU: 209ms
root@ceph03:~#
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 ** ERROR: osd init failed: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 osd.1 0 OSD:init: unable to mount object store
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bdev(0x558d4c4c0000 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-1/block: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bluestore(/var/lib/ceph/osd/ceph-1/block) _read_bdev_label failed to open /var/lib/ceph/osd/ceph-1/block: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bdev(0x558d4c4c0000 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bdev(0x558d4c4c0000 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bdev(0x558d4c4c0000 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bdev(0x558d4c4c0000 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.475+0000 7f42fa572540 -1 bdev(0x558d4c4c0000 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.471+0000 7f42fa572540 -1 bdev(0x558d4c4c0000 /var/lib/ceph/osd/ceph-1/block) open open got: (13) Permission denied
Oct 22 14:46:36 ceph03 bash[14942]: debug 2023-10-22T14:46:36.467+0000 7f42fa572540 -1 Falling back to public interface
Oct 22 14:46:35 ceph03 systemd[1]: Started libcontainer container 615250af234a698a87c66ef3391429d57590ffb368adbb6072c5a165efb96c0d.
Oct 22 14:46:35 ceph03 systemd[1]: var-lib-docker-overlay2-2c2e759d559a2f0f4023c83f0ff3712784b361af6d8abd286235cb5e37df39da-merged.mount: Deactivated successfully.
/var/lib/ceph was owned by UID 617 (the ceph UID in the containers), but the ceph UID on the system was different (I think 64045).
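A quick way to spot such a mismatch is to compare a path's owning UID with the UID the container expects. A minimal sketch — check_uid is a hypothetical helper, and 617 is simply the value observed in this report:

```shell
#!/bin/sh
# check_uid: hypothetical helper comparing a path's owning UID with the
# UID expected inside the container (617 in this report).
check_uid() {   # $1 = path, $2 = expected UID
    actual=$(stat -c '%u' "$1")
    if [ "$actual" = "$2" ]; then
        echo "ok: $1 owned by UID $2"
    else
        echo "mismatch: $1 owned by UID $actual, container expects $2"
    fi
}

# example invocation: check_uid /var/lib/ceph 617
```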
So I stopped all ceph processes, changed the system ceph UID and GID to 617, and started ceph again. Now the OSD service started.
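The workaround boils down to the steps below, shown here as a dry run (run() only prints the commands). The UID/GID value 617 is what this report observed in its containers — verify yours before changing anything:

```shell
#!/bin/sh
# Dry-run sketch of the workaround: align the host ceph account with the
# container UID/GID (617 in this report). run() prints instead of executing.
run() { echo "+ $*"; }

run systemctl stop ceph.target          # stop all ceph daemons first
run usermod -u 617 ceph                 # host ceph user  -> container UID
run groupmod -g 617 ceph                # host ceph group -> container GID
run chown -R ceph:ceph /var/lib/ceph    # re-own the on-disk state
run systemctl start ceph.target
```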