Bug #44270
Under certain circumstances, "ceph orch apply" returns success even when no OSDs are created
Status: closed
Description
On a single-node cluster, the "cephadm bootstrap" command deploys 1 MGR and 1 MON.
On very recent versions of master, if one runs the following "ceph orch apply" command immediately after "cephadm bootstrap" finishes:
echo '{"testing_dg_admin": {"host_pattern": "admin*", "data_devices": {"all": true}}}' | ceph orch apply -i -
the command completes with exit status 0, yet no OSDs are created.
(Note: this was taken from a system where the host was called "admin.octopus_test1.com". To reproduce, change "admin" to the short hostname of the host.)
If I insert a "sleep 60" between "cephadm bootstrap" and "ceph orch apply", the OSDs get created according to the drive group JSON provided.
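For reference, a minimal sketch of the reproduction, assuming a single-node cluster; `MON_IP` and the drive-group name are placeholders, not taken from the report:

```bash
#!/usr/bin/env bash
# Sketch of the race: apply a drive group immediately after bootstrap.
set -euo pipefail

HOST=$(hostname -s)   # short hostname, per the note above about host_pattern

cephadm bootstrap --mon-ip "$MON_IP"   # MON_IP must be set for your environment

# Workaround from this report: without this sleep, "ceph orch apply"
# returns 0 but never creates any OSDs.
# sleep 60

echo "{\"testing_dg_${HOST}\": {\"host_pattern\": \"${HOST}*\", \"data_devices\": {\"all\": true}}}" \
  | ceph orch apply -i -
echo "exit status: $?"   # 0 in both the working and the broken case

ceph osd tree            # in the broken case, still shows no OSDs
```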
Note: this behavior was introduced quite recently.
Taken from the workaround:
Upon closer examination, we can see that the following command succeeds without causing any OSDs to be deployed:
`echo {\"testing_dg_node2\": {\"host_pattern\": \"node2*\", \"data_devices\": {\"all\": true}}} | ceph orch osd create -i -`
This is because the orchestrator only knows about the drives on node1:
```
admin:~ # ceph orch device ls
HOST   PATH      TYPE  SIZE   DEVICE  AVAIL  REJECT REASONS
node1  /dev/vdb  hdd   8192M  259451  True
node1  /dev/vdc  hdd   8192M  652460  True
node1  /dev/vda  hdd   42.0G          False  locked
```
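When scripting this, one way to avoid applying a drive group before the orchestrator's inventory covers the target host is to poll `ceph orch device ls` first. This is a sketch, not part of the report; the host name and timeout are placeholders:

```bash
#!/usr/bin/env bash
# Wait until the orchestrator's inventory lists the target host before
# applying a drive group whose host_pattern matches it.
set -euo pipefail

TARGET_HOST=node2   # hypothetical target host
TIMEOUT=120         # seconds to wait for the inventory to catch up

deadline=$((SECONDS + TIMEOUT))
# The first column of "ceph orch device ls" output is HOST.
until ceph orch device ls | awk '{print $1}' | grep -qx "$TARGET_HOST"; do
  if (( SECONDS >= deadline )); then
    echo "orchestrator never reported devices for $TARGET_HOST" >&2
    exit 1
  fi
  sleep 5
done

echo "{\"testing_dg_${TARGET_HOST}\": {\"host_pattern\": \"${TARGET_HOST}*\", \"data_devices\": {\"all\": true}}}" \
  | ceph orch apply -i -
```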
Yet "cephadm ceph-volume inventory" sees the drives when run on node2:
```
node2:~ # cephadm ceph-volume inventory
INFO:cephadm:Inferring fsid a581fad8-5ccb-11ea-966f-525400bb7fa5
INFO:cephadm:/usr/bin/podman:stdout
INFO:cephadm:/usr/bin/podman:stdout Device Path  Size      rotates  available  Model name
INFO:cephadm:/usr/bin/podman:stdout /dev/vdb     8.00 GB   True     True
INFO:cephadm:/usr/bin/podman:stdout /dev/vdc     8.00 GB   True     True
INFO:cephadm:/usr/bin/podman:stdout /dev/vda     42.00 GB  True     False

Device Path  Size      rotates  available  Model name
/dev/vdb     8.00 GB   True     True
/dev/vdc     8.00 GB   True     True
/dev/vda     42.00 GB  True     False
```
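Since cephadm's local inventory sees the drives while the orchestrator's does not, one plausible remediation, assuming node2 simply has not been registered with the orchestrator yet, is to add the host and re-check the device list:

```bash
# Register node2 with the orchestrator so its devices enter the inventory.
# Assumes the cluster's SSH key has already been copied to node2.
ceph orch host add node2

# Once the inventory refreshes, node2's drives should appear here.
ceph orch device ls
```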