Documentation #45383
Cephadm.py OSD deployment fails: full device path or just the name?
Description
OSD deployment via cephadm.py fails on my local teuthology server because the device is not recognized. When I revert commit f026a1c it works fine for me, so could someone clarify whether the short-name syntax is supposed to work or not?
2020-05-04T09:50:46.693 INFO:tasks.cephadm:Deploying osd.0 on target-geky-069 with /dev/vde...
2020-05-04T09:50:46.694 INFO:teuthology.orchestra.run.target-geky-069:> sudo cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid ee0a3b12-8dea-11ea-9277-fa163e22acf0 -- ceph-volume lvm zap /dev/vde
2020-05-04T09:50:48.303 INFO:teuthology.orchestra.run.target-geky-069.stderr:--> Zapping: /dev/vde
2020-05-04T09:50:48.303 INFO:teuthology.orchestra.run.target-geky-069.stderr:--> --destroy was not specified, but zapping a whole device will remove the partition table
2020-05-04T09:50:48.304 INFO:teuthology.orchestra.run.target-geky-069.stderr:Running command: /usr/bin/dd if=/dev/zero of=/dev/vde bs=1M count=10 conv=fsync
2020-05-04T09:50:48.304 INFO:teuthology.orchestra.run.target-geky-069.stderr: stderr: 10+0 records in
2020-05-04T09:50:48.305 INFO:teuthology.orchestra.run.target-geky-069.stderr:10+0 records out
2020-05-04T09:50:48.306 INFO:teuthology.orchestra.run.target-geky-069.stderr: stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0997165 s, 105 MB/s
2020-05-04T09:50:48.307 INFO:teuthology.orchestra.run.target-geky-069.stderr:--> Zapping successful for: <Raw Device: /dev/vde>
2020-05-04T09:50:48.437 INFO:teuthology.orchestra.run.target-geky-069:> sudo cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid ee0a3b12-8dea-11ea-9277-fa163e22acf0 -- ceph orch daemon add osd target-geky-069:vde
2020-05-04T09:50:51.597 INFO:teuthology.orchestra.run.target-geky-069.stderr:Error EINVAL: Traceback (most recent call last):
2020-05-04T09:50:51.598 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
2020-05-04T09:50:51.598 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return self.handle_command(inbuf, cmd)
2020-05-04T09:50:51.599 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
2020-05-04T09:50:51.599 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
2020-05-04T09:50:51.600 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
2020-05-04T09:50:51.600 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return self.func(mgr, **kwargs)
2020-05-04T09:50:51.601 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
2020-05-04T09:50:51.602 INFO:teuthology.orchestra.run.target-geky-069.stderr:    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
2020-05-04T09:50:51.602 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
2020-05-04T09:50:51.603 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return func(*args, **kwargs)
2020-05-04T09:50:51.603 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/module.py", line 597, in _daemon_add_osd
2020-05-04T09:50:51.604 INFO:teuthology.orchestra.run.target-geky-069.stderr:    completion = self.create_osds(drive_group)
2020-05-04T09:50:51.604 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1542, in inner
2020-05-04T09:50:51.604 INFO:teuthology.orchestra.run.target-geky-069.stderr:    completion = self._oremote(method_name, args, kwargs)
2020-05-04T09:50:51.605 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1614, in _oremote
2020-05-04T09:50:51.606 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return mgr.remote(o, meth, *args, **kwargs)
2020-05-04T09:50:51.607 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 1515, in remote
2020-05-04T09:50:51.607 INFO:teuthology.orchestra.run.target-geky-069.stderr:    args, kwargs)
2020-05-04T09:50:51.607 INFO:teuthology.orchestra.run.target-geky-069.stderr:RuntimeError: Remote method threw exception: Traceback (most recent call last):
2020-05-04T09:50:51.608 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/cephadm/module.py", line 559, in wrapper
2020-05-04T09:50:51.608 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return AsyncCompletion(value=f(*args, **kwargs), name=f.__name__)
2020-05-04T09:50:51.608 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/cephadm/module.py", line 2142, in create_osds
2020-05-04T09:50:51.609 INFO:teuthology.orchestra.run.target-geky-069.stderr:    replace_osd_ids=drive_group.osd_id_claims.get(host, []))
2020-05-04T09:50:51.609 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/cephadm/module.py", line 2248, in _create_osd
2020-05-04T09:50:51.609 INFO:teuthology.orchestra.run.target-geky-069.stderr:    code, '\n'.join(err)))
2020-05-04T09:50:51.610 INFO:teuthology.orchestra.run.target-geky-069.stderr:RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr stderr: lsblk: vde: not a block device
2020-05-04T09:50:51.610 INFO:teuthology.orchestra.run.target-geky-069.stderr:INFO:cephadm:/usr/bin/podman:stderr stderr: blkid: error: vde: No such file or directory
2020-05-04T09:50:51.611 INFO:teuthology.orchestra.run.target-geky-069.stderr:INFO:cephadm:/usr/bin/podman:stderr stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
Traceback is:
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 597, in _daemon_add_osd
    completion = self.create_osds(drive_group)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1542, in inner
    completion = self._oremote(method_name, args, kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1614, in _oremote
    return mgr.remote(o, meth, *args, **kwargs)
  File "/usr/share/ceph/mgr/mgr_module.py", line 1515, in remote
    args, kwargs)
RuntimeError: Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 559, in wrapper
    return AsyncCompletion(value=f(*args, **kwargs), name=f.__name__)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2142, in create_osds
    replace_osd_ids=drive_group.osd_id_claims.get(host, []))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2248, in _create_osd
    code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr stderr: lsblk: vde: not a block device
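For reference, the underlying failure can be reproduced outside cephadm: lsblk and blkid treat the short name as a relative path rather than a device. The error strings below are taken from the ceph-volume stderr above; the standalone commands themselves are only an illustration:

lsblk vde        # lsblk: vde: not a block device
lsblk /dev/vde   # succeeds on a host that has the device
blkid vde        # blkid: error: vde: No such file or directory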
History
#1 Updated by Sebastian Wagner almost 4 years ago
- Description updated (diff)
Looks like you called
ceph orch daemon add osd ...
on a host that doesn't have a device named vde
#2 Updated by Georgios Kyratsas almost 4 years ago
There was a device /dev/vde, and it worked when I dropped the short name and just used the whole path (https://github.com/ceph/ceph/blob/94e39db4e803cf77a03190957871a213c9ced2ad/qa/tasks/cephadm.py#L598). Since cephadm expects the whole device path, I don't get how this syntax is supposed to work. The docs also show the whole path after the hostname: https://docs.ceph.com/docs/master/////cephadm/install/#deploy-osds
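For comparison, the form shown in the linked docs passes the full device path after the hostname; applied to this ticket's environment (host and device names taken from the log above), that would be:

ceph orch daemon add osd target-geky-069:/dev/vde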
#3 Updated by Sebastian Wagner almost 4 years ago
- Tracker changed from Bug to Documentation
#4 Updated by Georgios Kyratsas almost 4 years ago
I triggered a run with `sleep-before-teardown` to make this clearer.
target-geky-015:/home/ubuntu # lsblk
NAME                                                                                                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda                                                                                                   254:0    0   80G  0 disk
├─vda1                                                                                                254:1    0    2M  0 part
├─vda2                                                                                                254:2    0  512M  0 part /boot/efi
└─vda3                                                                                                254:3    0 79.5G  0 part /
vdb                                                                                                   254:16   0   10G  0 disk
vdc                                                                                                   254:32   0   10G  0 disk
vdd                                                                                                   254:48   0   10G  0 disk
vde                                                                                                   254:64   0   10G  0 disk
└─ceph--efbac9cf--33ab--48b2--aca5--a5d390e12d6e-osd--block--4fb3e004--0e51--4e43--bffa--6f49db8c9f14 253:0    0   10G  0 lvm
target-geky-015:/home/ubuntu # sudo cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 93d2fd46-8ed5-11ea-a353-fa163e22acf0 -- ceph orch daemon add osd target-geky-015:vdb
WARNING: The same type, major and minor should not be used for multiple devices.
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 597, in _daemon_add_osd
    completion = self.create_osds(drive_group)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1542, in inner
    completion = self._oremote(method_name, args, kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1614, in _oremote
    return mgr.remote(o, meth, *args, **kwargs)
  File "/usr/share/ceph/mgr/mgr_module.py", line 1515, in remote
    args, kwargs)
RuntimeError: Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 559, in wrapper
    return AsyncCompletion(value=f(*args, **kwargs), name=f.__name__)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2142, in create_osds
    replace_osd_ids=drive_group.osd_id_claims.get(host, []))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2248, in _create_osd
    code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
INFO:cephadm:/usr/bin/podman:stderr stderr: lsblk: vdb: not a block device
INFO:cephadm:/usr/bin/podman:stderr stderr: blkid: error: vdb: No such file or directory
INFO:cephadm:/usr/bin/podman:stderr stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
INFO:cephadm:/usr/bin/podman:stderr usage: ceph-volume lvm prepare [-h] --data DATA [--data-size DATA_SIZE]
INFO:cephadm:/usr/bin/podman:stderr                                [--data-slots DATA_SLOTS] [--filestore]
INFO:cephadm:/usr/bin/podman:stderr                                [--journal JOURNAL]
INFO:cephadm:/usr/bin/podman:stderr                                [--journal-size JOURNAL_SIZE] [--bluestore]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.db BLOCK_DB]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.db-size BLOCK_DB_SIZE]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.db-slots BLOCK_DB_SLOTS]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.wal BLOCK_WAL]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.wal-size BLOCK_WAL_SIZE]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.wal-slots BLOCK_WAL_SLOTS]
INFO:cephadm:/usr/bin/podman:stderr                                [--osd-id OSD_ID] [--osd-fsid OSD_FSID]
INFO:cephadm:/usr/bin/podman:stderr                                [--cluster-fsid CLUSTER_FSID]
INFO:cephadm:/usr/bin/podman:stderr                                [--crush-device-class CRUSH_DEVICE_CLASS]
INFO:cephadm:/usr/bin/podman:stderr                                [--dmcrypt] [--no-systemd]
INFO:cephadm:/usr/bin/podman:stderr ceph-volume lvm prepare: error: Unable to proceed with non-existing device: vdb
target-geky-015:/home/ubuntu # cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 93d2fd46-8ed5-11ea-a353-fa163e22acf0 -- ceph orch daemon add osd target-geky-015:/dev/vdb
WARNING: The same type, major and minor should not be used for multiple devices.
Created osd(s) 3 on host 'target-geky-015'
#5 Updated by Sebastian Wagner almost 4 years ago
- Subject changed from Cephadm.py OSD deployment fails to Cephadm.py OSD deployment fails: full device path or just the name?
#6 Updated by Joshua Schmid almost 4 years ago
For some background on why this exists, see https://github.com/ceph/ceph/commit/f026a1c9f661fc1442048ef0bfadf84c35c14254
It's due to the way ceph-volume's subcommands expect drives (details in the commit).
I think teuthology and ceph-volume need a bit of work to make things more robust.
ceph-volume:
- Standardize the accepted format for all subcommands (needs verification whether that's still an issue).
It should not matter whether we pass in a VG/LV or an actual device.
teuthology:
- We need to be able to test on devices and on logical volumes without snipping/shortening device paths etc.
- Move to the agreed-upon `ceph orch apply osd` syntax and use OSDSpecs for teuthology (see the sketch below). This way we're much closer to an actual deployment.
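A rough sketch of what the OSDSpec route could look like for a teuthology target; the file name, service_id, host pattern and device paths are illustrative, not taken from an actual run:

cat > osd_spec.yml <<EOF
service_type: osd
service_id: teuthology_osds
placement:
  host_pattern: 'target-*'
data_devices:
  paths:
    - /dev/vdb
    - /dev/vdc
EOF
ceph orch apply osd -i osd_spec.yml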
#7 Updated by Georgios Kyratsas almost 4 years ago
The confusing part from my point of view is that, downstream, this commit that strips the device name breaks the tests. I can't see why it was needed in the first place, or why it works upstream, since `ceph-volume lvm prepare` doesn't seem to work without the whole path.
#8 Updated by Zac Dover almost 4 years ago
I'd like some feedback from the community (at as many levels as possible) about whether I should add a note to the docs that echoes Sage's comment on this, which is:
Zap needs a full path, but create/prepare needs the VG/LV
only if it is an existing LV.
We'll make c-v more friendly later.
All comments are welcome.
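To illustrate the distinction Sage describes (the device and VG/LV names below are placeholders, not taken from this ticket):

ceph-volume lvm zap /dev/vdb                   # zap: full device path required
ceph-volume lvm prepare --data /dev/vdb        # raw device: full path required
ceph-volume lvm prepare --data ceph-vg/osd-lv  # existing LV: the vg/lv form is accepted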
#9 Updated by Zac Dover almost 4 years ago
- Status changed from New to In Progress
#10 Updated by Georgios Kyratsas almost 4 years ago
So, as Joshua pointed out to me (I had completely missed it), the reason for my confusion is that the upstream teuthology setup has pre-created VGs/LVs, which differs from our downstream one where we are using raw devices. If there is a VG/LV, as Sage points out in his commit message, the `ceph orch daemon add osd` command works with just the short name, but when using raw disks the full path is needed. I guess this could be highlighted in the docs so people don't get confused like I did, but I'm not sure.
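A condensed recap: the raw-device commands and results below are taken from the session log in note #4, while the LV form follows Sage's comment and uses placeholder VG/LV names:

# raw disk (downstream setup): short name fails, full path works
ceph orch daemon add osd target-geky-015:vdb        # ceph-volume: Unable to proceed with non-existing device: vdb
ceph orch daemon add osd target-geky-015:/dev/vdb   # Created osd(s) 3 on host 'target-geky-015'

# pre-created VG/LV (upstream setup): the short vg/lv form is accepted,
# because ceph-volume lvm prepare resolves it as an existing logical volume
ceph orch daemon add osd target-geky-015:<vg>/<lv>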
#11 Updated by Sebastian Wagner about 3 years ago
- Status changed from In Progress to Can't reproduce