Documentation #45383

Cephadm.py OSD deployment fails: full device path or just the name?

Added by Georgios Kyratsas almost 4 years ago. Updated about 3 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
cephadm
Target version:
% Done:

0%

Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

OSD deployment in cephadm.py fails on my local teuthology server because the device is not recognized. When I simply revert commit f026a1c it works fine for me, so could someone clarify whether the short-name syntax is supposed to work or not?

2020-05-04T09:50:46.693 INFO:tasks.cephadm:Deploying osd.0 on target-geky-069 with /dev/vde...
2020-05-04T09:50:46.694 INFO:teuthology.orchestra.run.target-geky-069:> sudo cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid ee0a3b12-8dea-11ea-9277-fa163e22acf0 -- ceph-volume lvm zap /dev/vde
2020-05-04T09:50:48.303 INFO:teuthology.orchestra.run.target-geky-069.stderr:--> Zapping: /dev/vde
2020-05-04T09:50:48.303 INFO:teuthology.orchestra.run.target-geky-069.stderr:--> --destroy was not specified, but zapping a whole device will remove the partition table
2020-05-04T09:50:48.304 INFO:teuthology.orchestra.run.target-geky-069.stderr:Running command: /usr/bin/dd if=/dev/zero of=/dev/vde bs=1M count=10 conv=fsync
2020-05-04T09:50:48.304 INFO:teuthology.orchestra.run.target-geky-069.stderr: stderr: 10+0 records in
2020-05-04T09:50:48.305 INFO:teuthology.orchestra.run.target-geky-069.stderr:10+0 records out
2020-05-04T09:50:48.306 INFO:teuthology.orchestra.run.target-geky-069.stderr: stderr: 10485760 bytes (10 MB, 10 MiB) copied, 0.0997165 s, 105 MB/s
2020-05-04T09:50:48.307 INFO:teuthology.orchestra.run.target-geky-069.stderr:--> Zapping successful for: <Raw Device: /dev/vde>
2020-05-04T09:50:48.437 INFO:teuthology.orchestra.run.target-geky-069:> sudo cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid ee0a3b12-8dea-11ea-9277-fa163e22acf0 -- ceph orch daemon add osd target-geky-069:vde
2020-05-04T09:50:51.597 INFO:teuthology.orchestra.run.target-geky-069.stderr:Error EINVAL: Traceback (most recent call last):
2020-05-04T09:50:51.598 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
2020-05-04T09:50:51.598 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return self.handle_command(inbuf, cmd)
2020-05-04T09:50:51.599 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
2020-05-04T09:50:51.599 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
2020-05-04T09:50:51.600 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
2020-05-04T09:50:51.600 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return self.func(mgr, **kwargs)
2020-05-04T09:50:51.601 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
2020-05-04T09:50:51.602 INFO:teuthology.orchestra.run.target-geky-069.stderr:    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
2020-05-04T09:50:51.602 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
2020-05-04T09:50:51.603 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return func(*args, **kwargs)
2020-05-04T09:50:51.603 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/module.py", line 597, in _daemon_add_osd
2020-05-04T09:50:51.604 INFO:teuthology.orchestra.run.target-geky-069.stderr:    completion = self.create_osds(drive_group)
2020-05-04T09:50:51.604 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1542, in inner
2020-05-04T09:50:51.604 INFO:teuthology.orchestra.run.target-geky-069.stderr:    completion = self._oremote(method_name, args, kwargs)
2020-05-04T09:50:51.605 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1614, in _oremote
2020-05-04T09:50:51.606 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return mgr.remote(o, meth, *args, **kwargs)
2020-05-04T09:50:51.607 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 1515, in remote
2020-05-04T09:50:51.607 INFO:teuthology.orchestra.run.target-geky-069.stderr:    args, kwargs)
2020-05-04T09:50:51.607 INFO:teuthology.orchestra.run.target-geky-069.stderr:RuntimeError: Remote method threw exception: Traceback (most recent call last):
2020-05-04T09:50:51.608 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/cephadm/module.py", line 559, in wrapper
2020-05-04T09:50:51.608 INFO:teuthology.orchestra.run.target-geky-069.stderr:    return AsyncCompletion(value=f(*args, **kwargs), name=f.__name__)
2020-05-04T09:50:51.608 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/cephadm/module.py", line 2142, in create_osds
2020-05-04T09:50:51.609 INFO:teuthology.orchestra.run.target-geky-069.stderr:    replace_osd_ids=drive_group.osd_id_claims.get(host, []))
2020-05-04T09:50:51.609 INFO:teuthology.orchestra.run.target-geky-069.stderr:  File "/usr/share/ceph/mgr/cephadm/module.py", line 2248, in _create_osd
2020-05-04T09:50:51.609 INFO:teuthology.orchestra.run.target-geky-069.stderr:    code, '\n'.join(err)))
2020-05-04T09:50:51.610 INFO:teuthology.orchestra.run.target-geky-069.stderr:RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr  stderr: lsblk: vde: not a block device
2020-05-04T09:50:51.610 INFO:teuthology.orchestra.run.target-geky-069.stderr:INFO:cephadm:/usr/bin/podman:stderr  stderr: blkid: error: vde: No such file or directory
2020-05-04T09:50:51.611 INFO:teuthology.orchestra.run.target-geky-069.stderr:INFO:cephadm:/usr/bin/podman:stderr  stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.

Traceback is:

Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 597, in _daemon_add_osd
    completion = self.create_osds(drive_group)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1542, in inner
    completion = self._oremote(method_name, args, kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1614, in _oremote
    return mgr.remote(o, meth, *args, **kwargs)
  File "/usr/share/ceph/mgr/mgr_module.py", line 1515, in remote
    args, kwargs)
RuntimeError: Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 559, in wrapper
    return AsyncCompletion(value=f(*args, **kwargs), name=f.__name__)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2142, in create_osds
    replace_osd_ids=drive_group.osd_id_claims.get(host, []))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2248, in _create_osd
    code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr  stderr: lsblk: vde: not a block device

History

#1 Updated by Sebastian Wagner almost 4 years ago

  • Description updated (diff)

Looks like you called

ceph orch daemon add osd ... 

on a host that doesn't have a device named vde.

#2 Updated by Georgios Kyratsas almost 4 years ago

There was a device /dev/vde, and it worked when I dropped the short name and just used the whole path (https://github.com/ceph/ceph/blob/94e39db4e803cf77a03190957871a213c9ced2ad/qa/tasks/cephadm.py#L598). Since cephadm expects the whole device path, I don't see how this syntax is supposed to work. The docs also show the whole path after the hostname: https://docs.ceph.com/docs/master/cephadm/install/#deploy-osds
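
For illustration, a minimal Python sketch of the two invocation styles being compared; osd_add_cmd() is a hypothetical helper, not code from qa/tasks/cephadm.py:

# Hedged sketch: short-name vs. full-path forms of the OSD add command.
def osd_add_cmd(host, device):
    """Build the orchestrator CLI call used to create one OSD on `host`."""
    return ['ceph', 'orch', 'daemon', 'add', 'osd', '{}:{}'.format(host, device)]

# Short name: fails against a raw disk ("lsblk: vde: not a block device").
print(' '.join(osd_add_cmd('target-geky-069', 'vde')))
# Full path: works against a raw disk.
print(' '.join(osd_add_cmd('target-geky-069', '/dev/vde')))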

#3 Updated by Sebastian Wagner almost 4 years ago

  • Tracker changed from Bug to Documentation

#4 Updated by Georgios Kyratsas almost 4 years ago

I triggered a run with `sleep-before-teardown` to make it clearer.

target-geky-015:/home/ubuntu # lsblk
NAME                                                                                                  MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda                                                                                                   254:0    0   80G  0 disk 
├─vda1                                                                                                254:1    0    2M  0 part 
├─vda2                                                                                                254:2    0  512M  0 part /boot/efi
└─vda3                                                                                                254:3    0 79.5G  0 part /
vdb                                                                                                   254:16   0   10G  0 disk 
vdc                                                                                                   254:32   0   10G  0 disk 
vdd                                                                                                   254:48   0   10G  0 disk 
vde                                                                                                   254:64   0   10G  0 disk 
└─ceph--efbac9cf--33ab--48b2--aca5--a5d390e12d6e-osd--block--4fb3e004--0e51--4e43--bffa--6f49db8c9f14 253:0    0   10G  0 lvm  
target-geky-015:/home/ubuntu # sudo cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 93d2fd46-8ed5-11ea-a353-fa163e22acf0 -- ceph orch daemon add osd target-geky-015:vdb
WARNING: The same type, major and minor should not be used for multiple devices.
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 597, in _daemon_add_osd
    completion = self.create_osds(drive_group)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1542, in inner
    completion = self._oremote(method_name, args, kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 1614, in _oremote
    return mgr.remote(o, meth, *args, **kwargs)
  File "/usr/share/ceph/mgr/mgr_module.py", line 1515, in remote
    args, kwargs)
RuntimeError: Remote method threw exception: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 559, in wrapper
    return AsyncCompletion(value=f(*args, **kwargs), name=f.__name__)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2142, in create_osds
    replace_osd_ids=drive_group.osd_id_claims.get(host, []))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 2248, in _create_osd
    code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
INFO:cephadm:/usr/bin/podman:stderr  stderr: lsblk: vdb: not a block device
INFO:cephadm:/usr/bin/podman:stderr  stderr: blkid: error: vdb: No such file or directory
INFO:cephadm:/usr/bin/podman:stderr  stderr: Unknown device, --name=, --path=, or absolute path in /dev/ or /sys expected.
INFO:cephadm:/usr/bin/podman:stderr usage: ceph-volume lvm prepare [-h] --data DATA [--data-size DATA_SIZE]
INFO:cephadm:/usr/bin/podman:stderr                                [--data-slots DATA_SLOTS] [--filestore]
INFO:cephadm:/usr/bin/podman:stderr                                [--journal JOURNAL]
INFO:cephadm:/usr/bin/podman:stderr                                [--journal-size JOURNAL_SIZE] [--bluestore]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.db BLOCK_DB]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.db-size BLOCK_DB_SIZE]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.db-slots BLOCK_DB_SLOTS]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.wal BLOCK_WAL]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.wal-size BLOCK_WAL_SIZE]
INFO:cephadm:/usr/bin/podman:stderr                                [--block.wal-slots BLOCK_WAL_SLOTS]
INFO:cephadm:/usr/bin/podman:stderr                                [--osd-id OSD_ID] [--osd-fsid OSD_FSID]
INFO:cephadm:/usr/bin/podman:stderr                                [--cluster-fsid CLUSTER_FSID]
INFO:cephadm:/usr/bin/podman:stderr                                [--crush-device-class CRUSH_DEVICE_CLASS]
INFO:cephadm:/usr/bin/podman:stderr                                [--dmcrypt] [--no-systemd]
INFO:cephadm:/usr/bin/podman:stderr ceph-volume lvm prepare: error: Unable to proceed with non-existing device: vdb
target-geky-015:/home/ubuntu # cephadm --image registry.suse.de/devel/storage/7.0/cr/containers/ses/7/ceph/ceph shell -c /etc/ceph/ceph.conf -k /etc/ceph/ceph.client.admin.keyring --fsid 93d2fd46-8ed5-11ea-a353-fa163e22acf0 -- ceph orch daemon add osd target-geky-015:/dev/vdb
WARNING: The same type, major and minor should not be used for multiple devices.
Created osd(s) 3 on host 'target-geky-015'

#5 Updated by Sebastian Wagner almost 4 years ago

  • Subject changed from Cephadm.py OSD deployment fails to Cephadm.py OSD deployment fails: full device path or just the name?

#6 Updated by Joshua Schmid almost 4 years ago

For some background on why this exists, see https://github.com/ceph/ceph/commit/f026a1c9f661fc1442048ef0bfadf84c35c14254

It's due to the way ceph-volume's subcommands expect drives (details in the commit).

I think teuthology and ceph-volume need a bit of work to make things more robust.

ceph-volume:

  • Standardize the accepted format for all subcommands (needs verification whether this is still an issue).
    It should not matter whether we pass in a VG/LV or an actual device.

teuthology:

  • We need to be able to test on devices and on logical volumes without snipping/shortening device paths, etc.
  • Move to the agreed-upon `ceph orch apply osd` syntax and use OSDSpecs for teuthology; that way we're much closer to an actual deployment (see the sketch after this list).
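
As a rough illustration of the OSDSpec direction, a minimal Python sketch: the spec fields assume the drive group style layout (`data_devices: all`), and run_shell and the file path are hypothetical stand-ins for teuthology's remote `cephadm shell` helper:

import yaml

# A minimal OSD service spec: apply to all hosts and let ceph-volume
# consume every free device, instead of naming devices one by one.
osd_spec = {
    'service_type': 'osd',
    'service_id': 'teuthology_osds',      # hypothetical name
    'placement': {'host_pattern': '*'},
    'data_devices': {'all': True},
}

def apply_osd_spec(run_shell, spec, path='/tmp/osd_spec.yml'):
    """Write the spec and hand it to `ceph orch apply osd -i <file>` rather
    than calling `ceph orch daemon add osd host:device` per device."""
    with open(path, 'w') as f:
        yaml.safe_dump(spec, f)
    run_shell(['ceph', 'orch', 'apply', 'osd', '-i', path])

# Example wiring with a trivial stand-in that just prints the command:
apply_osd_spec(lambda args: print(' '.join(args)), osd_spec)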

#7 Updated by Georgios Kyratsas almost 4 years ago

The confusing part, from my point of view, is that downstream this commit that strips the device name breaks the tests, and I cannot see why it was needed in the first place, nor why it works upstream, since `ceph-volume lvm prepare` doesn't seem to work without the whole path.

#8 Updated by Zac Dover almost 4 years ago

I'd like some feedback from the community (at as many levels as possible) about whether I should add a note to the docs that echoes Sage's comment on this, which is:

Zap needs a full path, but create/prepare needs the VG/LV
only if it is an existing LV.
We'll make c-v more friendly later.

All comments are welcome.
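
To make the quoted rule concrete, here is a small Python sketch of the distinction; it only illustrates Sage's comment and is not actual cephadm or ceph-volume code:

# Illustration of the rule above, not actual cephadm/ceph-volume code.
def device_arg(step, device_path, existing_lv=None):
    """Pick the identifier each ceph-volume step expects.

    step        -- 'zap' or 'prepare'
    device_path -- full path such as '/dev/vde'
    existing_lv -- 'vg/lv' name when deploying onto a pre-created LV
    """
    if step == 'zap':
        return device_path          # zap always wants the full device path
    if existing_lv:
        return existing_lv          # create/prepare takes vg/lv for an existing LV
    return device_path              # raw devices still need the full path

assert device_arg('zap', '/dev/vde') == '/dev/vde'
assert device_arg('prepare', '/dev/vde') == '/dev/vde'
assert device_arg('prepare', '/dev/vde', existing_lv='vg0/lv0') == 'vg0/lv0'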

#9 Updated by Zac Dover almost 4 years ago

  • Status changed from New to In Progress

#10 Updated by Georgios Kyratsas almost 4 years ago

As Joshua pointed out to me (I had completely missed it), the reason for my confusion is that the upstream teuthology setup has pre-created VGs/LVs, whereas our downstream setup uses raw devices. If there is a VG/LV, as Sage notes in his commit message, the `ceph orch daemon add osd` command works with just the short name, but when using raw disks the full path is needed. I guess this could be highlighted in the docs so people don't get confused like I did, but I'm not sure...

#11 Updated by Sebastian Wagner about 3 years ago

  • Status changed from In Progress to Can't reproduce
