Bug #56975 (open)

Issue with creating OSDs, stuck on cmd `ceph --cluster ceph --name client.bootstrap-osd ...`

Added by Pavel Hladik over 1 year ago. Updated over 1 year ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 5 - suggestion
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi, great Ceph people!

I'm not sure if it is a bug or a newbie issue.

After a successful install of the latest Ceph cluster with the `cephadm` tool, I have no problem creating new OSDs on virtual machine disks, but when I try the same on hardware machine disks I get stuck creating the OSDs.

What I did:
  • `ceph orch device zap cephn1 /dev/sd[b-x] --force`
  • I can see all disks in the `AVAILABLE` state with the cmd `ceph orch device ls --wide`
  • when I run the cmd `ceph orch apply osd --all-available-devices`, I see that the correct container is created on node `cephn1`
  • that container gets stuck on the cmd `/usr/libexec/platform-python -s /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 42f400bf-986a-4ebd-ade2-3fa21e1f8736` and I'm not able to continue; the OSDs are not created
  • below is the error from `Dashboard / Services / osd.all-available-devices` from another attempt; it appeared after a couple of hours:

```
Failed to apply: cephadm exited with an error code: 1, stderr:Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd -e NODE_NAME=cephn1 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=all-available-devices -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/run/ceph:z -v /var/log/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/log/ceph:z -v /var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp_ldbc3lr:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpylzof8xb:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd lvm batch --no-auto /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx --yes --no-systemd
/usr/bin/podman: stderr --> passed data devices: 23 physical, 0 LVM
/usr/bin/podman: stderr --> relative data size: 1.0
/usr/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/usr/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new f7382141-b199-41ff-9b4d-b6782f133a9c
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1624, in send_command
/usr/bin/podman: stderr  stderr: cluster.mon_command, cmd, inbuf, timeout=timeout)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1536, in run_in_thread
/usr/bin/podman: stderr  stderr: raise Exception("timed out")
/usr/bin/podman: stderr  stderr: Exception: timed out
/usr/bin/podman: stderr  stderr: During handling of the above exception, another exception occurred:
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1699, in json_command
/usr/bin/podman: stderr  stderr: inbuf, timeout, verbose)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1546, in send_command_retry
/usr/bin/podman: stderr  stderr: return send_command(*args, **kwargs)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1650, in send_command
/usr/bin/podman: stderr  stderr: raise RuntimeError('"{0}": exception {1}'.format(cmd, e))
/usr/bin/podman: stderr  stderr: RuntimeError: "{"prefix": "get_command_descriptions"}": exception timed out
/usr/bin/podman: stderr  stderr: During handling of the above exception, another exception occurred:
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/bin/ceph", line 1326, in <module>
/usr/bin/podman: stderr  stderr: retval = main()
/usr/bin/podman: stderr  stderr: File "/usr/bin/ceph", line 1238, in main
/usr/bin/podman: stderr  stderr: prefix='get_command_descriptions')
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1703, in json_command
/usr/bin/podman: stderr  stderr: raise RuntimeError('"{0}": exception {1}'.format(argdict, e))
/usr/bin/podman: stderr  stderr: RuntimeError: "None": exception "{"prefix": "get_command_descriptions"}": exception timed out
/usr/bin/podman: stderr Traceback (most recent call last):
/usr/bin/podman: stderr   File "/usr/sbin/ceph-volume", line 11, in <module>
/usr/bin/podman: stderr     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/usr/bin/podman: stderr     self.main(self.argv)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
/usr/bin/podman: stderr     return f(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, self.argv)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 444, in main
/usr/bin/podman: stderr     self._execute(plan)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 463, in _execute
/usr/bin/podman: stderr     c.create(argparse.Namespace(**args))
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py", line 26, in create
/usr/bin/podman: stderr     prepare_step.safe_prepare(args)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py", line 252, in safe_prepare
/usr/bin/podman: stderr     self.prepare()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py", line 292, in prepare
/usr/bin/podman: stderr     self.osd_id = prepare_utils.create_id(osd_fsid, json.dumps(secrets), osd_id=self.args.osd_id)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 173, in create_id
/usr/bin/podman: stderr     raise RuntimeError('Unable to create a new OSD id')
/usr/bin/podman: stderr RuntimeError: Unable to create a new OSD id
Traceback (most recent call last):
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 9281, in <module>
main()
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 9269, in main
r = ctx.func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 2034, in _infer_config
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1950, in _infer_fsid
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 2062, in _infer_image
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1937, in _validate_fsid
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 5995, in command_ceph_volume
out, err, code = call_throws(ctx, c.run_cmd())
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1739, in call_throws
raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd -e NODE_NAME=cephn1 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=all-available-devices -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/run/ceph:z -v /var/log/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/log/ceph:z -v /var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp_ldbc3lr:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpylzof8xb:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd lvm batch --no-auto /dev/sdb /dev/sdc /dev/sdd /dev/sde /dev/sdf /dev/sdg /dev/sdh /dev/sdi /dev/sdj /dev/sdk /dev/sdl /dev/sdm /dev/sdn /dev/sdo /dev/sdp /dev/sdq /dev/sdr /dev/sds /dev/sdt /dev/sdu /dev/sdv /dev/sdw /dev/sdx --yes --no-systemd
```
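
The interesting part of the traceback is that even `get_command_descriptions` times out, i.e. the `ceph` CLI inside the ceph-volume container never gets a usable mon session before the `osd new` call. A rough sketch of how to check the mon command channel by hand from the affected node (just a sketch; `--connect-timeout` only makes the hang visible instead of waiting forever):

```
# Sketch: check whether the mon command channel works at all from cephn1;
# in the failing state this hangs the same way as the `osd new` call.
cephadm shell -- ceph --connect-timeout 30 -s
```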

The problem is somewhere between my virtual machine and the hardware machine. I use the same latest AlmaLinux and the same setup with my Ansible configs. I already checked whether the problem is in networking, but it looks fine. The behavior is the same with `podman` or `docker`. I also tried using just the `/dev/sdb` disk, but the result was the same.
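
The networking check was along these lines (the mon hostname is just an example; mons listen on 3300 for msgr v2 and 6789 for msgr v1):

```
# Rough sketch of the port reachability check (mon hostname illustrative).
for port in 3300 6789; do
    timeout 5 bash -c "echo > /dev/tcp/cephm1/$port" \
        && echo "port $port open" || echo "port $port unreachable"
done
```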

I also tried to create an OSD manually following this guide: https://docs.ceph.com/en/latest/install/manual-deployment/#long-form, and I got stuck there too.
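
For reference, the first steps of the long form, abridged from that guide; the `ceph osd new` call here is the same mon command the orchestrator run hangs on:

```
# Abridged from the linked long-form guide.
UUID=$(uuidgen)
OSD_SECRET=$(ceph-authtool --gen-print-key)
ID=$(echo "{\"cephx_secret\": \"$OSD_SECRET\"}" | \
    ceph osd new "$UUID" -i - \
    -n client.bootstrap-osd -k /var/lib/ceph/bootstrap-osd/ceph.keyring)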

Output from the `ceph orch device ls --wide` cmd:
```
HOST PATH TYPE TRANSPORT RPM DEVICE ID SIZE HEALTH IDENT FAULT AVAILABLE REFRESHED REJECT REASONS
cephn1 /dev/sdb hdd HGST_HUS726040AL_K3GL95TB 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdc hdd HGST_HUS726040AL_K3GMHVAB 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdd hdd HGST_HUS726040AL_K3GMHWGB 4000G N/A N/A Yes 54s ago
cephn1 /dev/sde hdd HGST_HUS726040AL_K3GL95DB 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdf hdd HGST_HUS726040AL_K3GMHVNB 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdg hdd HGST_HUS726040AL_NHGNPKMK 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdh hdd HGST_HUS726040AL_NHGRK8PY 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdi hdd HGST_HUS726040AL_NHGRD46Y 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdj hdd HGST_HUS726040AL_NHGN8JTK 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdk hdd HGST_HUS726040AL_K3GLV8LL 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdl hdd HGST_HUS726040AL_NHGN8K1K 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdm hdd HGST_HUS726040AL_NHGRTT8Y 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdn hdd HGST_HUS726040AL_K3GMHVJB 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdo hdd HGST_HUS726040AL_NHGRXVXY 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdp hdd HGST_HUS726040AL_K3GMHW6B 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdq hdd HGST_HUS726040AL_K3GPSNUL 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdr hdd HGST_HUS726040AL_NHGN89DK 4000G N/A N/A Yes 54s ago
cephn1 /dev/sds hdd HGST_HUS726040AL_K3GMNV9L 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdt hdd HGST_HUS726040AL_K3GPSNVL 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdu hdd HGST_HUS726040AL_NHGNSJRY 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdv hdd HGST_HUS726040AL_K3GMHVVB 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdw hdd HGST_HUS726040AL_K3GMHV7B 4000G N/A N/A Yes 54s ago
cephn1 /dev/sdx hdd HGST_HUS726040AL_NHGRD1SY 4000G N/A N/A Yes 54s ago
```

I'm using:
  • almalinux 8.6
  • podman 4.0.2 (I know from the docs that the latest supported should be 3.0, but I already tried Docker and the result was the same)
  • python 3.6.8
  • chrony 4.1
  • lvm2 2.03.14

Do you have any suggestions for what to check on the hardware machine, please?

Actions #1

Updated by Pavel Hladik over 1 year ago

The error in human-readable format:

```

cephadm exited with an error code: 1, stderr:Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd -e NODE_NAME=cephn1 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/run/ceph:z -v /var/log/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/log/ceph:z -v /var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp_f_e9s2z:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpqlld0i1b:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd lvm batch --no-auto /dev/sdb --yes --no-systemd
/usr/bin/podman: stderr --> passed data devices: 1 physical, 0 LVM
/usr/bin/podman: stderr --> relative data size: 1.0
/usr/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/usr/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 97ffa914-db2c-4b6f-8f42-c74960fa9575
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1624, in send_command
/usr/bin/podman: stderr  stderr: cluster.mon_command, cmd, inbuf, timeout=timeout)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1536, in run_in_thread
/usr/bin/podman: stderr  stderr: raise Exception("timed out")
/usr/bin/podman: stderr  stderr: Exception: timed out
/usr/bin/podman: stderr  stderr: During handling of the above exception, another exception occurred:
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1699, in json_command
/usr/bin/podman: stderr  stderr: inbuf, timeout, verbose)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1546, in send_command_retry
/usr/bin/podman: stderr  stderr: return send_command(*args, **kwargs)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1650, in send_command
/usr/bin/podman: stderr  stderr: raise RuntimeError('"{0}": exception {1}'.format(cmd, e))
/usr/bin/podman: stderr  stderr: RuntimeError: "{"prefix": "get_command_descriptions"}": exception timed out
/usr/bin/podman: stderr  stderr: During handling of the above exception, another exception occurred:
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/bin/ceph", line 1326, in <module>
/usr/bin/podman: stderr  stderr: retval = main()
/usr/bin/podman: stderr  stderr: File "/usr/bin/ceph", line 1238, in main
/usr/bin/podman: stderr  stderr: prefix='get_command_descriptions')
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1703, in json_command
/usr/bin/podman: stderr  stderr: raise RuntimeError('"{0}": exception {1}'.format(argdict, e))
/usr/bin/podman: stderr  stderr: RuntimeError: "None": exception "{"prefix": "get_command_descriptions"}": exception timed out
/usr/bin/podman: stderr Traceback (most recent call last):
/usr/bin/podman: stderr   File "/usr/sbin/ceph-volume", line 11, in <module>
/usr/bin/podman: stderr     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/usr/bin/podman: stderr     self.main(self.argv)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
/usr/bin/podman: stderr     return f(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, self.argv)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 444, in main
/usr/bin/podman: stderr     self._execute(plan)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 463, in _execute
/usr/bin/podman: stderr     c.create(argparse.Namespace(**args))
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py", line 26, in create
/usr/bin/podman: stderr     prepare_step.safe_prepare(args)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py", line 252, in safe_prepare
/usr/bin/podman: stderr     self.prepare()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py", line 292, in prepare
/usr/bin/podman: stderr     self.osd_id = prepare_utils.create_id(osd_fsid, json.dumps(secrets), osd_id=self.args.osd_id)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 173, in create_id
/usr/bin/podman: stderr     raise RuntimeError('Unable to create a new OSD id')
/usr/bin/podman: stderr RuntimeError: Unable to create a new OSD id
Traceback (most recent call last):
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 9281, in <module>
main()
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 9269, in main
r = ctx.func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 2034, in _infer_config
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1950, in _infer_fsid
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 2062, in _infer_image
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1937, in _validate_fsid
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 5995, in command_ceph_volume
out, err, code = call_throws(ctx, c.run_cmd())
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1739, in call_throws
raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd -e NODE_NAME=cephn1 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/run/ceph:z -v /var/log/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/log/ceph:z -v /var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp_f_e9s2z:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpqlld0i1b:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd lvm batch --no-auto /dev/sdb --yes --no-systemd
Traceback (most recent call last):
File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 125, in wrapper
return OrchResult(f(*args, **kwargs))
File "/usr/share/ceph/mgr/cephadm/module.py", line 2256, in create_osds
return self.osd_service.create_from_spec(drive_group)
File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 77, in create_from_spec
ret = self.mgr.wait_async(all_hosts())
File "/usr/share/ceph/mgr/cephadm/module.py", line 590, in wait_async
return self.event_loop.get_result(coro)
File "/usr/share/ceph/mgr/cephadm/ssh.py", line 48, in get_result
return asyncio.run_coroutine_threadsafe(coro, self._loop).result()
File "/lib64/python3.6/concurrent/futures/_base.py", line 432, in result
return self.__get_result()
File "/lib64/python3.6/concurrent/futures/_base.py", line 384, in __get_result
raise self._exception
File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 75, in all_hosts
return await gather(*futures)
File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 64, in create_from_spec_one
replace_osd_ids=osd_id_claims_for_host, env_vars=env_vars
File "/usr/share/ceph/mgr/cephadm/services/osd.py", line 95, in create_single_host
code, '\n'.join(err)))
RuntimeError: cephadm exited with an error code: 1, stderr:Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd -e NODE_NAME=cephn1 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/run/ceph:z -v /var/log/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/log/ceph:z -v /var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp_f_e9s2z:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpqlld0i1b:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd lvm batch --no-auto /dev/sdb --yes --no-systemd
/usr/bin/podman: stderr --> passed data devices: 1 physical, 0 LVM
/usr/bin/podman: stderr --> relative data size: 1.0
/usr/bin/podman: stderr Running command: /usr/bin/ceph-authtool --gen-print-key
/usr/bin/podman: stderr Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 97ffa914-db2c-4b6f-8f42-c74960fa9575
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1624, in send_command
/usr/bin/podman: stderr  stderr: cluster.mon_command, cmd, inbuf, timeout=timeout)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1536, in run_in_thread
/usr/bin/podman: stderr  stderr: raise Exception("timed out")
/usr/bin/podman: stderr  stderr: Exception: timed out
/usr/bin/podman: stderr  stderr: During handling of the above exception, another exception occurred:
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1699, in json_command
/usr/bin/podman: stderr  stderr: inbuf, timeout, verbose)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1546, in send_command_retry
/usr/bin/podman: stderr  stderr: return send_command(*args, **kwargs)
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1650, in send_command
/usr/bin/podman: stderr  stderr: raise RuntimeError('"{0}": exception {1}'.format(cmd, e))
/usr/bin/podman: stderr  stderr: RuntimeError: "{"prefix": "get_command_descriptions"}": exception timed out
/usr/bin/podman: stderr  stderr: During handling of the above exception, another exception occurred:
/usr/bin/podman: stderr  stderr: Traceback (most recent call last):
/usr/bin/podman: stderr  stderr: File "/usr/bin/ceph", line 1326, in <module>
/usr/bin/podman: stderr  stderr: retval = main()
/usr/bin/podman: stderr  stderr: File "/usr/bin/ceph", line 1238, in main
/usr/bin/podman: stderr  stderr: prefix='get_command_descriptions')
/usr/bin/podman: stderr  stderr: File "/usr/lib/python3.6/site-packages/ceph_argparse.py", line 1703, in json_command
/usr/bin/podman: stderr  stderr: raise RuntimeError('"{0}": exception {1}'.format(argdict, e))
/usr/bin/podman: stderr  stderr: RuntimeError: "None": exception "{"prefix": "get_command_descriptions"}": exception timed out
/usr/bin/podman: stderr Traceback (most recent call last):
/usr/bin/podman: stderr   File "/usr/sbin/ceph-volume", line 11, in <module>
/usr/bin/podman: stderr     load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 41, in __init__
/usr/bin/podman: stderr     self.main(self.argv)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
/usr/bin/podman: stderr     return f(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 153, in main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, subcommand_args)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 46, in main
/usr/bin/podman: stderr     terminal.dispatch(self.mapper, self.argv)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
/usr/bin/podman: stderr     instance.main()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 444, in main
/usr/bin/podman: stderr     self._execute(plan)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 463, in _execute
/usr/bin/podman: stderr     c.create(argparse.Namespace(**args))
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py", line 26, in create
/usr/bin/podman: stderr     prepare_step.safe_prepare(args)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py", line 252, in safe_prepare
/usr/bin/podman: stderr     self.prepare()
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
/usr/bin/podman: stderr     return func(*a, **kw)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/prepare.py", line 292, in prepare
/usr/bin/podman: stderr     self.osd_id = prepare_utils.create_id(osd_fsid, json.dumps(secrets), osd_id=self.args.osd_id)
/usr/bin/podman: stderr   File "/usr/lib/python3.6/site-packages/ceph_volume/util/prepare.py", line 173, in create_id
/usr/bin/podman: stderr     raise RuntimeError('Unable to create a new OSD id')
/usr/bin/podman: stderr RuntimeError: Unable to create a new OSD id
Traceback (most recent call last):
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 9281, in <module>
main()
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 9269, in main
r = ctx.func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 2034, in _infer_config
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1950, in _infer_fsid
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 2062, in _infer_image
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1937, in _validate_fsid
return func(ctx)
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 5995, in command_ceph_volume
out, err, code = call_throws(ctx, c.run_cmd())
File "/var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/cephadm.9f5947044415209b488d90a5f9b9807235c9c2dc29836a12b73019a60ec20bde", line 1739, in call_throws
raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd -e NODE_NAME=cephn1 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -e CEPH_VOLUME_SKIP_RESTORECON=yes -e CEPH_VOLUME_DEBUG=1 -v /var/run/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/run/ceph:z -v /var/log/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f:/var/log/ceph:z -v /var/lib/ceph/f67fddc2-0d83-11ed-97ec-b6a268deb32f/crash:/var/lib/ceph/crash:z -v /run/systemd/journal:/run/systemd/journal -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /:/rootfs -v /tmp/ceph-tmp_f_e9s2z:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpqlld0i1b:/var/lib/ceph/bootstrap-osd/ceph.keyring:z quay.io/ceph/ceph@sha256:1ed3a9d5b1ba45232047ed4b8ce7e265f2413f98eb27ab72c8acab428921d7cd lvm batch --no-auto /dev/sdb --yes --no-systemd

```

Actions #2

Updated by Pavel Hladik over 1 year ago

Update:
The problem isn't just between a virtual machine and a hardware machine; it is somehow related to where my testing virtual machines run. If the master and all nodes run on one hypervisor, everything is fine, but when I move a node to another hypervisor I hit the error. It seems like a network issue, but all ports/services are reachable. I tried completely disabling firewalld on the virtual machines, but with no luck.
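
Open ports don't rule out a path-MTU mismatch between the hypervisors, which would fit this kind of hang (small control packets pass, larger messenger traffic stalls). A quick probe, assuming 9000- and 1500-byte MTUs (adjust to your network):

```
# Probe for path-MTU problems: non-fragmentable pings sized just under
# the expected MTU (payload = MTU - 28 bytes of IP/ICMP headers).
ping -c 3 -M do -s 8972 cephn1   # assumes a 9000-byte (jumbo) MTU
ping -c 3 -M do -s 1472 cephn1   # assumes a standard 1500-byte MTU
```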

Actions #3

Updated by Pavel Hladik over 1 year ago

Please close the ticket, I've found the issue.
