Bug #53827

cephadm exited with error code when creating osd: Input/Output error. Faulty NVME?

Added by Sridhar Seshasayee about 2 years ago. Updated about 1 month ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2022-01-06_15:57:04-rados-wip-yuri6-testing-2022-01-05-1255-distro-default-smithi/6599513

Teuthology Log:

2022-01-07T02:41:59.965 INFO:teuthology.orchestra.run.smithi039.stderr:2022-01-07T02:41:59.964+0000 7f76bcff9700  1 -- 172.21.15.39:0/4064618770 <== mgr.14164 v2:172.21.15.39:6800/30744276 1 ==== mgr_command_reply(tid 0: -22 Traceback (most recent call last):
2022-01-07T02:41:59.966 INFO:teuthology.orchestra.run.smithi039.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 1335, in _handle_command
2022-01-07T02:41:59.966 INFO:teuthology.orchestra.run.smithi039.stderr:    return self.handle_command(inbuf, cmd)
2022-01-07T02:41:59.967 INFO:teuthology.orchestra.run.smithi039.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 167, in handle_command
2022-01-07T02:41:59.967 INFO:teuthology.orchestra.run.smithi039.stderr:    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
2022-01-07T02:41:59.967 INFO:teuthology.orchestra.run.smithi039.stderr:  File "/usr/share/ceph/mgr/mgr_module.py", line 389, in call
2022-01-07T02:41:59.968 INFO:teuthology.orchestra.run.smithi039.stderr:    return self.func(mgr, **kwargs)
2022-01-07T02:41:59.968 INFO:teuthology.orchestra.run.smithi039.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda>
2022-01-07T02:41:59.968 INFO:teuthology.orchestra.run.smithi039.stderr:    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # noqa: E731
2022-01-07T02:41:59.968 INFO:teuthology.orchestra.run.smithi039.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
2022-01-07T02:41:59.969 INFO:teuthology.orchestra.run.smithi039.stderr:    return func(*args, **kwargs)
2022-01-07T02:41:59.969 INFO:teuthology.orchestra.run.smithi039.stderr:  File "/usr/share/ceph/mgr/orchestrator/module.py", line 795, in _daemon_add_osd
2022-01-07T02:41:59.969 INFO:teuthology.orchestra.run.smithi039.stderr:    raise_if_exception(completion)
2022-01-07T02:41:59.969 INFO:teuthology.orchestra.run.smithi039.stderr:  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 224, in raise_if_exception
2022-01-07T02:41:59.969 INFO:teuthology.orchestra.run.smithi039.stderr:    raise e
2022-01-07T02:41:59.970 INFO:teuthology.orchestra.run.smithi039.stderr:RuntimeError: cephadm exited with an error code: 1, stderr:/bin/podman: --> passed data devices: 0 physical, 1 LVM

From smithi039/log/fede34b0-6f62-11ec-8c32-001a4aab830c/ceph-mgr.smithi039.sgbygl.log.gz:

/bin/podman: Running command: /usr/bin/ceph-authtool /var/lib/ceph/osd/ceph-0/keyring --create-keyring --name osd.0 --add-key AQB0qNdhNBsaDhAAGiZ21t6APPNjeS9w7sIqGg==
/bin/podman:  stdout: creating /var/lib/ceph/osd/ceph-0/keyring
/bin/podman: added entity osd.0 auth(key=AQB0qNdhNBsaDhAAGiZ21t6APPNjeS9w7sIqGg==)
/bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
/bin/podman: Running command: /usr/bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
/bin/podman: Running command: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osdspec-affinity None --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 1284e78b-cc47-40dd-95c2-2f0f9d440e74 --setuser ceph --setgroup ceph
/bin/podman:  stderr: 2022-01-07T02:41:57.018+0000 7ff45e8a9080 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _read_fsid unparsable uuid
/bin/podman:  stderr: 2022-01-07T02:41:58.295+0000 7ff45e8a9080 -1 bluefs _replay 0x0: stop: uuid 00000000-0000-0000-0000-000000000000 != super.uuid 44ae84a1-bb60-4228-aea8-04e4b7798d52, block dump:
/bin/podman:  stderr: 00000000  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
/bin/podman:  stderr: *
/bin/podman:  stderr: 00000ff0  00 00 00 00 00 00 00 00  00 00 00 00 00 00 00 00  |................|
/bin/podman:  stderr: 00001000
/bin/podman:  stderr: 2022-01-07T02:41:58.795+0000 7ff45e8a9080 -1 bluestore(/var/lib/ceph/osd/ceph-0/) _open_db erroring opening db:
/bin/podman:  stderr: 2022-01-07T02:41:59.308+0000 7ff45e8a9080 -1 OSD::mkfs: ObjectStore::mkfs failed with error (5) Input/output error
/bin/podman:  stderr: 2022-01-07T02:41:59.308+0000 7ff45e8a9080 -1 ^[[0;31m ** ERROR: error creating empty object store in /var/lib/ceph/osd/ceph-0/: (5) Input/output error^[[0m
/bin/podman: --> Was unable to complete a new OSD, will rollback changes

...

/bin/podman: -->  RuntimeError: Command failed with exit code 250: /usr/bin/ceph-osd --cluster ceph --osd-objectstore bluestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osdspec-affinity None --osd-data /var/lib/ceph/osd/ceph-0/ --osd-uuid 1284e78b-cc47-40dd-95c2-2f0f9d440e74 --setuser ceph --setgroup ceph
Traceback (most recent call last):
  File "/var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90", line 8029, in <module>
    main()
  File "/var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90", line 8017, in main
    r = ctx.func(ctx)
  File "/var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90", line 1654, in _infer_fsid
    return func(ctx)
  File "/var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90", line 1738, in _infer_image
    return func(ctx)
  File "/var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90", line 4514, in command_ceph_volume
    out, err, code = call_throws(ctx, c.run_cmd(), verbosity=verbosity)
  File "/var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/cephadm.30cb78bdbbafb384af862e1c2292b944f15942b586128e91262b43e91e11ae90", line 1464, in call_throws
    raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: /bin/podman run --rm --ipc=host --no-hosts --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk --init -e CONTAINER_IMAGE=docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 -e NODE_NAME=smithi039 -e CEPH_USE_RANDOM_NONCE=1 -e CEPH_VOLUME_OSDSPEC_AFFINITY=None -v /var/run/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c:/var/run/ceph:z -v /var/log/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c:/var/log/ceph:z -v /var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /var/lib/ceph/fede34b0-6f62-11ec-8c32-001a4aab830c/selinux:/sys/fs/selinux:ro -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm -v /tmp/ceph-tmpm1ia1yz8:/etc/ceph/ceph.conf:z -v /tmp/ceph-tmpfawz2wbg:/var/lib/ceph/bootstrap-osd/ceph.keyring:z docker.io/ceph/ceph@sha256:54e95ae1e11404157d7b329d0bef866ebbb214b195a009e87aae4eba9d282949 lvm batch --no-auto vg_nvme/lv_4 --yes --no-systemd
) v1 -- 0x55ecf3c32160 con 0x55ecf45dd800
2022-01-07T02:41:59.964+0000 7f977bfc3700 10 mgr.server operator()  command returned -22
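The block dump above is the core of the failure: the region where BlueFS expects its superblock reads back as 4 KiB of zeros, so the decoded uuid is the nil uuid (00000000-0000-0000-0000-000000000000), which can never match super.uuid, and ceph-osd's mkfs aborts with (5) Input/output error. That an immediately preceding write reads back as zeros is what points at the device (or its LVM backing) rather than at cephadm. A minimal sketch of that comparison, illustrative only and not the real BlueFS on-disk layout (the helper name and the uuid-at-offset-0 layout are assumptions for illustration):

```python
# Illustrative sketch only -- not Ceph code and not the actual BlueFS
# superblock format. It mimics the check that fails in the log above:
# decode a uuid from the superblock region and compare it to the
# expected super.uuid.
import uuid

NIL_UUID = uuid.UUID(int=0)  # 00000000-0000-0000-0000-000000000000


def check_superblock(block: bytes, expected: uuid.UUID) -> bool:
    """Return True if the first 16 bytes of `block` decode to `expected`.

    A device that silently drops writes returns all zeros, which decodes
    to the nil uuid and can never match a real super.uuid.
    """
    found = uuid.UUID(bytes=block[:16])
    return found == expected


# The uuids from the log: the freshly generated fsid vs. what was read back.
super_uuid = uuid.UUID("44ae84a1-bb60-4228-aea8-04e4b7798d52")
zero_block = bytes(4096)  # matches the all-zero block dump in the log

assert uuid.UUID(bytes=zero_block[:16]) == NIL_UUID
assert not check_superblock(zero_block, super_uuid)  # -> mkfs fails with EIO
```

In the real code path this mismatch is surfaced as the `bluefs _replay 0x0: stop: uuid ... != super.uuid` line, after which `_open_db` fails and `ObjectStore::mkfs` returns (5).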

Related issues

Related to Orchestrator - Bug #49287: podman: setting cgroup config for procHooks process caused: Unit libpod-$hash.scope not found New

History

#1 Updated by Laura Flores about 2 years ago

/a/yuriw-2022-01-11_19:17:55-rados-wip-yuri5-testing-2022-01-11-0843-distro-default-smithi/6608507

#2 Updated by Laura Flores about 2 years ago

/a/yuriw-2022-01-15_05:47:18-rados-wip-yuri8-testing-2022-01-14-1551-distro-default-smithi/6619373

#3 Updated by Sebastian Wagner about 2 years ago

  • Project changed from Orchestrator to sepia
  • Subject changed from cephadm exited with error code when creating osd. to cephadm exited with error code when creating osd: Input/Output error. Faulty NVME?
  • Category deleted (cephadm)

#4 Updated by Laura Flores about 2 years ago

/a/yuriw-2022-02-24_22:04:22-rados-wip-yuri7-testing-2022-02-17-0852-pacific-distro-default-smithi/6704759

#5 Updated by Laura Flores almost 2 years ago

Happens during the mds upgrade sequence.

Description: rados/cephadm/mds_upgrade_sequence/{bluestore-bitmap centos_8.stream_container_tools conf/{client mds mon osd} overrides/{pg-warn whitelist_health whitelist_wrongly_marked_down} roles tasks/{0-from/v16.2.4 1-volume/{0-create 1-ranks/2 2-allow_standby_replay/yes 3-inline/no 4-verify} 2-client 3-upgrade-with-workload 4-verify}}

/a/yuriw-2022-04-01_01:23:52-rados-wip-yuri2-testing-2022-03-31-1523-pacific-distro-default-smithi/6770807

#6 Updated by Laura Flores over 1 year ago

/a/yuriw-2022-07-19_23:25:12-rados-wip-yuri2-testing-2022-07-15-0755-pacific-distro-default-smithi/6938951

#7 Updated by Laura Flores over 1 year ago

/a/yuriw-2022-08-11_16:46:00-rados-wip-yuri3-testing-2022-08-11-0809-pacific-distro-default-smithi/6968152

#8 Updated by Laura Flores over 1 year ago

/a/yuriw-2022-08-19_20:57:42-rados-wip-yuri6-testing-2022-08-19-0940-pacific-distro-default-smithi/6981439

#9 Updated by Laura Flores over 1 year ago

/a/yuriw-2022-08-25_17:32:22-rados-tchaikov-wip-pacific-update-fio-5-distro-default-smithi/6992848

#10 Updated by Laura Flores over 1 year ago

/a/yuriw-2022-09-09_14:59:25-rados-wip-yuri2-testing-2022-09-06-1007-pacific-distro-default-smithi/7022563

#11 Updated by Laura Flores 10 months ago

/a/yuriw-2023-04-04_21:18:37-rados-wip-yuri3-testing-2023-04-04-0833-pacific-distro-default-smithi/7232001

#12 Updated by Laura Flores 10 months ago

  • Related to Bug #49287: podman: setting cgroup config for procHooks process caused: Unit libpod-$hash.scope not found added

#13 Updated by Laura Flores 10 months ago

/a/yuriw-2023-04-04_21:18:37-rados-wip-yuri3-testing-2023-04-04-0833-pacific-distro-default-smithi/7232213

#14 Updated by Sridhar Seshasayee 7 months ago

/a/yuriw-2023-07-26_15:54:22-rados-wip-yuri6-testing-2023-07-24-0819-pacific-distro-default-smithi/7353411

#15 Updated by Laura Flores 6 months ago

/a/yuriw-2023-08-16_22:44:42-rados-wip-yuri7-testing-2023-08-16-1309-pacific-distro-default-smithi/7371630

#16 Updated by Laura Flores 6 months ago

/a/yuriw-2023-08-21_23:10:07-rados-pacific-release-distro-default-smithi/7375451

#17 Updated by Laura Flores 6 months ago

  • Tags set to test-failure

#18 Updated by Laura Flores about 1 month ago

/a/yuriw-2024-01-21_16:34:39-rados-wip-yuri4-testing-2024-01-18-1257-pacific-distro-default-smithi/7525019