
Bug #44356

ceph-volume inventory: KeyError: 'ceph.cluster_name'

Added by Sebastian Wagner 4 months ago. Updated 4 months ago.

Status:
New
Priority:
High
Assignee:
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

 Module 'cephadm' has failed: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
INFO:cephadm:/usr/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
INFO:cephadm:/usr/bin/podman:stderr -->  KeyError: 'ceph.cluster_name'
Traceback (most recent call last):
  File "<stdin>", line 3394, in <module>
  File "<stdin>", line 688, in _infer_fsid
  File "<stdin>", line 2202, in command_ceph_volume
  File "<stdin>", line 513, in call_throws
RuntimeError: Failed command: /usr/bin/podman run --rm --net=host --privileged --group-add=disk -e CONTAINER_IMAGE=registry.suse.de/suse/sle-15-sp2/update/products/ses7/update/cr/containers/ses/7/ceph/ceph:latest -e NODE_NAME=hses-node1 -v /var/run/ceph/002c389e-54fd-11ea-a99f-52540044d765:/var/run/ceph:z -v /var/log/ceph/002c389e-54fd-11ea-a99f-52540044d765:/var/log/ceph:z -v /var/lib/ceph/002c389e-54fd-11ea-a99f-52540044d765/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm --entrypoint /usr/sbin/ceph-volume registry.suse.de/suse/sle-15-sp2/update/products/ses7/update/cr/containers/ses/7/ceph/ceph:latest inventory --format=json
hses-node1:~ # ceph -s
  cluster:
    id:     002c389e-54fd-11ea-a99f-52540044d765
    health: HEALTH_ERR
            1 filesystem is offline
            1 filesystem is online with fewer MDS than max_mds
            Module 'cephadm' has failed: cephadm exited with an error code: 1, stderr:INFO:cephadm:/usr/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
INFO:cephadm:/usr/bin/podman:stderr WARNING: The same type, major and minor should not be used for multiple devices.
INFO:cephadm:/usr/bin/podman:stderr -->  KeyError: 'ceph.cluster_name'
Traceback (most recent call last):
  File "<stdin>", line 3394, in <module>
  File "<stdin>", line 688, in _infer_fsid
  File "<stdin>", line 2202, in command_ceph_volume
  File "<stdin>", line 513, in call_throws
RuntimeError: Failed command: /usr/bin/podman run --rm --net=host --privileged --group-add=disk -e CONTAINER_IMAGE=registry.suse.de/suse/sle-15-sp2/update/products/ses7/update/cr/containers/ses/7/ceph/ceph:latest -e NODE_NAME=hses-node1 -v /var/run/ceph/002c389e-54fd-11ea-a99f-52540044d765:/var/run/ceph:z -v /var/log/ceph/002c389e-54fd-11ea-a99f-52540044d765:/var/log/ceph:z -v /var/lib/ceph/002c389e-54fd-11ea-a99f-52540044d765/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm --entrypoint /usr/sbin/ceph-volume registry.suse.de/suse/sle-15-sp2/update/products/ses7/update/cr/containers/ses/7/ceph/ceph:latest inventory --format=json

  services:
    mon: 4 daemons, quorum hses-node1,hses-node2,hses-node3,hses-node4 (age 11m)
    mgr: hses-node1.jxzdin(active, since 13m), standbys: hses-node1.aogwfz, hses-node2.dlyvwy, hses-node3.delhzp, hses-node4.vgmgec
    mds: myfs:0
    osd: 8 osds: 8 up (since 6m), 8 in (since 6m)
    rgw: 4 daemons active (default.default.hses-node1.sirofe, default.default.hses-node2.kvzapg, default.default.hses-node3.dhobfn, default.default.hses-node4.cuulnm)

  task status:

  data:
    pools:   6 pools, 168 pgs
    objects: 189 objects, 5.8 KiB
    usage:   8.2 GiB used, 152 GiB / 160 GiB avail
    pgs:     168 active+clean

Related issues

Duplicated by Orchestrator - Bug #45604: mgr/cephadm: Failed to create an OSD (status: Duplicate)

History

#1 Updated by Sage Weil 4 months ago

  • Project changed from Orchestrator to ceph-volume
  • Subject changed from cephadm: ceph-volume inventory: KeyError: 'ceph.cluster_name' to ceph-volume inventory: KeyError: 'ceph.cluster_name'

#2 Updated by Jan Fajerski 4 months ago

Can we get a stack trace from ceph-volume for this? Setting env CEPH_VOLUME_DEBUG=true would do the trick.
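For context on why that flag matters: ceph-volume normally suppresses the Python traceback and prints only the short "-->  KeyError: 'ceph.cluster_name'" line seen in the report, emitting the full trace only when debugging is enabled. A minimal sketch of that pattern, assuming a wrapper of roughly this shape (the decorator name and details here are illustrative, not the actual ceph-volume code):

import functools
import os
import sys
import traceback

def catches(func):
    """Print only a short error line unless CEPH_VOLUME_DEBUG is set."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        try:
            return func(*args, **kwargs)
        except Exception as err:
            if os.environ.get('CEPH_VOLUME_DEBUG'):
                # Debugging enabled: show the full stack trace first.
                traceback.print_exc()
            # Otherwise only the short summary line is printed, e.g.
            # "-->  KeyError: 'ceph.cluster_name'"
            print('-->  %s: %s' % (type(err).__name__, err), file=sys.stderr)
            sys.exit(1)
    return wrapper

Under that assumption, re-running the same ceph-volume inventory call with CEPH_VOLUME_DEBUG=true set in the container environment should print the full stack trace instead of just the "-->" summary line.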

#3 Updated by Jan Fajerski 4 months ago

  • Assignee set to Jan Fajerski

#4 Updated by Jan Fajerski 4 months ago

Just from looking at the code path, I'd say there was an LV that had a ceph.osd_id tag set but no ceph.cluster_name tag.

Is this reproducible, and if so, how? I can tighten some tests to avoid the situation described above, but that's just an educated guess. A sketch of the suspected failure and a defensive fix follows below.
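To illustrate the guess above: if the inventory code loads an LV's LVM tags into a dict and indexes 'ceph.cluster_name' directly, an LV carrying only a ceph.osd_id tag would produce exactly the reported KeyError. A minimal sketch with hypothetical data and names (not the actual ceph-volume code), including the defensive lookup that would avoid the crash:

# Hypothetical tag dict as it might be parsed from `lvs -o lv_tags`
# for an LV that was only partially tagged.
lv_tags = {
    'ceph.osd_id': '3',
    # no 'ceph.cluster_name' tag on this LV
}

# Direct indexing reproduces the reported failure:
#   cluster_name = lv_tags['ceph.cluster_name']
#   -> KeyError: 'ceph.cluster_name'

# A defensive lookup lets inventory continue and simply report the
# device as not fully claimed by a Ceph cluster:
cluster_name = lv_tags.get('ceph.cluster_name')
if cluster_name is None:
    print('LV has ceph.osd_id=%s but no ceph.cluster_name tag; skipping'
          % lv_tags.get('ceph.osd_id'))
else:
    print('LV belongs to cluster %s' % cluster_name)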

#5 Updated by Sebastian Wagner about 2 months ago

  • Duplicated by Bug #45604: mgr/cephadm: Failed to create an OSD added
