Bug #51818

"ceph orch host add" presents unhelpful error message if target host is missing cephadm

Added by Lars Kellogg-Stedman 2 months ago. Updated about 1 month ago.

Status: Fix Under Review
Priority: Normal
Category: cephadm
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When using "ceph orch host add", if the remote host is missing cephadm
and its dependencies then "ceph orch host add" reports:

[root@ceph1 ~]# ceph orch host add ceph2
Error EINVAL: Host ceph2 (ceph2) failed check(s): []

That's a tremendously unhelpful error message for someone unfamiliar
with Ceph. I wonder if cephadm could be modified to report something
more along the lines of:

Error: Host ceph2 (ceph2) is missing cephadm

...or whatever particular requirement would allow cephadm to report a
more actionable error.

History

#1 Updated by Lars Kellogg-Stedman 2 months ago

Argh, sorry for the typos there, s/cephadm/ceph orch/ in several places.

#2 Updated by Sebastian Wagner 2 months ago

In any case, this is broken: the list ([]) should contain the error message. Somehow it got lost, leading to an unhelpful error.
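
To illustrate the point about the empty list, here is a minimal sketch of how a caller could avoid printing a bare "[]" when no check errors were collected. The function name format_host_check_error and the fallback wording are hypothetical, not taken from the cephadm source; only the "Host ... failed check(s): ..." message shape comes from the output quoted in this report.

```python
from typing import List


def format_host_check_error(host: str, addr: str, errors: List[str]) -> str:
    """Hypothetical helper: build the message for a failed host check."""
    prefix = f"Host {host} ({addr}) failed check(s): "
    if not errors:
        # Assumption: an empty list means the remote check died before it
        # could report anything, so say so instead of printing "[]".
        return prefix + (
            "no details were reported by the remote check; "
            "verify that the host meets cephadm's requirements"
        )
    return prefix + "; ".join(errors)


if __name__ == "__main__":
    print(format_host_check_error("ceph2", "ceph2", []))
    print(format_host_check_error("ceph2", "ceph2", ["podman|docker not found"]))
```

With a fallback like this, the user at least learns that the check itself failed, even if the underlying error was lost along the way.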

#3 Updated by Michael Fritch about 1 month ago

  • Assignee set to Michael Fritch

#4 Updated by Michael Fritch about 1 month ago

  • Status changed from New to In Progress

#5 Updated by Michael Fritch about 1 month ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 42859

#6 Updated by Michael Fritch about 1 month ago

Likely caused when neither podman nor docker is present.

From the mgr log:

2021-08-19T14:27:12.279-0600 7f9b837af640  0 log_channel(audit) log [DBG] : from='client.4407 -' entity='client.admin' cmd=[{"prefix": "orch host add", "hostname": "host4", "target": ["mon-mgr", ""]}]: dispatch
2021-08-19T14:27:13.219-0600 7f9b83fb0640 -1 mgr.server reply reply (22) Invalid argument Host host4 (host4) failed check(s): []

When running check-host with bin/cephadm directly:

host4:~ # python3 /var/lib/ceph/beeb434c-9e41-42c7-b5dd-e2f2263dc540/cephadm.8f9eaee26dc5e0738ecde832d164c82d37874d66c170c3d5da51aea17d365601 check-host
Traceback (most recent call last):
  File "/var/lib/ceph/beeb434c-9e41-42c7-b5dd-e2f2263dc540/cephadm.8f9eaee26dc5e0738ecde832d164c82d37874d66c170c3d5da51aea17d365601", line 8459, in <module>
    main()
  File "/var/lib/ceph/beeb434c-9e41-42c7-b5dd-e2f2263dc540/cephadm.8f9eaee26dc5e0738ecde832d164c82d37874d66c170c3d5da51aea17d365601", line 8447, in main
    r = ctx.func(ctx)
  File "/var/lib/ceph/beeb434c-9e41-42c7-b5dd-e2f2263dc540/cephadm.8f9eaee26dc5e0738ecde832d164c82d37874d66c170c3d5da51aea17d365601", line 5795, in command_check_host
    container_path = ctx.container_engine.path
AttributeError: 'NoneType' object has no attribute 'path'

host4:~ # which podman
which: no podman in (/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin)
host4:~ # which docker
which: no docker in (/sbin:/usr/sbin:/usr/local/sbin:/root/bin:/usr/local/bin:/usr/bin:/bin)
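
The traceback above shows check-host dereferencing a container engine object that is None. A minimal sketch of one way to guard against that, assuming the check can simply collect a readable error when no engine is found. This is not the code from the pull request: find_container_engine, check_host, and the error wording are hypothetical names chosen for illustration.

```python
import shutil
from typing import List, Optional

# Engines named in this report; the lookup order is an assumption.
CONTAINER_ENGINES = ("podman", "docker")


def find_container_engine() -> Optional[str]:
    """Return the path of the first container engine on PATH, or None."""
    for name in CONTAINER_ENGINES:
        path = shutil.which(name)
        if path:
            return path
    return None


def check_host(engine_path: Optional[str]) -> List[str]:
    """Hypothetical check: return a list of human-readable errors."""
    errors: List[str] = []
    if engine_path is None:
        # Report the missing engine instead of raising AttributeError
        # on a None object, so the caller has something to show the user.
        errors.append(
            "no container engine found: install one of "
            + ", ".join(CONTAINER_ENGINES)
        )
    return errors


if __name__ == "__main__":
    for err in check_host(find_container_engine()):
        print("ERROR:", err)
```

Returning the problem as an entry in the error list, rather than letting the exception escape, would give "ceph orch host add" a message to include in place of the empty list.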
