Bug #44832

cephadm: `ceph cephadm generate-key` fails with No such file or directory: '/tmp/...

Added by Sebastian Wagner 8 months ago. Updated 6 months ago.

Status: Resolved
Priority: High
Category: cephadm
Target version:
% Done: 0%
Source:
Tags: low-hanging-fruit
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

root@ceph02:~# ceph cephadm generate-key
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1413, in _generate_key
    with open(path, 'r') as f:
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4ejhr7wh/key'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1418, in _generate_key
    os.unlink(path)
FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp4ejhr7wh/key'
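
The traceback above suggests a pattern in which a cleanup `os.unlink(path)` runs while an earlier error is already being handled, so a key file that was never created produces a second `FileNotFoundError`. A minimal Python sketch of a cleanup that tolerates the missing file follows. The helper name `read_and_remove` is hypothetical; this is not the code in `module.py` and not necessarily the fix that was merged.

```python
import contextlib
import os
import tempfile


def read_and_remove(path):
    """Read a file's contents, then delete it; tolerate a missing file on cleanup."""
    try:
        with open(path, 'r') as f:
            return f.read()
    finally:
        # suppress() keeps a missing file from raising a second
        # FileNotFoundError on top of the one raised by open().
        with contextlib.suppress(FileNotFoundError):
            os.unlink(path)


with tempfile.TemporaryDirectory() as tmp_dir:
    missing = os.path.join(tmp_dir, 'key')  # never created, as in the report
    try:
        read_and_remove(missing)
    except FileNotFoundError as e:
        # Only the error from open() surfaces; nothing is chained from unlink().
        print('chained exception present:', e.__context__ is not None)
```

With this shape, the caller sees only the failure from `open()`, which points more directly at the real problem: the key file was never written.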

History

#1 Updated by Mario Ohnewald 8 months ago

Here is the Debug Log:

root@ceph01:~# ceph log last cephadm
2020-03-30T19:14:53.101702+0000 mgr.ceph02 (mgr.8374227) 77 : cephadm [INF] Generating ssh key...
2020-03-30T19:27:11.678834+0000 mgr.ceph02 (mgr.8374227) 467 : cephadm [ERR] _Promise failed
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
    s = io.read(1)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
    raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1544, in _run_cephadm
    conn, connr = self._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1507, in _get_connection
    ssh_options=self._ssh_options)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
    gw = gateway_bootstrap.bootstrap(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
    bootstrap_exec(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
    raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-kbqvkrkw root@10.10.1.1

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
    return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1645, in add_host
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1620, in _run_cephadm
    raise OrchestratorError('Failed to connect to %s (%s).  Check that the host is reachable and accepts connections using the cephadm SSH key' % (host, addr)) from e
orchestrator._interface.OrchestratorError: Failed to connect to 10.10.1.1 (10.10.1.1).  Check that the host is reachable and accepts connections using the cephadm SSH key
2020-03-30T19:27:18.019938+0000 mgr.ceph02 (mgr.8374227) 473 : cephadm [ERR] _Promise failed
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
    s = io.read(1)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
    raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1544, in _run_cephadm
    conn, connr = self._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1507, in _get_connection
    ssh_options=self._ssh_options)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
    gw = gateway_bootstrap.bootstrap(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
    bootstrap_exec(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
    raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-kbqvkrkw root@10.10.1.2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
    return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1645, in add_host
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1620, in _run_cephadm
    raise OrchestratorError('Failed to connect to %s (%s).  Check that the host is reachable and accepts connections using the cephadm SSH key' % (host, addr)) from e
orchestrator._interface.OrchestratorError: Failed to connect to 10.10.1.2 (10.10.1.2).  Check that the host is reachable and accepts connections using the cephadm SSH key
2020-03-30T19:30:41.641353+0000 mgr.ceph02 (mgr.8374227) 577 : cephadm [INF] Generating ssh key...
2020-03-30T19:31:19.584348+0000 mgr.ceph02 (mgr.8374227) 604 : cephadm [ERR] _Promise failed
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
    s = io.read(1)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
    raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1544, in _run_cephadm
    conn, connr = self._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1507, in _get_connection
    ssh_options=self._ssh_options)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
    gw = gateway_bootstrap.bootstrap(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
    bootstrap_exec(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
    raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-kbqvkrkw root@10.10.1.2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
    return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1645, in add_host
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1620, in _run_cephadm
    raise OrchestratorError('Failed to connect to %s (%s).  Check that the host is reachable and accepts connections using the cephadm SSH key' % (host, addr)) from e
orchestrator._interface.OrchestratorError: Failed to connect to 10.10.1.2 (10.10.1.2).  Check that the host is reachable and accepts connections using the cephadm SSH key
2020-03-30T19:31:34.624952+0000 mgr.ceph02 (mgr.8374227) 614 : cephadm [ERR] _Promise failed
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
    s = io.read(1)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
    raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1544, in _run_cephadm
    conn, connr = self._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1507, in _get_connection
    ssh_options=self._ssh_options)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
    gw = gateway_bootstrap.bootstrap(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
    bootstrap_exec(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
    raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-kbqvkrkw root@10.10.1.2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
    return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1645, in add_host
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1620, in _run_cephadm
    raise OrchestratorError('Failed to connect to %s (%s).  Check that the host is reachable and accepts connections using the cephadm SSH key' % (host, addr)) from e
orchestrator._interface.OrchestratorError: Failed to connect to 10.10.1.2 (10.10.1.2).  Check that the host is reachable and accepts connections using the cephadm SSH key
2020-03-30T19:36:11.743024+0000 mgr.ceph02 (mgr.8374227) 760 : cephadm [ERR] _Promise failed
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
    s = io.read(1)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
    raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1544, in _run_cephadm
    conn, connr = self._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1507, in _get_connection
    ssh_options=self._ssh_options)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
    gw = gateway_bootstrap.bootstrap(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
    bootstrap_exec(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
    raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-kbqvkrkw root@10.10.1.2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 444, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 512, in <lambda>
    return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1645, in add_host
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1620, in _run_cephadm
    raise OrchestratorError('Failed to connect to %s (%s).  Check that the host is reachable and accepts connections using the cephadm SSH key' % (host, addr)) from e
orchestrator._interface.OrchestratorError: Failed to connect to 10.10.1.2 (10.10.1.2).  Check that the host is reachable and accepts connections using the cephadm SSH key

#2 Updated by Sebastian Wagner 7 months ago

  • Status changed from New to Fix Under Review
  • Assignee set to Sebastian Wagner
  • Pull request ID set to 34691

#3 Updated by Sebastian Wagner 7 months ago

  • Status changed from Fix Under Review to Pending Backport

#4 Updated by Sebastian Wagner 7 months ago

  • Status changed from Pending Backport to Resolved
  • Target version set to v15.2.2

#5 Updated by Vladimir Pakhomov 6 months ago

It seems the target version does not include the fix for this issue:
...
INFO:cephadm:Generating ssh key...
INFO:cephadm:Non-zero exit code 22 from /usr/bin/docker run --rm --net=host -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15 -e NODE_NAME=ceph1 -v /var/log/ceph/90a85836-9db3-11ea-b945-000c29301e0d:/var/log/ceph:z -v /tmp/ceph-tmpbjezrapy:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmpd8o_o_9i:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph docker.io/ceph/ceph:v15 cephadm generate-key
INFO:cephadm:/usr/bin/ceph:stderr Error EINVAL: Traceback (most recent call last):
INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/cephadm/module.py", line 1433, in _generate_key
INFO:cephadm:/usr/bin/ceph:stderr '-f', path
INFO:cephadm:/usr/bin/ceph:stderr File "/lib64/python3.6/subprocess.py", line 311, in check_call
INFO:cephadm:/usr/bin/ceph:stderr raise CalledProcessError(retcode, cmd)
INFO:cephadm:/usr/bin/ceph:stderr subprocess.CalledProcessError: Command '['/usr/bin/ssh-keygen', '-C', 'ceph-90a85836-9db3-11ea-b945-000c29301e0d', '-N', '', '-f', '/tmp/tmp80sns6qz/key']' returned non-zero exit status 255.
INFO:cephadm:/usr/bin/ceph:stderr
INFO:cephadm:/usr/bin/ceph:stderr During handling of the above exception, another exception occurred:
INFO:cephadm:/usr/bin/ceph:stderr
INFO:cephadm:/usr/bin/ceph:stderr Traceback (most recent call last):
INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/mgr_module.py", line 1153, in _handle_command
INFO:cephadm:/usr/bin/ceph:stderr return self.handle_command(inbuf, cmd)
INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 110, in handle_command
INFO:cephadm:/usr/bin/ceph:stderr return dispatch[cmd['prefix']].call(self, cmd, inbuf)
INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/mgr_module.py", line 308, in call
INFO:cephadm:/usr/bin/ceph:stderr return self.func(mgr, **kwargs)
INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 72, in <lambda>
INFO:cephadm:/usr/bin/ceph:stderr wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)
INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 63, in wrapper
INFO:cephadm:/usr/bin/ceph:stderr return func(*args, **kwargs)
INFO:cephadm:/usr/bin/ceph:stderr File "/usr/share/ceph/mgr/cephadm/module.py", line 1440, in _generate_key
INFO:cephadm:/usr/bin/ceph:stderr os.unlink(path)
INFO:cephadm:/usr/bin/ceph:stderr FileNotFoundError: [Errno 2] No such file or directory: '/tmp/tmp80sns6qz/key'
...
[root@ceph1 ~]# yum info cephadm
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
 * base: mirror.reconn.ru
 * epel: mirror.yandex.ru
 * extras: mirror.docker.ru
 * updates: centos-mirror.rbc.ru
Installed Packages
Name : cephadm
Arch : x86_64
Epoch : 2
Version : 15.2.2
Release : 0.el7
Size : 165 k
Repo : installed
From repo : Ceph
Summary : Utility to bootstrap Ceph clusters
URL : http://ceph.com/
License : LGPL-2.1 and LGPL-3.0 and CC-BY-SA-3.0 and GPL-2.0 and BSL-1.0 and BSD-3-Clause
: and MIT
Description : Utility to bootstrap a Ceph cluster and manage Ceph daemons deployed
: with systemd and podman.
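
In the output above, `ssh-keygen` exits with status 255, but its stderr is not shown, so the underlying cause stays hidden. A minimal Python sketch of running a command so that its stderr ends up in the raised error follows. The helper `run_checked` is hypothetical and not part of cephadm, and a child Python process stands in for the failing `ssh-keygen` so the example runs anywhere.

```python
import subprocess
import sys


def run_checked(cmd):
    """Run cmd; on failure raise an error that includes the command's stderr."""
    # stdout/stderr=PIPE with universal_newlines=True also works on Python 3.6.
    proc = subprocess.run(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                          universal_newlines=True)
    if proc.returncode != 0:
        raise RuntimeError('%r exited with status %d: %s'
                           % (cmd, proc.returncode, proc.stderr.strip()))
    return proc.stdout


# Stand-in for a failing ssh-keygen: writes to stderr and exits with 255.
failing = [sys.executable, '-c',
           "import sys; sys.stderr.write('simulated failure'); sys.exit(255)"]
try:
    run_checked(failing)
except RuntimeError as e:
    print(e)
```

Compared with `subprocess.check_call`, which reports only the exit status, this keeps the command's own error message alongside it.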

#6 Updated by Nathan Cutler 6 months ago

@Vladimir: with Octopus, it is no longer sufficient to examine RPMs to find the version you are using. Please post the output of "ceph versions" as well.
