Bug #45672

Unable to add additional hosts to cluster using cephadm

Added by Dan Skaggs almost 4 years ago. Updated over 3 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After nodes 2 and 3 were configured to accept SSH connections from the node 1 root user using Ceph's SSH configuration and key, the command `ceph orch host add node2` is unable to connect.

Environment:
Ceph version: 15.2.2 (node 1 installed with cephadm)
OS Version: Ubuntu 18.04
Docker Version: Docker CE 19.03.9, build 9d988398e7

STR:
1. Verify that `ping node2` is successful
2. ceph cephadm get-ssh-config > ceph_config
3. ceph config-key get mgr/cephadm/ssh_identity_key > ceph_key
4. ssh -F ./ceph_config -i ./ceph_key root@node2
5. Observe that the manual SSH connection is successful
6. Run `ceph orch host add node2` and observe the following error in the output of `ceph -W cephadm`
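
For anyone retrying the manual check in steps 2 to 5, it can be sketched as a small shell function. This is only a sketch under assumptions: `check_cephadm_ssh` is a hypothetical helper name (not a Ceph command), it assumes the `ceph` and `ssh` clients are on the PATH of the node running it, and the `chmod 600` is added because ssh rejects private keys that other users can read.

```shell
#!/usr/bin/env bash
# Sketch only: check_cephadm_ssh is a hypothetical helper, not part of Ceph.
# It repeats the manual SSH check using the config and key stored by cephadm.
check_cephadm_ssh() {
    local host="$1"
    local tmpdir rc
    tmpdir="$(mktemp -d)" || return 1

    # Export the SSH client config and the private key kept by the cephadm module.
    ceph cephadm get-ssh-config > "$tmpdir/ceph_config" \
        || { rm -rf "$tmpdir"; return 1; }
    ceph config-key get mgr/cephadm/ssh_identity_key > "$tmpdir/ceph_key" \
        || { rm -rf "$tmpdir"; return 1; }

    # ssh refuses to use a private key file that is readable by other users.
    chmod 600 "$tmpdir/ceph_key"

    # Run a trivial remote command; its exit status tells us whether SSH works.
    ssh -F "$tmpdir/ceph_config" -i "$tmpdir/ceph_key" "root@$host" true
    rc=$?

    # Remove the temporary copies of the config and key.
    rm -rf "$tmpdir"
    return "$rc"
}
```

Calling `check_cephadm_ssh node2` returns 0 when the connection succeeds and ssh's exit status otherwise. Note that this runs from the shell on node 1, while the mgr daemon deployed by cephadm runs in a container, so a passing check here does not prove the mgr can reach the host.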

Error:
2020-05-23T01:23:46.735873+0000 mgr.node1.pfnxpe [ERR] _Promise failed
Traceback (most recent call last):
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
    s = io.read(1)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
    raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1569, in _run_cephadm
    conn, connr = self._get_connection(addr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1529, in _get_connection
    ssh_options=self._ssh_options)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
    gw = gateway_bootstrap.bootstrap(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
    bootstrap_exec(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
    raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-lq9eq8la -i /tmp/cephadm-identity-k5yb36z7 root@node2

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 457, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 525, in <lambda>
    return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1682, in add_host
    error_ok=True, no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1657, in _run_cephadm
    raise OrchestratorError(msg) from e
orchestrator._interface.OrchestratorError: Failed to connect to node2 (node2).  Check that the host is reachable and accepts connections using the cephadm SSH key
you may want to run:
> ssh -F =(ceph cephadm get-ssh-config) -i =(ceph config-key get mgr/cephadm/ssh_identity_key) root@node2
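
The chained traceback above is long, so a filter that keeps only the exception lines can make it easier to read. The following is a sketch, assuming the log has been saved to a file or is piped in; `summarize_traceback` is a hypothetical helper name, and the pattern simply keeps unindented lines of the form `Name: message`.

```shell
# Sketch only: summarize_traceback is a hypothetical helper.
# It prints the unindented "ExceptionName: message" lines of a Python
# traceback and drops the "File ..." frames and source lines.
# Reads the files given as arguments, or stdin when none are given.
summarize_traceback() {
    grep -E '^[A-Za-z_][A-Za-z0-9_.]*: ' "$@"
}
```

Applied to the error above, this should leave the `EOFError`, `HostNotFound`, and `OrchestratorError` lines. The `HostNotFound` line shows the ssh arguments the mgr actually used, including the temporary config and identity paths it wrote. Separately, the `=(...)` form in the suggested command is zsh syntax that substitutes a temporary file; in other shells the output has to be written to files first.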

History

#1 Updated by Sebastian Wagner almost 4 years ago

  • Project changed from Ceph to Orchestrator

#2 Updated by Sebastian Wagner almost 4 years ago

execnet is again very helpful with their exceptions this time.

#3 Updated by Sebastian Wagner almost 4 years ago

might want to run

ceph mgr fail 

#4 Updated by Dan Skaggs almost 4 years ago

I wound up working around this by using an Ansible role, with which this worked successfully. Feel free to close this if no one else is reporting it.

#5 Updated by Joshua Schmid over 3 years ago

  • Status changed from New to Can't reproduce

thanks, closing
