Finally had a chance to test this, and the errors reported by the underlying remoto lib are less than helpful:
Nov 11 22:26:05 node1 conmon[811925]: debug 2020-11-11T21:26:05.796+0000 7fef60ace700 0 log_channel(audit) log [DBG] : from='client.14272 -' entity='client.admin' cmd=[{"prefix": "orch host add", "hostname": "node1", "target": ["mon-mgr", ""]}]: dispatch
Nov 11 22:26:05 node1 conmon[811925]: [31B blob data]
Nov 11 22:26:05 node1 conmon[811925]: debug 2020-11-11T21:26:05.872+0000 7fef6bf74700 0 [cephadm ERROR orchestrator._interface] _Promise failed
Nov 11 22:26:05 node1 conmon[811925]: Traceback (most recent call last):
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
Nov 11 22:26:05 node1 conmon[811925]: s = io.read(1)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
Nov 11 22:26:05 node1 conmon[811925]: raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
Nov 11 22:26:05 node1 conmon[811925]: EOFError: expected 1 bytes, got 0
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: During handling of the above exception, another exception occurred:
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: Traceback (most recent call last):
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 998, in _remote_connection
Nov 11 22:26:05 node1 conmon[811925]: conn, connr = self._get_connection(addr)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 961, in _get_connection
Nov 11 22:26:05 node1 conmon[811925]: sudo=True if self.ssh_user != 'root' else False)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
Nov 11 22:26:05 node1 conmon[811925]: self.gateway = self._make_gateway(hostname)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
Nov 11 22:26:05 node1 conmon[811925]: self._make_connection_string(hostname)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
Nov 11 22:26:05 node1 conmon[811925]: gw = gateway_bootstrap.bootstrap(io, spec)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
Nov 11 22:26:05 node1 conmon[811925]: bootstrap_exec(io, spec)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
Nov 11 22:26:05 node1 conmon[811925]: raise HostNotFound(io.remoteaddress)
Nov 11 22:26:05 node1 conmon[811925]: execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-lqkx72b8 -i /tmp/cephadm-identity-rx03t2fk root@node1
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: The above exception was the direct cause of the following exception:
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: Traceback (most recent call last):
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 295, in _finalize
Nov 11 22:26:05 node1 conmon[811925]: next_result = self._on_complete(self._value)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 108, in <lambda>
Nov 11 22:26:05 node1 conmon[811925]: return CephadmCompletion(on_complete=lambda _: f(*args, **kwargs))
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1190, in add_host
Nov 11 22:26:05 node1 conmon[811925]: return self._add_host(spec)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1176, in _add_host
Nov 11 22:26:05 node1 conmon[811925]: error_ok=True, no_fsid=True)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1079, in _run_cephadm
Nov 11 22:26:05 node1 conmon[811925]: with self._remote_connection(host, addr) as tpl:
Nov 11 22:26:05 node1 conmon[811925]: File "/lib64/python3.6/contextlib.py", line 81, in __enter__
Nov 11 22:26:05 node1 conmon[811925]: return next(self.gen)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1025, in _remote_connection
Nov 11 22:26:05 node1 conmon[811925]: raise OrchestratorError(msg) from e
Nov 11 22:26:05 node1 conmon[811925]: orchestrator._interface.OrchestratorError: Failed to connect to node1 (node1).
Nov 11 22:26:05 node1 conmon[811925]: Please make sure that the host is reachable and accepts connections using the cephadm SSH key
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: To add the cephadm SSH key to the host:
Nov 11 22:26:05 node1 conmon[811925]: > ceph cephadm get-pub-key > ~/ceph.pub
Nov 11 22:26:05 node1 conmon[811925]: > ssh-copy-id -f -i ~/ceph.pub root@node1
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: To check that the host is reachable:
Nov 11 22:26:05 node1 conmon[811925]: > ceph cephadm get-ssh-config > ssh_config
Nov 11 22:26:05 node1 conmon[811925]: > ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
Nov 11 22:26:05 node1 conmon[811925]: > ssh -F ssh_config -i ~/cephadm_private_key root@node1
Nov 11 22:26:05 node1 conmon[811925]: debug 2020-11-11T21:26:05.872+0000 7fef6bf74700 -1 log_channel(cephadm) log [ERR] : _Promise failed
Nov 11 22:26:05 node1 conmon[811925]: Traceback (most recent call last):
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
Nov 11 22:26:05 node1 conmon[811925]: s = io.read(1)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
Nov 11 22:26:05 node1 conmon[811925]: raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
Nov 11 22:26:05 node1 conmon[811925]: EOFError: expected 1 bytes, got 0
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: During handling of the above exception, another exception occurred:
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: Traceback (most recent call last):
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 998, in _remote_connection
Nov 11 22:26:05 node1 conmon[811925]: conn, connr = self._get_connection(addr)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 961, in _get_connection
Nov 11 22:26:05 node1 conmon[811925]: sudo=True if self.ssh_user != 'root' else False)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
Nov 11 22:26:05 node1 conmon[811925]: self.gateway = self._make_gateway(hostname)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
Nov 11 22:26:05 node1 conmon[811925]: self._make_connection_string(hostname)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
Nov 11 22:26:05 node1 conmon[811925]: gw = gateway_bootstrap.bootstrap(io, spec)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
Nov 11 22:26:05 node1 conmon[811925]: bootstrap_exec(io, spec)
Nov 11 22:26:05 node1 conmon[811925]: File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
Nov 11 22:26:05 node1 conmon[811925]: raise HostNotFound(io.remoteaddress)
Nov 11 22:26:05 node1 conmon[811925]: execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-lqkx72b8 -i /tmp/cephadm-identity-rx03t2fk root@node1
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: The above exception was the direct cause of the following exception:
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: Traceback (most recent call last):
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 295, in _finalize
Nov 11 22:26:05 node1 conmon[811925]: next_result = self._on_complete(self._value)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 108, in <lambda>
Nov 11 22:26:05 node1 conmon[811925]: return CephadmCompletion(on_complete=lambda _: f(*args, **kwargs))
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1190, in add_host
Nov 11 22:26:05 node1 conmon[811925]: return self._add_host(spec)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1176, in _add_host
Nov 11 22:26:05 node1 conmon[811925]: error_ok=True, no_fsid=True)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1079, in _run_cephadm
Nov 11 22:26:05 node1 conmon[811925]: with self._remote_connection(host, addr) as tpl:
Nov 11 22:26:05 node1 conmon[811925]: File "/lib64/python3.6/contextlib.py", line 81, in __enter__
Nov 11 22:26:05 node1 conmon[811925]: return next(self.gen)
Nov 11 22:26:05 node1 conmon[811925]: File "/usr/share/ceph/mgr/cephadm/module.py", line 1025, in _remote_connection
Nov 11 22:26:05 node1 conmon[811925]: raise OrchestratorError(msg) from e
Nov 11 22:26:05 node1 conmon[811925]: orchestrator._interface.OrchestratorError: Failed to connect to node1 (node1).
Nov 11 22:26:05 node1 conmon[811925]: Please make sure that the host is reachable and accepts connections using the cephadm SSH key
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: To add the cephadm SSH key to the host:
Nov 11 22:26:05 node1 conmon[811925]: > ceph cephadm get-pub-key > ~/ceph.pub
Nov 11 22:26:05 node1 conmon[811925]: > ssh-copy-id -f -i ~/ceph.pub root@node1
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: To check that the host is reachable:
Nov 11 22:26:05 node1 conmon[811925]: > ceph cephadm get-ssh-config > ssh_config
Nov 11 22:26:05 node1 conmon[811925]: > ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
Nov 11 22:26:05 node1 conmon[811925]: > ssh -F ssh_config -i ~/cephadm_private_key root@node1
Nov 11 22:26:05 node1 conmon[811925]: debug 2020-11-11T21:26:05.876+0000 7fef6bf74700 -1 mgr.server reply reply (22) Invalid argument Failed to connect to node1 (node1).
Nov 11 22:26:05 node1 conmon[811925]: Please make sure that the host is reachable and accepts connections using the cephadm SSH key
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: To add the cephadm SSH key to the host:
Nov 11 22:26:05 node1 conmon[811925]: > ceph cephadm get-pub-key > ~/ceph.pub
Nov 11 22:26:05 node1 conmon[811925]: > ssh-copy-id -f -i ~/ceph.pub root@node1
Nov 11 22:26:05 node1 conmon[811925]:
Nov 11 22:26:05 node1 conmon[811925]: To check that the host is reachable:
Nov 11 22:26:05 node1 conmon[811925]: > ceph cephadm get-ssh-config > ssh_config
Nov 11 22:26:05 node1 conmon[811925]: > ceph config-key get mgr/cephadm/ssh_identity_key > ~/cephadm_private_key
Nov 11 22:26:05 node1 conmon[811925]: > ssh -F ssh_config -i ~/cephadm_private_key root@node1
Nov 11 22:26:07 node1 conmon[811925]: debug
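
For anyone reading the traceback above: the chain seems to be that the bootstrap read of one byte hits EOF (the ssh child exited without sending anything), and that EOFError is then replaced by HostNotFound carrying only the ssh arguments, so whatever ssh printed on stderr never reaches the log. Below is a minimal sketch of that pattern and of how keeping stderr would make the message more useful. It is an illustration only, not the execnet or remoto code: the `HostNotFound` class is a local stand-in, and `bootstrap()` plus the fake failing command are hypothetical names I made up for the example.

```python
import subprocess
import sys


class HostNotFound(Exception):
    """Local stand-in for execnet's HostNotFound (illustration only)."""


def bootstrap(cmd, remoteaddress):
    """Hypothetical helper: start cmd and wait for one handshake byte.

    If the child closes stdout without sending anything, raise
    HostNotFound, but include the child's stderr in the message
    instead of only the address.
    """
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    try:
        first = proc.stdout.read(1)
        if len(first) != 1:
            raise EOFError("expected 1 bytes, got %d" % len(first))
    except EOFError:
        # The child is gone or has closed stdout: collect what it said.
        _, err = proc.communicate()
        detail = err.decode(errors="replace").strip()
        # Raising here chains the EOFError as __context__, which is what
        # produces "During handling of the above exception, ..." in the log.
        raise HostNotFound("%s (stderr: %s)" % (remoteaddress, detail))
    # Handshake byte arrived; this sketch just cleans up the child.
    proc.communicate()
    return first


if __name__ == "__main__":
    # Fake "ssh" that fails the way a rejected key would (made-up message).
    failing = [
        sys.executable,
        "-c",
        "import sys; sys.stderr.write('Permission denied (publickey).'); sys.exit(255)",
    ]
    try:
        bootstrap(failing, "root@node1")
    except HostNotFound as exc:
        print("HostNotFound:", exc)
```

With something along these lines the OrchestratorError could carry the actual ssh complaint rather than just the generic "make sure the host is reachable" text. Whether execnet exposes the child's stderr at that point is something I have not checked.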