Exception adding host using cephadm
After bootstrapping the first host with cephadm, attempting to add a second host fails with an exception (a local variable referenced before assignment).
Ceph version: 15.2.2 (node 1 installed with cephadm)
OS Version: Ubuntu 18.04
Docker Version: 19.03.6-0ubuntu1~18.04.1
Step(s) to reproduce:
ceph0 $ ceph orch host add ceph1
# observe with `ceph -W cephadm`
2020-06-19T13:44:42.435058+1200 mgr.ceph0.ufqhzn [ERR] _Promise failed
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 457, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 525, in <lambda>
    return cls(on_complete=lambda x: f(*x), value=args, name=name, **c_kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1685, in add_host
    if code:
orchestrator._interface.OrchestratorError: New host ceph1 (ceph1) failed check: ['Traceback (most recent call last):', ' File "<stdin>", line 4580, in <module>', ' File "<stdin>", line 3592, in command_check_host', "UnboundLocalError: local variable 'container_path' referenced before assignment"]
It looks like the variable 'container_path' needs to be declared 'global' in command_check_host (see patch). Making this change in the mgr container (and restarting it) results in a successful host add.
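For readers unfamiliar with this class of error, here is a minimal sketch of the Python scoping rule behind it. It is not the cephadm source: the function names, the `candidates` and `lookup` parameters, and the module-level default are hypothetical, chosen only to reproduce the same UnboundLocalError and to show the effect of a `global` declaration.

```python
import shutil

# Hypothetical module-level default, standing in for a script-wide setting.
container_path = None


def check_host_buggy(candidates, lookup=shutil.which):
    # No `global` statement: because the name is assigned somewhere in this
    # function, Python treats it as a local for the whole function body.
    for name in candidates:
        found = lookup(name)
        if found:
            container_path = found
            break
    # If the loop never assigned, reading the local raises UnboundLocalError.
    if not container_path:
        return "no container engine found"
    return container_path


def check_host_fixed(candidates, lookup=shutil.which):
    # With `global`, reads and writes refer to the module-level name,
    # so the read below is always defined.
    global container_path
    for name in candidates:
        found = lookup(name)
        if found:
            container_path = found
            break
    if not container_path:
        return "no container engine found"
    return container_path
```

In this sketch the buggy variant only fails when none of the candidates is found, which is consistent with the error showing up on hosts that have no container engine installed.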
#9 Updated by Stephan Müller about 1 year ago
- Status changed from In Progress to Need More Info
I was not yet able to reproduce it. (Tried a lot of things.)
I added new hosts to bootstrapped clusters, removed and re-added the same host under different names (hostname / IP / FQDN), and tried other variations such as changing the hostname.
@Mark Kirkwood and @Dan Mick, could you provide more information on how to reproduce it and what your setup looks like?
#11 Updated by Mark Kirkwood about 1 year ago
@Stephan Müller, I'd suggest starting with some freshly built VMs (mine were Ubuntu 18.04). Optionally set up the Ceph repos on all of them to get Octopus (I did this). Also, I didn't have these VMs in DNS (so I just set up /etc/hosts on each of them). Then:
- download cephadm on the host that is to be the mon
- bootstrap it as a mon
- install ceph-common
- copy ssh-id to next host
- add it (hopefully triggering the bug)
Offhand, I can't recall whether I installed docker on the bootstrap host before bootstrapping.
#14 Updated by Victor Moreno about 1 year ago
I have hit this as well, installing cephadm on Debian 10 (buster) with an apt upgrade done.
I have the playbook that configures the Ceph cluster here, in case it helps:
#15 Updated by Tobias Fischer about 1 year ago
Same here. I'm trying to add a fresh Debian buster VM with all updates installed (no additional packages like docker present):
email@example.com.:~# ceph orch host add rgw1
Error ENOENT: New host rgw1 (rgw1) failed check: ['Traceback (most recent call last):', ' File "<stdin>", line 4762, in <module>', ' File "<stdin>", line 3738, in command_check_host', "UnboundLocalError: local variable 'container_path' referenced before assignment"]
However, the fix proposed by Mark Kirkwood was already in place. After installing Docker on rgw1, the error was gone.
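Since installing Docker on the new host made the error go away, a quick pre-check run on the host to be added can confirm whether a container engine is on the PATH before running `ceph orch host add`. The following is only a sketch under assumptions: the function name, the `preference` order (podman, then docker), and the `lookup` parameter are hypothetical and are not taken from cephadm.

```python
import shutil


def find_container_engine(preference=("podman", "docker"), lookup=shutil.which):
    """Return (name, path) of the first engine found on PATH, or None."""
    for name in preference:
        path = lookup(name)  # shutil.which returns None when not found
        if path:
            return name, path
    return None


if __name__ == "__main__":
    engine = find_container_engine()
    if engine is None:
        print("No container engine found; install docker or podman first.")
    else:
        print("Found %s at %s" % engine)
```

Run on the target host (rgw1 in the report above), this prints the engine it found, or a hint to install one first.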