Bug #58918
Addition of hosts to the cluster throws an exception
Status: closed
Description
This issue has been observed of late with recent Crimson builds pulled from Shaman; it was not a problem previously.
While adding external hosts to the cluster, the following error was observed:
[ceph: root@ceph-hakumar-7gpxph-node1-installer /]# ceph orch host add ceph-hakumar-7gpxph-node1-installer 10.0.210.112 _admin crash alertmanager mon mgr prometheus grafana installer node-exporter
Added host 'ceph-hakumar-7gpxph-node1-installer' with addr '10.0.210.112'
[ceph: root@ceph-hakumar-7gpxph-node1-installer /]#
[ceph: root@ceph-hakumar-7gpxph-node1-installer /]# ceph orch host ls
HOST                                 ADDR          LABELS                                                                         STATUS
ceph-hakumar-7gpxph-node1-installer  10.0.210.112  _admin,crash,alertmanager,mon,mgr,prometheus,grafana,installer,node-exporter
1 hosts in cluster
[ceph: root@ceph-hakumar-7gpxph-node1-installer /]# ceph orch host add ceph-hakumar-7gpxph-node2 10.0.208.106 mon mgr mds rgw node-exporter crash alertmanager
Error EINVAL: Traceback (most recent call last):
  File "/usr/share/ceph/mgr/mgr_module.py", line 1761, in _handle_command
    return self.handle_command(inbuf, cmd)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 171, in handle_command
    return dispatch[cmd['prefix']].call(self, cmd, inbuf)
  File "/usr/share/ceph/mgr/mgr_module.py", line 462, in call
    return self.func(mgr, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 107, in <lambda>
    wrapper_copy = lambda *l_args, **l_kwargs: wrapper(*l_args, **l_kwargs)  # noqa: E731
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 96, in wrapper
    return func(*args, **kwargs)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 453, in _add_host
    return self._apply_misc([s], False, Format.plain)
  File "/usr/share/ceph/mgr/orchestrator/module.py", line 1234, in _apply_misc
    raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator/_interface.py", line 225, in raise_if_exception
    e = pickle.loads(c.serialized_exception)
TypeError: __init__() missing 2 required positional arguments: 'hostname' and 'addr'
[ceph: root@ceph-hakumar-7gpxph-node1-installer /]# ceph -v
ceph version 18.0.0-2518-g0d8957f8 (0d8957f82a63da19616de782607dc4e04c312abe) reef (dev)
Interestingly, the issue is not observed when the current host (installer/admin) is added with labels; it only occurs when other nodes are being added to the cluster. The error complains that the 'hostname' and 'addr' arguments are missing, even though both were provided in the ceph orch host add command.
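The TypeError in the traceback is raised inside pickle.loads(c.serialized_exception), not while parsing the command line, which may explain why the message looks unrelated to the arguments that were typed. The sketch below reproduces the same kind of message in plain Python. It is only an illustration of one possible cause: the class names ExampleHostError and PicklableHostError and the helper round_trip are hypothetical and are not taken from the Ceph source.

```python
import pickle


class ExampleHostError(Exception):
    """Hypothetical exception class; NOT taken from the Ceph source tree."""

    def __init__(self, message: str, hostname: str, addr: str) -> None:
        # Only `message` is forwarded, so self.args == (message,).
        # On unpickling, Python calls ExampleHostError(*self.args),
        # i.e. with `hostname` and `addr` missing.
        super().__init__(message)
        self.hostname = hostname
        self.addr = addr


class PicklableHostError(Exception):
    """Hypothetical variant that forwards every argument to Exception."""

    def __init__(self, message: str, hostname: str, addr: str) -> None:
        # All three values land in self.args, so unpickling can
        # call the constructor with the full argument list.
        super().__init__(message, hostname, addr)
        self.hostname = hostname
        self.addr = addr


def round_trip(exc: Exception) -> str:
    """Pickle and unpickle `exc`; describe what comes back or what is raised."""
    data = pickle.dumps(exc)
    try:
        restored = pickle.loads(data)
    except TypeError as err:
        return f"TypeError: {err}"
    return type(restored).__name__


broken = round_trip(ExampleHostError("cannot connect", "node2", "10.0.208.106"))
fixed = round_trip(PicklableHostError("cannot connect", "node2", "10.0.208.106"))
print(broken)
print(fixed)
```

If something similar is happening here, the TypeError would be masking an underlying error raised while adding the second host, so the original exception text would be the thing to recover when debugging.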
Crimson builds/images on which the above issue was found:
https://shaman.ceph.com/builds/ceph/main/0d8957f82a63da19616de782607dc4e04c312abe/crimson/334069/ | cephadm version - 2:18.0.0-2518.g0d8957f8.el8
https://shaman.ceph.com/builds/ceph/main/4478c0941fb5e4873d296da2908bc92e27623b32/crimson/334479/ | cephadm version - 2:18.0.0-2686.g4478c094.el8