Bug #44018

closed

cephadm: down host kills serve() thread

Added by Sage Weil about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2020-02-06T14:57:42.284+0000 7f02bb142700 -1 log_channel(cluster) log [ERR] : Unhandled exception from module 'cephadm' while running on mgr.dzsdhz: -F /tmp/cephadm-conf-ue7uclq1 -i /tmp/cephadm-identity-gfnk4nop root@eutow
2020-02-06T14:57:42.284+0000 7f02bb142700 -1 cephadm.serve:
2020-02-06T14:57:42.284+0000 7f02bb142700 -1 Traceback (most recent call last):
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 48, in bootstrap_exec
    s = io.read(1)
  File "/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
    raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
EOFError: expected 1 bytes, got 0

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 676, in serve
    self._check_for_strays()
  File "/usr/share/ceph/mgr/cephadm/module.py", line 612, in _check_for_strays
    orchestrator.raise_if_exception(completion)
  File "/usr/share/ceph/mgr/orchestrator.py", line 655, in raise_if_exception
    raise e
  File "/lib64/python3.6/multiprocessing/pool.py", line 119, in worker
    result = (True, func(*args, **kwds))
  File "/lib64/python3.6/multiprocessing/pool.py", line 44, in mapstar
    return list(map(*args))
  File "/usr/share/ceph/mgr/cephadm/module.py", line 136, in do_work
    res = self._on_complete_(*args, **kwargs)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 203, in call_self
    return f(self, *inner_args)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1093, in _refresh_host_services
    host, 'mon', 'ls', [], no_fsid=True)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 949, in _run_cephadm
    conn, connr = self._get_connection(host)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 919, in _get_connection
    ssh_options=self._ssh_options)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 34, in __init__
    self.gateway = self._make_gateway(hostname)
  File "/lib/python3.6/site-packages/remoto/backends/__init__.py", line 44, in _make_gateway
    self._make_connection_string(hostname)
  File "/lib/python3.6/site-packages/execnet/multi.py", line 134, in makegateway
    gw = gateway_bootstrap.bootstrap(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 102, in bootstrap
    bootstrap_exec(io, spec)
  File "/lib/python3.6/site-packages/execnet/gateway_bootstrap.py", line 53, in bootstrap_exec
    raise HostNotFound(io.remoteaddress)
execnet.gateway_bootstrap.HostNotFound: -F /tmp/cephadm-conf-ue7uclq1 -i /tmp/cephadm-identity-gfnk4nop root@eutow
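
The traceback above shows a connection failure for a single host (HostNotFound for root@eutow) propagating out of _refresh_host_services, through _check_for_strays, and up into serve(), where it is unhandled. One way to keep such a loop alive is to catch the failure per host and record the host as unreachable. The following is only a minimal sketch of that idea and is not the change made in the linked pull request, whose contents are not shown here. The names refresh_hosts, serve, and refresh_one are hypothetical, and HostNotFound is a local stand-in for execnet.gateway_bootstrap.HostNotFound so the example runs without execnet installed.

```python
import logging
from typing import Callable, Dict, Iterable, List

logger = logging.getLogger("serve_sketch")


class HostNotFound(Exception):
    """Local stand-in for execnet.gateway_bootstrap.HostNotFound (assumption)."""


def refresh_hosts(hosts: Iterable[str],
                  refresh_one: Callable[[str], None]) -> Dict[str, str]:
    """Refresh each host; return a map of unreachable hosts to error text.

    A failure on one host is logged and recorded instead of being raised,
    so the remaining hosts are still refreshed.
    """
    offline: Dict[str, str] = {}
    for host in hosts:
        try:
            refresh_one(host)
        except HostNotFound as e:
            logger.warning("host %s is unreachable: %s", host, e)
            offline[host] = str(e)
    return offline


def serve(hosts: List[str],
          refresh_one: Callable[[str], None],
          iterations: int) -> List[Dict[str, str]]:
    """Hypothetical serve loop: run a fixed number of refresh passes.

    Returns the unreachable-host map from each pass. A real loop would
    run until shutdown and sleep between passes.
    """
    history: List[Dict[str, str]] = []
    for _ in range(iterations):
        history.append(refresh_hosts(hosts, refresh_one))
    return history
```

With this shape, a down host shows up in the returned map on every pass while the loop itself keeps running, and the caller can decide how to report it (for example, as a health warning).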
#1

Updated by Sage Weil about 4 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 33139
#2

Updated by Sage Weil about 4 years ago

  • Status changed from Fix Under Review to Resolved