Bug #46990

closed

execnet: EOFError: couldnt load message header, expected 9 bytes, got 0

Added by Sebastian Wagner over 3 years ago. Updated about 3 years ago.

Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

[ERR] MGR_MODULE_ERROR: Module 'cephadm' has failed: Failed to execute command: /usr/bin/python3 -u
    Module 'cephadm' has failed: Failed to execute command: /usr/bin/python3 -u
Aug 17 13:42:20 master bash[19272]: Traceback (most recent call last):
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 432, in from_io
Aug 17 13:42:20 master bash[19272]:     header = io.read(9)  # type 1, channel 4, payload 4
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
Aug 17 13:42:20 master bash[19272]:     raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
Aug 17 13:42:20 master bash[19272]: EOFError: expected 9 bytes, got 0
Aug 17 13:42:20 master bash[19272]: During handling of the above exception, another exception occurred:
Aug 17 13:42:20 master bash[19272]: Traceback (most recent call last):
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/remoto/process.py", line 188, in check
Aug 17 13:42:20 master bash[19272]:     response = result.receive(timeout)
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 749, in receive
Aug 17 13:42:20 master bash[19272]:     raise self._getremoteerror() or EOFError()
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 967, in _thread_receiver
Aug 17 13:42:20 master bash[19272]:     msg = Message.from_io(io)
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 437, in from_io
Aug 17 13:42:20 master bash[19272]:     raise EOFError("couldnt load message header, " + e.args[0])
Aug 17 13:42:20 master bash[19272]: EOFError: couldnt load message header, expected 9 bytes, got 0
Aug 17 13:42:20 master bash[19272]: During handling of the above exception, another exception occurred:
Aug 17 13:42:20 master bash[19272]: Traceback (most recent call last):
Aug 17 13:42:20 master bash[19272]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 1035, in _remote_connection
Aug 17 13:42:20 master bash[19272]:     yield (conn, connr)
Aug 17 13:42:20 master bash[19272]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 1131, in _run_cephadm
Aug 17 13:42:20 master bash[19272]:     stdin=script.encode('utf-8'))
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/remoto/process.py", line 209, in check
Aug 17 13:42:20 master bash[19272]:     'Failed to execute command: %s' % ' '.join(command)
Aug 17 13:42:20 master bash[19272]: RuntimeError: Failed to execute command: /usr/bin/python3 -u
Aug 17 13:42:20 master bash[19272]: debug 2020-08-17T11:42:20.900+0000 7f2bbda3c700 -1 log_channel(cephadm) log [ERR] : Failed to execute command: /usr/bin/python3 -u
Aug 17 13:42:20 master bash[19272]: Traceback (most recent call last):
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 432, in from_io
Aug 17 13:42:20 master bash[19272]:     header = io.read(9)  # type 1, channel 4, payload 4
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 402, in read
Aug 17 13:42:20 master bash[19272]:     raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
Aug 17 13:42:20 master bash[19272]: EOFError: expected 9 bytes, got 0
Aug 17 13:42:20 master bash[19272]: During handling of the above exception, another exception occurred:
Aug 17 13:42:20 master bash[19272]: Traceback (most recent call last):
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/remoto/process.py", line 188, in check
Aug 17 13:42:20 master bash[19272]:     response = result.receive(timeout)
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 749, in receive
Aug 17 13:42:20 master bash[19272]:     raise self._getremoteerror() or EOFError()
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 967, in _thread_receiver
Aug 17 13:42:20 master bash[19272]:     msg = Message.from_io(io)
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/execnet/gateway_base.py", line 437, in from_io
Aug 17 13:42:20 master bash[19272]:     raise EOFError("couldnt load message header, " + e.args[0])
Aug 17 13:42:20 master bash[19272]: EOFError: couldnt load message header, expected 9 bytes, got 0
Aug 17 13:42:20 master bash[19272]: During handling of the above exception, another exception occurred:
Aug 17 13:42:20 master bash[19272]: Traceback (most recent call last):
Aug 17 13:42:20 master bash[19272]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 1035, in _remote_connection
Aug 17 13:42:20 master bash[19272]:     yield (conn, connr)
Aug 17 13:42:20 master bash[19272]:   File "/usr/share/ceph/mgr/cephadm/module.py", line 1131, in _run_cephadm
Aug 17 13:42:20 master bash[19272]:     stdin=script.encode('utf-8'))
Aug 17 13:42:20 master bash[19272]:   File "/usr/lib/python3.6/site-packages/remoto/process.py", line 209, in check
Aug 17 13:42:20 master bash[19272]:     'Failed to execute command: %s' % ' '.join(command)
Aug 17 13:42:20 master bash[19272]: RuntimeError: Failed to execute command: /usr/bin/python3 -u
Aug 17 13:42:21 master bash[19272]: Warning: Permanently added 'master' (ECDSA) to the list of known hosts.
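
For readers unfamiliar with the error text: the traceback shows execnet trying to read a 9-byte message header (annotated in the source as "type 1, channel 4, payload 4") and getting 0 bytes, which typically means the remote side closed the pipe before sending anything. The following is a minimal sketch of that read path, not execnet's actual code; the helper names `read_exact` and `read_header` and the `"!bii"` struct layout are assumptions chosen to match the comment in the traceback.

```python
import io
import struct

# 1 byte type + 4 bytes channel + 4 bytes payload length, per the
# "type 1, channel 4, payload 4" comment in the traceback.
HEADER_SIZE = 9


def read_exact(stream, numbytes):
    """Read exactly numbytes from stream or raise EOFError (hypothetical helper)."""
    buf = b""
    while len(buf) < numbytes:
        chunk = stream.read(numbytes - len(buf))
        if not chunk:
            # The peer closed the stream before the full header arrived.
            raise EOFError("expected %d bytes, got %d" % (numbytes, len(buf)))
        buf += chunk
    return buf


def read_header(stream):
    """Parse a 9-byte header into (type, channel, payload length)."""
    try:
        header = read_exact(stream, HEADER_SIZE)
    except EOFError as e:
        # Same message shape as the one reported in this issue.
        raise EOFError("couldnt load message header, " + e.args[0])
    # Assumed layout: network byte order, signed char + two 32-bit ints.
    return struct.unpack("!bii", header)
```

Feeding this an empty stream, for example `read_header(io.BytesIO(b""))`, raises an `EOFError` with the same text as the issue title, which suggests the remote Python process exited (or never started) before writing a single byte.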

execnet is again super helpful.

Fortunately, we were able to recover from this, as we're calling _reset_con() in that case.
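
The recovery described above can be pictured roughly as follows. This is a hedged sketch only: `ConnectionCache`, `FlakyConnection`, `_get_con`, and the retry loop are hypothetical stand-ins, and the real `_reset_con()` in the cephadm module may take different arguments and may not retry at all. The idea shown is simply to drop the cached connection when a command fails, so that the next call opens a fresh one.

```python
class FlakyConnection:
    """Hypothetical stand-in for a remote connection; not a cephadm class."""

    def __init__(self, fail):
        self.fail = fail

    def run(self, command):
        if self.fail:
            raise RuntimeError("Failed to execute command: %s" % " ".join(command))
        return "ok"


class ConnectionCache:
    """Caches one connection per host and discards it after a failure."""

    def __init__(self):
        self._cons = {}
        self.opened = 0  # number of connections created so far

    def _get_con(self, host):
        if host not in self._cons:
            # Simulation only: the very first connection is broken.
            self._cons[host] = FlakyConnection(fail=(self.opened == 0))
            self.opened += 1
        return self._cons[host]

    def _reset_con(self, host):
        # Forget the cached connection so the next call reconnects.
        self._cons.pop(host, None)

    def run(self, host, command, retries=1):
        for attempt in range(retries + 1):
            con = self._get_con(host)
            try:
                return con.run(command)
            except RuntimeError:
                self._reset_con(host)
                if attempt == retries:
                    raise
```

In this simulation, `ConnectionCache().run("master", ["/usr/bin/python3", "-u"])` fails once on the broken connection, resets it, and succeeds on the second attempt with a new one.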


Related issues 3 (0 open, 3 closed)

Related to Orchestrator - Bug #38757: mgr/ssh orchestrator doesn't work (Can't reproduce, Noah Watkins)
Related to Orchestrator - Cleanup #44676: cephadm: Replace execnet (and remoto) (Resolved, Melissa Li)
Has duplicate Orchestrator - Bug #46764: cephadm (ceph orch apply) sometimes gets "stuck" and cannot deploy any OSDs (Can't reproduce)
