Bug #65229

open

Failed to reconnect to smithiXXX

Added by Laura Flores about 1 month ago. Updated 13 days ago.

Status: In Progress
Priority: Normal
Assignee:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

/a/teuthology-2024-03-22_02:08:13-upgrade-squid-distro-default-smithi/7616027

2024-03-22T18:58:20.321 INFO:teuthology.kill:Killing Pids: {2611346}
2024-03-22T18:58:21.291 INFO:teuthology.task.internal:roles: ubuntu@smithi044.front.sepia.ceph.com - ['host.a', 'client.0', 'osd.0', 'osd.1', 'osd.2']
2024-03-22T18:58:21.292 INFO:teuthology.task.internal:roles: ubuntu@smithi189.front.sepia.ceph.com - ['host.b', 'client.1', 'osd.3', 'osd.4', 'osd.5']
2024-03-22T18:58:21.292 INFO:teuthology.misc:Compressing logs...
2024-03-22T18:58:21.292 INFO:teuthology.orchestra.remote:Trying to reconnect to host 'ubuntu@smithi044.front.sepia.ceph.com'
2024-03-22T18:58:21.293 DEBUG:teuthology.orchestra.connection:{'hostname': 'smithi044.front.sepia.ceph.com', 'username': 'ubuntu', 'timeout': 60}
2024-03-22T18:58:23.422 DEBUG:teuthology.orchestra.remote:[Errno None] Unable to connect to port 22 on 172.21.15.44
2024-03-22T18:58:32.430 INFO:teuthology.orchestra.remote:Trying to reconnect to host 'ubuntu@smithi044.front.sepia.ceph.com'
2024-03-22T18:58:32.431 DEBUG:teuthology.orchestra.connection:{'hostname': 'smithi044.front.sepia.ceph.com', 'username': 'ubuntu', 'timeout': 60}
2024-03-22T18:58:32.639 DEBUG:teuthology.orchestra.remote:[Errno None] Unable to connect to port 22 on 172.21.15.44
2024-03-22T18:58:44.647 INFO:teuthology.orchestra.remote:Trying to reconnect to host 'ubuntu@smithi044.front.sepia.ceph.com'
2024-03-22T18:58:44.648 DEBUG:teuthology.orchestra.connection:{'hostname': 'smithi044.front.sepia.ceph.com', 'username': 'ubuntu', 'timeout': 60}
2024-03-22T18:58:44.927 DEBUG:teuthology.orchestra.remote:[Errno None] Unable to connect to port 22 on 172.21.15.44
2024-03-22T18:58:47.929 INFO:teuthology.orchestra.remote:Trying to reconnect to host 'ubuntu@smithi044.front.sepia.ceph.com'
2024-03-22T18:58:47.930 DEBUG:teuthology.orchestra.connection:{'hostname': 'smithi044.front.sepia.ceph.com', 'username': 'ubuntu', 'timeout': 60}
2024-03-22T18:58:47.998 DEBUG:teuthology.orchestra.remote:[Errno None] Unable to connect to port 22 on 172.21.15.44
2024-03-22T18:58:47.998 WARNING:teuthology.contextutil:'reconnect to {self.shortname}' reached maximum tries (5) after waiting for 30 seconds
2024-03-22T18:58:47.999 ERROR:teuthology.dispatcher.supervisor:Could not save logs
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/dispatcher/supervisor.py", line 319, in run_with_watchdog
    transfer_archives(job_info['name'], job_info['job_id'],
  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/dispatcher/supervisor.py", line 384, in transfer_archives
    compress_logs(ctx, log_path)
  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/misc.py", line 1371, in compress_logs
    ctx.cluster.run(
  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/cluster.py", line 85, in run
    procs = [remote.run(**kwargs, wait=_wait) for remote in remotes]
  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/cluster.py", line 85, in <listcomp>
    procs = [remote.run(**kwargs, wait=_wait) for remote in remotes]
  File "/home/teuthworker/src/git.ceph.com_teuthology_e691533f9cbb33d85b2187bba20d7102f098636d/teuthology/orchestra/remote.py", line 522, in run
    raise ConnectionError(f'Failed to reconnect to {self.shortname}')
ConnectionError: Failed to reconnect to smithi044
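
For anyone triaging this, the log shows a bounded retry loop: each attempt fails to reach port 22, and once the attempts run out a ConnectionError is raised. The sketch below illustrates that pattern only; it is not teuthology's implementation. The names `port_is_open` and `reconnect_with_retries` are hypothetical, and the default of 5 tries simply mirrors the number in the WARNING line. The last few lines also show why a message can contain a literal `{self.shortname}`, as the WARNING line does: a plain string leaves the braces untouched, while an f-string interpolates them. Whether that is the actual cause in teuthology is an assumption drawn from the log text, not confirmed against the source.

```python
import socket
import time


def port_is_open(host, port=22, timeout=5.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False


def reconnect_with_retries(shortname, probe, tries=5, sleep=1.0):
    """Call probe() up to `tries` times and return the attempt that succeeded.

    Hypothetical helper for illustration; raises ConnectionError when every
    attempt fails, similar in spirit to the error in the traceback above.
    """
    for attempt in range(1, tries + 1):
        if probe():
            return attempt
        if attempt < tries:
            time.sleep(sleep)  # wait before the next attempt
    raise ConnectionError(f"Failed to reconnect to {shortname}")


# A plain string keeps the braces; an f-string substitutes the value.
shortname = "smithi044"
plain_message = "reconnect to {shortname}"
formatted_message = f"reconnect to {shortname}"
```

A caller could pass `lambda: port_is_open("smithi044.front.sepia.ceph.com")` as `probe` to check whether the node's SSH port is reachable before attempting log collection.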

Actions #1

Updated by adam kraitman 19 days ago

  • Status changed from New to In Progress
  • Assignee set to adam kraitman

Hey @Laura Flores, please ping me if you see this failure again.

Actions #2

Updated by Aishwarya Mathuria 13 days ago

@adam kraitman I am seeing this failure again here: /a/yuriw-2024-04-09_14:35:50-rados-wip-yuri5-testing-2024-03-21-0833-distro-default-smithi/7648756/
