Bug #16296

closed

"AssertionError: failed to recover before timeout expired" in upgrade:hammer-x-jewel-distro-basic-vps

Added by Yuri Weinstein almost 8 years ago. Updated over 7 years ago.

Status:
Can't reproduce
Priority:
Urgent
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
upgrade/hammer-x
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2016-06-13_18:15:01-upgrade:hammer-x-jewel-distro-basic-vps/
Job: 257211
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-06-13_18:15:01-upgrade:hammer-x-jewel-distro-basic-vps/257211/teuthology.log

2016-06-13T23:09:43.333 INFO:tasks.thrashosds.thrasher:choose_action: min_in 4 min_out 0 min_live 2 min_dead 0
2016-06-13T23:09:43.334 INFO:tasks.thrashosds.thrasher:inject_pause on 5
2016-06-13T23:09:43.334 INFO:tasks.thrashosds.thrasher:Testing filestore_inject_stall pause injection for duration 3
2016-06-13T23:09:43.334 INFO:tasks.thrashosds.thrasher:Checking after 0, should_be_down=False
2016-06-13T23:09:43.334 INFO:teuthology.orchestra.run.vpm037:Running: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph --cluster ceph --admin-daemon /var/run/ceph/ceph-osd.5.asok config set filestore_inject_stall 3'
2016-06-13T23:23:11.027 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 646, in wrapper
    return func(self)
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 663, in do_sighup
    self.ceph_manager.signal_osd(osd, signal.SIGHUP, silent=True)
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 1851, in signal_osd
    self.cluster).signal(sig, silent=silent)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 111, in signal
    self.proc.stdin.write(struct.pack('!b', sig))
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/file.py", line 381, in write
    self._write_all(data)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/file.py", line 498, in _write_all
    count = self._write(data)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 1237, in _write
    self.channel.sendall(data)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 761, in sendall
    sent = self.send(s)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 715, in send
    return self._send(s, m)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/channel.py", line 1085, in _send
    self.transport._send_user_message(m)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/transport.py", line 1586, in _send_user_message
    self._send_message(data)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/transport.py", line 1566, in _send_message
    self.packetizer.send_message(data)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/packet.py", line 364, in send_message
    self.write_all(out)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/packet.py", line 314, in write_all
    raise EOFError()
EOFError

2016-06-13T23:23:11.030 CRITICAL:root:  File "gevent/corecext.pyx", line 360, in gevent.corecext.loop.handle_error (gevent/gevent.corecext.c:6344)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/hub.py", line 563, in handle_error
    self.print_exception(context, type, value, tb)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/hub.py", line 591, in print_exception
    traceback.print_exception(type, value, tb)
  File "/usr/lib/python2.7/traceback.py", line 124, in print_exception
    _print(file, 'Traceback (most recent call last):')
  File "/usr/lib/python2.7/traceback.py", line 13, in _print
    file.write(str+terminator)

2016-06-13T23:23:11.030 CRITICAL:root:IOError
2016-06-13T23:23:11.033 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 646, in wrapper
    return func(self)
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 700, in do_thrash
    self.choose_action()()
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 624, in <lambda>
    False),
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 497, in inject_pause
    self.ceph_manager.set_config(the_one, **{conf_key: duration})
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 1111, in set_config
    ['config', 'set', str(k), str(v)])
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 1077, in wait_run_admin_socket
    args, check_status=False)
  File "/var/lib/teuthworker/src/ceph-qa-suite_jewel/tasks/ceph_manager.py", line 996, in admin_socket
    check_status=check_status
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/remote.py", line 196, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 355, in run
    r.execute()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 87, in execute
    self.client.exec_command(self.command)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/client.py", line 414, in exec_command
    chan = self._transport.open_session(timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/transport.py", line 703, in open_session
    timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/paramiko/transport.py", line 824, in open_channel
    raise e
EOFError

2016-06-13T23:23:11.034 CRITICAL:root:  File "gevent/corecext.pyx", line 360, in gevent.corecext.loop.handle_error (gevent/gevent.corecext.c:6344)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/hub.py", line 563, in handle_error
    self.print_exception(context, type, value, tb)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/hub.py", line 591, in print_exception
    traceback.print_exception(type, value, tb)
  File "/usr/lib/python2.7/traceback.py", line 124, in print_exception
    _print(file, 'Traceback (most recent call last):')
  File "/usr/lib/python2.7/traceback.py", line 13, in _print
    file.write(str+terminator)

2016-06-13T23:23:11.035 CRITICAL:root:IOError

See also these earlier runs with the same "CRITICAL:root:IOError" message:

yuriw@teuthology ~ [07:46:16]> grep -iE "CRITICAL:root:IOError" /a/teuthology-2016-06*-upgrade:hammer-x-jewel-distro-basic-vps/*/*.log
/a/teuthology-2016-06-05_18:15:01-upgrade:hammer-x-jewel-distro-basic-vps/239416/teuthology.log:2016-06-05T21:28:14.968 CRITICAL:root:IOError
/a/teuthology-2016-06-05_18:15:01-upgrade:hammer-x-jewel-distro-basic-vps/239416/teuthology.log:2016-06-05T21:28:15.025 CRITICAL:root:IOError
/a/teuthology-2016-06-12_18:15:02-upgrade:hammer-x-jewel-distro-basic-vps/255160/teuthology.log:2016-06-12T21:26:46.358 CRITICAL:root:IOError
/a/teuthology-2016-06-13_18:15:01-upgrade:hammer-x-jewel-distro-basic-vps/257211/teuthology.log:2016-06-13T23:23:11.030 CRITICAL:root:IOError
/a/teuthology-2016-06-13_18:15:01-upgrade:hammer-x-jewel-distro-basic-vps/257211/teuthology.log:2016-06-13T23:23:11.035 CRITICAL:root:IOError
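The grep output above can also be tallied per log file. The following is only a minimal sketch, assuming the input is grep's "path:logline" output as shown; the function name count_ioerrors is hypothetical and not part of teuthology. Because the archive paths themselves contain colons, the sketch splits each line on ".log:" instead of on the first colon.

```python
import re
from collections import Counter

# Matches the log part of a line such as
# "2016-06-13T23:23:11.030 CRITICAL:root:IOError"
IOERROR_RE = re.compile(r"^\S+ CRITICAL:root:IOError\s*$", re.IGNORECASE)


def count_ioerrors(grep_lines):
    """Count CRITICAL:root:IOError hits per log file (hypothetical helper).

    Each input line is expected to look like grep output: "<path>.log:<logline>".
    Lines that do not have that shape are skipped.
    """
    counts = Counter()
    for line in grep_lines:
        # Archive paths contain ':' (e.g. "18:15:01-upgrade:hammer-x"),
        # so split on the ".log:" boundary rather than the first colon.
        path, sep, logline = line.partition(".log:")
        if not sep:
            continue
        if IOERROR_RE.match(logline):
            counts[path + ".log"] += 1
    return counts
```

Fed the five lines above, this would report two hits each for jobs 239416 and 257211 and one hit for job 255160.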
Actions #1

Updated by Yuri Weinstein almost 8 years ago

Possibly the same failure in:
http://qa-proxy.ceph.com/teuthology/teuthology-2016-06-14_18:05:02-upgrade:hammer-x-infernalis-distro-basic-vps/259901/teuthology.log

2    381268    209231532    209612800    [0,1,3,4,5]    []
1    426944    209185856    209612800    [0,2,3,4,5]    []
0    213016    209399784    209612800    [1,2,3,4,5]    []
 sum    2214980    1255461820    1257676800

2016-06-14T19:18:56.457 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/ceph_manager.py", line 650, in wrapper
    return func(self)
  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/ceph_manager.py", line 697, in do_thrash
    timeout=self.config.get('timeout')
  File "/var/lib/teuthworker/src/ceph-qa-suite_infernalis/tasks/ceph_manager.py", line 1626, in wait_for_recovery
    'failed to recover before timeout expired'
AssertionError: failed to recover before timeout expired

2016-06-14T19:18:56.460 CRITICAL:root:  File "gevent/corecext.pyx", line 360, in gevent.corecext.loop.handle_error (gevent/gevent.corecext.c:6344)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/hub.py", line 563, in handle_error
    self.print_exception(context, type, value, tb)
  File "/home/teuthworker/src/teuthology_master/virtualenv/local/lib/python2.7/site-packages/gevent/hub.py", line 591, in print_exception
    traceback.print_exception(type, value, tb)
  File "/usr/lib/python2.7/traceback.py", line 124, in print_exception
    _print(file, 'Traceback (most recent call last):')
  File "/usr/lib/python2.7/traceback.py", line 13, in _print
    file.write(str+terminator)

2016-06-14T19:18:56.460 CRITICAL:root:IOError
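
For readers unfamiliar with the assertion in this traceback: it is raised from wait_for_recovery in ceph_manager.py when the cluster does not finish recovering within the configured timeout. The following is only a sketch of that general poll-until-timeout pattern, not the ceph-qa-suite code. The is_recovered callable, the interval parameter, and the injectable clock and sleep arguments are assumptions made for illustration.

```python
import time


def wait_for_recovery(is_recovered, timeout=None, interval=1.0,
                      clock=time.monotonic, sleep=time.sleep):
    """Poll is_recovered() until it returns True (illustrative sketch).

    is_recovered, interval, clock and sleep are hypothetical parameters;
    the real ceph_manager.py method inspects cluster state itself.
    With timeout=None the loop waits indefinitely.
    """
    start = clock()
    while not is_recovered():
        if timeout is not None:
            # Same message as the AssertionError reported in this bug.
            assert clock() - start < timeout, \
                'failed to recover before timeout expired'
        sleep(interval)
```

In a pattern like this, the assertion only says that recovery did not complete in time; it does not say why, so the cause has to be found in the OSD and monitor logs of the failed job.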
Actions #3

Updated by Yuri Weinstein almost 8 years ago

  • Subject changed from "CRITICAL:root:IOError" in upgrade:hammer-x-jewel-distro-basic-vps to "AssertionError: failed to recover before timeout expired" in upgrade:hammer-x-jewel-distro-basic-vps
Actions #5

Updated by Samuel Just over 7 years ago

  • Status changed from New to Can't reproduce