Bug #7596 (closed)
task/ceph_manager.py: Exceptions are being swallowed
Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
I reran this job with the attached YAML:
2014-03-03 16:12:51,416.416 INFO:teuthology.orchestra.run.err:[10.214.131.27]: 0
2014-03-03 16:12:51,417.417 INFO:teuthology.orchestra.run.err:[10.214.131.27]: admin_socket: invalid command
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 362, in do_thrash
    self.revive_osd()
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 96, in revive_osd
    self.ceph_manager.revive_osd(osd, self.revive_timeout)
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 1216, in revive_osd
    timeout=timeout)
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 618, in wait_run_admin_socket
    raise Exception('timed out waiting for admin_socket to appear after osd.{o} restart'.format(o=osdnum))
Exception: timed out waiting for admin_socket to appear after osd.5 restart
<Greenlet at 0x2b26b78: <bound method Thrasher.do_thrash of <teuthology.task.ceph_manager.Thrasher instance at 0x2b2dd88>>> failed with Exception
^C
At this point I waited a few minutes and hit Ctrl-C:
2014-03-03 16:16:00,337.337 INFO:teuthology.task.rados:joining rados
2014-03-03 16:16:00,337.337 ERROR:teuthology.run_tasks:Manager failed: rados
Traceback (most recent call last):
  File "/home/ubuntu/zack/teuthology/teuthology/run_tasks.py", line 84, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/ubuntu/zack/teuthology/teuthology/task/rados.py", line 175, in task
    running.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandFailedError: Command failed on 10.214.131.31 with status 1: 'CEPH_CLIENT_ID=0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_rados --op read 45 --op write 45 --op delete 10 --op snap_create 0 --op snap_remove 0 --op rollback 0 --op setattr 0 --op rmattr 0 --op watch 0 --op append 0 --max-ops 4000 --objects 500 --max-in-flight 16 --size 4000000 --min-stride-size 400000 --max-stride-size 800000 --max-seconds 0 --pool unique_pool_0'
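The first traceback appears only on the console; it is not in the
teuthology.log generated by the job. As far as the log is concerned, the
exception raised in the Thrasher greenlet is swallowed, so something is
badly wrong with the way gevent is being used.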
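One plausible explanation, stated as an assumption and not a confirmed diagnosis: gevent reports an unhandled greenlet exception on stderr instead of going through the logging module, so the traceback reaches the console but never reaches teuthology.log. A minimal sketch of one way to get such a failure into the log is to wrap the function that runs in the greenlet so it logs the traceback itself and then re-raises. The names log_exceptions, do_thrash_demo and demo.thrasher below are hypothetical and not part of teuthology; the sketch uses only the standard library so it runs without gevent.

```python
import functools
import io
import logging


def log_exceptions(logger):
    """Return a decorator that logs any exception with its traceback, then re-raises."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            try:
                return func(*args, **kwargs)
            except Exception:
                # Logger.exception logs at ERROR level and appends the traceback.
                logger.exception("unhandled exception in %s", func.__name__)
                raise
        return wrapper
    return decorator


# Hypothetical logger that writes to an in-memory buffer instead of teuthology.log.
demo_log = logging.getLogger("demo.thrasher")
demo_log.propagate = False
log_buffer = io.StringIO()
demo_log.addHandler(logging.StreamHandler(log_buffer))


@log_exceptions(demo_log)
def do_thrash_demo():
    # Hypothetical stand-in for a thrasher loop that fails partway through.
    raise Exception("timed out waiting for admin_socket to appear")


try:
    do_thrash_demo()
except Exception:
    pass  # the caller still sees the failure; it is only logged on the way out

captured = log_buffer.getvalue()
```

Under that assumption, the same wrapper could be applied to whatever callable is handed to the greenlet, so the traceback lands in the job log regardless of whether anything later calls get() on the greenlet.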