Bug #7596
task/ceph_manager.py: Exceptions are being swallowed
Status: Can't reproduce
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):
Description
I reran this job with the attached yaml:
2014-03-03 16:12:51,416.416 INFO:teuthology.orchestra.run.err:[10.214.131.27]: 0
2014-03-03 16:12:51,417.417 INFO:teuthology.orchestra.run.err:[10.214.131.27]: admin_socket: invalid command
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 390, in run
    result = self._run(*self.args, **self.kwargs)
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 362, in do_thrash
    self.revive_osd()
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 96, in revive_osd
    self.ceph_manager.revive_osd(osd, self.revive_timeout)
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 1216, in revive_osd
    timeout=timeout)
  File "/home/ubuntu/zack/teuthology/teuthology/task/ceph_manager.py", line 618, in wait_run_admin_socket
    raise Exception('timed out waiting for admin_socket to appear after osd.{o} restart'.format(o=osdnum))
Exception: timed out waiting for admin_socket to appear after osd.5 restart
<Greenlet at 0x2b26b78: <bound method Thrasher.do_thrash of <teuthology.task.ceph_manager.Thrasher instance at 0x2b2dd88>>> failed with Exception
^C
Here I waited a few minutes and hit Ctrl-C
2014-03-03 16:16:00,337.337 INFO:teuthology.task.rados:joining rados
2014-03-03 16:16:00,337.337 ERROR:teuthology.run_tasks:Manager failed: rados
Traceback (most recent call last):
  File "/home/ubuntu/zack/teuthology/teuthology/run_tasks.py", line 84, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/ubuntu/zack/teuthology/teuthology/task/rados.py", line 175, in task
    running.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandFailedError: Command failed on 10.214.131.31 with status 1: 'CEPH_CLIENT_ID=0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_rados --op read 45 --op write 45 --op delete 10 --op snap_create 0 --op snap_remove 0 --op rollback 0 --op setattr 0 --op rmattr 0 --op watch 0 --op append 0 --max-ops 4000 --objects 500 --max-in-flight 16 --size 4000000 --min-stride-size 400000 --max-stride-size 800000 --max-seconds 0 --pool unique_pool_0'
The first traceback is not visible in the teuthology.log generated
by the job. So something very bad is going on with the way gevent is
being used.
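One way this kind of failure can stay out of a log file is sketched below. This is not teuthology code and it does not use gevent. It is a standard-library analog using concurrent.futures, and the names make_logger, do_thrash, run_without_logging and run_with_logging are hypothetical. The assumption is that the background task's exception is stored on the task object and only reaches the log if something collects it or a callback logs it.

```python
import io
import logging
from concurrent.futures import ThreadPoolExecutor


def make_logger(stream):
    """Build a logger that writes to `stream` (stand-in for teuthology.log)."""
    logger = logging.getLogger("sketch.job")
    logger.handlers.clear()
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.addHandler(logging.StreamHandler(stream))
    return logger


def do_thrash():
    """Hypothetical background task that fails, like the thrasher above."""
    raise Exception("timed out waiting for admin_socket to appear")


def run_without_logging(logger):
    """The failure is stored on the future; nothing is written to the log."""
    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(do_thrash)
    return future


def run_with_logging(logger):
    """A done-callback writes the failure, with its traceback, to the log."""
    def log_failure(fut):
        exc = fut.exception()
        if exc is not None:
            logger.error("background task failed", exc_info=exc)

    with ThreadPoolExecutor(max_workers=1) as pool:
        future = pool.submit(do_thrash)
        future.add_done_callback(log_failure)
    return future


if __name__ == "__main__":
    log = io.StringIO()
    logger = make_logger(log)
    run_without_logging(logger)
    print("logged after silent run:", bool(log.getvalue()))
    run_with_logging(logger)
    print("logged after callback run:", bool(log.getvalue()))
```

In gevent the rough counterparts would be calling Greenlet.get() on the spawned greenlet or registering a handler with Greenlet.link_exception(). Whether either would address this particular report is an assumption, since the bug could not be reproduced.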
Updated by Sage Weil over 9 years ago
- Status changed from New to Can't reproduce