Feature #10699
make teuthology failure reasons more clear
Status:
closed
% Done:
0%
Source:
other
Tags:
Backport:
Reviewed:
Affected Versions:
Description
Sometimes it's difficult to tell from a job's failure reason whether the command failed because of environment issues or because of a legit ceph test failure.
http://pulpito.ceph.com/teuthology-2015-01-30_02:35:02-smoke-master-distro-basic-multi/731360/
As an example, the failure reason for that job was:
CommandFailedError: Command failed on mira057 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=7e5e1ea1a51637de1c9a37039f21d88aee43ee83 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/cls/test_cls_rgw.sh'
If you take a look at the log, you'll see this log output above the traceback.
2015-01-30T02:47:49.725 INFO:tasks.workunit.client.0.mira057.stdout:[----------] Global test environment tear-down
2015-01-30T02:47:49.726 INFO:tasks.workunit.client.0.mira057.stdout:[==========] 8 tests from 1 test case ran. (18004 ms total)
2015-01-30T02:47:49.726 INFO:tasks.workunit.client.0.mira057.stdout:[ PASSED ] 4 tests.
2015-01-30T02:47:49.726 INFO:tasks.workunit.client.0.mira057.stdout:[ FAILED ] 4 tests, listed below:
2015-01-30T02:47:49.727 INFO:tasks.workunit.client.0.mira057.stdout:[ FAILED ] cls_rgw.index_basic
2015-01-30T02:47:49.727 INFO:tasks.workunit.client.0.mira057.stdout:[ FAILED ] cls_rgw.index_multiple_obj_writers
2015-01-30T02:47:49.727 INFO:tasks.workunit.client.0.mira057.stdout:[ FAILED ] cls_rgw.index_remove_object
2015-01-30T02:47:49.728 INFO:tasks.workunit.client.0.mira057.stdout:[ FAILED ] cls_rgw.index_suggest
2015-01-30T02:47:49.728 INFO:tasks.workunit.client.0.mira057.stdout:
2015-01-30T02:47:49.728 INFO:tasks.workunit.client.0.mira057.stdout: 4 FAILED TESTS
This was a legit ceph test failure, but it's almost impossible to know that without looking over the logs. Is there any way we can provide a better failure message?
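One possible direction, as a rough sketch only: scan the captured workunit output for gtest "[ FAILED ]" lines and, if any are found, report the failed test names instead of the raw command. The function names `summarize_gtest_failures` and `failure_reason` are hypothetical and are not existing teuthology APIs; the regex assumes test names of the simple form `suite.test` as seen in the log above.

```python
import re

# Hypothetical helper, not part of teuthology. Matches gtest result lines such as
# "...stdout:[ FAILED ] cls_rgw.index_basic" and skips the summary line
# "[ FAILED ] 4 tests, listed below:" because that has no "suite.test" name.
GTEST_FAILED = re.compile(r"\[\s*FAILED\s*\]\s+(\w+\.\w+)\s*$")


def summarize_gtest_failures(log_lines):
    """Return the failed gtest names, de-duplicated, in order of first appearance."""
    failed = []
    for line in log_lines:
        match = GTEST_FAILED.search(line)
        if match and match.group(1) not in failed:
            failed.append(match.group(1))
    return failed


def failure_reason(command_error, log_lines):
    """Prefer a test-level summary; fall back to the original command error."""
    failed = summarize_gtest_failures(log_lines)
    if failed:
        return "Test failure: {} gtest case(s) failed: {}".format(
            len(failed), ", ".join(failed)
        )
    return command_error
```

With the log above this would give "Test failure: 4 gtest case(s) failed: cls_rgw.index_basic, ..." while a job that died for environment reasons (no gtest output) would keep its CommandFailedError text.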