Feature #10699

closed

make teuthology failure reasons more clear

Added by Andrew Schoen about 9 years ago. Updated about 9 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
% Done:

0%

Source:
other
Tags:
Backport:
Reviewed:
Affected Versions:

Description

Sometimes it's difficult to tell from a job's failure reason whether the command failed because of an environment issue or because of a legitimate Ceph test failure.

http://pulpito.ceph.com/teuthology-2015-01-30_02:35:02-smoke-master-distro-basic-multi/731360/

As an example, the failure reason for that job was:

CommandFailedError: Command failed on mira057 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=7e5e1ea1a51637de1c9a37039f21d88aee43ee83 TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 3h /home/ubuntu/cephtest/workunit.client.0/cls/test_cls_rgw.sh'

If you look at the log, you'll see this output just above the traceback:

2015-01-30T02:47:49.725 INFO:tasks.workunit.client.0.mira057.stdout:[----------] Global test environment tear-down
2015-01-30T02:47:49.726 INFO:tasks.workunit.client.0.mira057.stdout:[==========] 8 tests from 1 test case ran. (18004 ms total)
2015-01-30T02:47:49.726 INFO:tasks.workunit.client.0.mira057.stdout:[  PASSED  ] 4 tests.
2015-01-30T02:47:49.726 INFO:tasks.workunit.client.0.mira057.stdout:[  FAILED  ] 4 tests, listed below:
2015-01-30T02:47:49.727 INFO:tasks.workunit.client.0.mira057.stdout:[  FAILED  ] cls_rgw.index_basic
2015-01-30T02:47:49.727 INFO:tasks.workunit.client.0.mira057.stdout:[  FAILED  ] cls_rgw.index_multiple_obj_writers
2015-01-30T02:47:49.727 INFO:tasks.workunit.client.0.mira057.stdout:[  FAILED  ] cls_rgw.index_remove_object
2015-01-30T02:47:49.728 INFO:tasks.workunit.client.0.mira057.stdout:[  FAILED  ] cls_rgw.index_suggest
2015-01-30T02:47:49.728 INFO:tasks.workunit.client.0.mira057.stdout:
2015-01-30T02:47:49.728 INFO:tasks.workunit.client.0.mira057.stdout: 4 FAILED TESTS

This was a legitimate Ceph test failure, but it's almost impossible to know that without reading through the logs. Is there any way we can provide a better failure message?

Actions #1

Updated by Andrew Schoen about 9 years ago

One option we discussed was adding a label kwarg to run.run so that we can print a descriptive label when a command fails.

https://github.com/ceph/teuthology/pull/423
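
To make the idea concrete, here is a minimal sketch of how a label kwarg could feed into the failure message. It is not the code from the pull request: the run function, the CommandFailedError class, and the message format below are simplified stand-ins written for illustration, and they run a local subprocess rather than a remote command.

```python
import subprocess
import sys


class CommandFailedError(Exception):
    """Stand-in exception for illustration; not teuthology's own class."""

    def __init__(self, command, exitstatus, node=None, label=None):
        super().__init__(command, exitstatus, node, label)
        self.command = command
        self.exitstatus = exitstatus
        self.node = node
        self.label = label

    def __str__(self):
        # Put the label, when given, in front of the raw command line.
        label = " ({})".format(self.label) if self.label else ""
        return "Command failed{} on {} with status {}: {!r}".format(
            label, self.node, self.exitstatus, self.command
        )


def run(args, label=None, node="localhost"):
    """Hypothetical local runner: raise a labelled error on non-zero exit."""
    proc = subprocess.run(args)
    if proc.returncode != 0:
        raise CommandFailedError(
            " ".join(args), proc.returncode, node=node, label=label
        )
    return proc.returncode


if __name__ == "__main__":
    try:
        # A command that always exits with status 1, standing in for a workunit.
        run(
            [sys.executable, "-c", "import sys; sys.exit(1)"],
            label="workunit test cls/test_cls_rgw.sh",
        )
    except CommandFailedError as err:
        print(err)
```

With something like this, the caller that launches a workunit supplies the label, so the failure reason names the test script before the long command line. A failure with no label would still read as a plain command failure, which hints that it may be an environment problem.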

Actions #2

Updated by Zack Cerza about 9 years ago

  • Target version set to sprint23