Feature #9743

closed

Catch SSH connection errors and map to 'dead' status

Added by Zack Cerza over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: High
Assignee:
Category: -
Target version:
% Done: 0%
Source: other
Tags:
Backport:
Reviewed:
Affected Versions:

Description

Currently, we are getting lots of false negatives due to network issues. It would be preferable if those jobs were marked 'dead' instead of 'fail'.
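To illustrate the request, here is a minimal sketch of mapping connection errors to 'dead'. It is not teuthology's actual code: `ConnectionLostError`, `NETWORK_ERRORS` and `run_and_classify` are hypothetical names, and the set of exception types that count as network problems is an assumption.

```python
import socket


class ConnectionLostError(Exception):
    """Hypothetical stand-in for an SSH-layer connection failure."""


# Assumption: these exception types indicate an infrastructure problem,
# not a genuine test failure.
NETWORK_ERRORS = (ConnectionLostError, ConnectionError, socket.timeout)


def run_and_classify(task):
    """Run a callable and return 'pass', 'fail' or 'dead'."""
    try:
        task()
    except NETWORK_ERRORS:
        # The connection went away, so the result says nothing about the test.
        return 'dead'
    except Exception:
        # Any other error is treated as a real failure.
        return 'fail'
    return 'pass'
```

The network-error clause has to come before the generic `except Exception`, otherwise every connection problem would still be reported as 'fail'.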

Actions #2

Updated by Zack Cerza over 9 years ago

http://qa-proxy.ceph.com/teuthology/teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/546028/teuthology.log
http://pulpito.ceph.com/teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/546028/

The patch above does what it's supposed to, but it's not enough. Something is marking the job 'fail' after it's marked 'dead':

/home/ubuntu/paddles.out.log.1.gz:2014-10-14 14:35:02,175 INFO  [paddles.controllers.jobs] Job teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/546028 status changed from queued to running
/home/ubuntu/paddles.out.log.1.gz:2014-10-14 15:00:23,819 INFO  [paddles.controllers.jobs] Job teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/546028 status changed from running to dead
/home/ubuntu/paddles.out.log.1.gz:2014-10-14 15:01:03,257 INFO  [paddles.controllers.jobs] Job teuthology-2014-10-13_19:00:01-rados-dumpling-distro-basic-multi/546028 status changed from dead to fail

Actions #3

Updated by Zack Cerza over 9 years ago

Right, it's because paddles knows that teuthology only expresses job status in terms of success being True or False.

I'm going to have to make teuthology use status so we can look for pass, fail and dead.
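A rough sketch of what that could look like on the receiving side, assuming the job report is a dict with an optional `status` string and the older `success` boolean. This is not the actual paddles or teuthology code; `resolve_status` and `KNOWN_STATUSES` are hypothetical names.

```python
KNOWN_STATUSES = ('pass', 'fail', 'dead')


def resolve_status(report):
    """Pick a job status from a report dict.

    An explicit 'status' wins. The boolean 'success' is only a fallback,
    so a job reported as 'dead' is not turned into 'fail'.
    """
    status = report.get('status')
    if status in KNOWN_STATUSES:
        return status

    # Fallback for reports that only carry the boolean.
    success = report.get('success')
    if success is True:
        return 'pass'
    if success is False:
        return 'fail'

    # Neither field is usable, so the status is left undetermined.
    return None
```

With this ordering, a report of `{'status': 'dead', 'success': False}` resolves to 'dead', which is the case that was being overwritten in the log above.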

Actions #5

Updated by Zack Cerza over 9 years ago

  • Status changed from 7 to Resolved