Bug #58724


teuthology jobs in "running" status for 15+ hours

Added by Laura Flores about 1 year ago. Updated about 1 year ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Problem:
Some teuthology jobs from several runs stayed in "running" status for 15+ hours. Each of these jobs had hit a failure, but teuthology did not mark them as dead afterwards.

Example:
http://pulpito.front.sepia.ceph.com/lflores-2023-02-14_01:12:29-rados-main-distro-default-smithi/

Solution:
The jobs really were dead, but the paddles database did not know that, likely because the dispatcher had died. The fix was to tell paddles that the jobs were dead with this command, where $run is the name of the affected run:

teuthology-report -D -r $run
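
Since several runs were affected, the same command can be repeated once per run. The following is a minimal sketch, assuming bash and that teuthology-report is on the PATH. The function name mark_runs_dead and the DRY_RUN variable are hypothetical helpers for illustration and are not part of teuthology.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run "teuthology-report -D -r <run>" for each run name given.
# Set DRY_RUN=1 to print the commands instead of executing them.
mark_runs_dead() {
    local run
    for run in "$@"; do
        if [ "${DRY_RUN:-0}" = "1" ]; then
            # Only show what would be executed.
            echo "teuthology-report -D -r $run"
        else
            # Same invocation as in the issue, once per run.
            teuthology-report -D -r "$run"
        fi
    done
}
```

For example, `DRY_RUN=1 mark_runs_dead lflores-2023-02-14_01:12:29-rados-main-distro-default-smithi` prints the command for the example run without contacting paddles. Dropping DRY_RUN executes it.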