Bug #58697

open

Some teuthology jobs are getting scheduled as "unknown"

Added by Laura Flores about 1 year ago. Updated about 1 year ago.

Status:
New
Priority:
Normal
Assignee:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description


Related issues 1 (1 open, 0 closed)

Related to sepia - Bug #58696: Entire teuthology runs are dying (New)

Actions #1

Updated by Laura Flores about 1 year ago

  • Related to Bug #58696: Entire teuthology runs are dying added
Actions #2

Updated by Laura Flores about 1 year ago

Note from Zack regarding this issue:

Looking at the paddles logs again, I see this:
Feb 09 16:35:16 pulpito sudo[32458]:     root : TTY=pts/0 ; PWD=/root ; USER=root ; COMMAND=/usr/bin/docker exec -it fa31f1474a1a sh -c pecan expire_jobs config.py -q 0 -r 600
I am pretty sure the -q 0 in that command would cause all queued jobs to be marked dead in paddles. However, the expire_jobs tool does not touch the beanstalkd queue, which is the source of truth. This is a bizarre situation, but I believe those jobs are actually still queued and will run eventually.
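
To illustrate the mismatch described in the note, here is a minimal sketch that compares a snapshot of paddles job statuses with the set of job IDs still present in the queue. The job IDs, the status strings, and the helper name find_mislabelled_jobs are all hypothetical; it does not talk to paddles or beanstalkd, and only shows the comparison one might make once both lists are in hand.

```python
# Hypothetical snapshot data. In practice these would be gathered from
# paddles (reported status) and from the beanstalkd queue (source of truth).
paddles_status = {
    "job-a": "dead",
    "job-b": "dead",
    "job-c": "running",
    "job-d": "dead",
}
queued_job_ids = {"job-a", "job-b"}


def find_mislabelled_jobs(status_by_job, queued_ids):
    """Return job IDs that are reported as dead but are still in the queue."""
    return sorted(
        job_id
        for job_id, status in status_by_job.items()
        if status == "dead" and job_id in queued_ids
    )


# Jobs listed here are still queued even though they are reported as dead.
print(find_mislabelled_jobs(paddles_status, queued_job_ids))
```

In this made-up snapshot, job-d is reported dead and is not in the queue, so it is not flagged; only jobs that appear in both places are.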
Actions #3

Updated by Zack Cerza about 1 year ago

Also, teuthology-kill only touches jobs that are actively running or in the 'queued' state. teuthology-queue has more flexible options for dequeueing jobs.

Actions #4

Updated by Laura Flores about 1 year ago

Zack Cerza wrote:

Also, teuthology-kill only touches jobs that are actively running or in the 'queued' state. teuthology-queue has more flexible options for dequeueing jobs.

Thanks Zack, I didn't know this was an option!

teuthology-queue options for those interested:

lflores@teuthology:~$ ./teuthology/virtualenv/bin/teuthology-queue --help
/home/lflores/teuthology/virtualenv/lib/python3.6/site-packages/paramiko/transport.py:33: CryptographyDeprecationWarning: Python 3.6 is no longer supported by the Python core team. Therefore, support for it is deprecated in cryptography and will be removed in a future release.
  from cryptography.hazmat.backends import default_backend
usage: teuthology-queue -h
       teuthology-queue [-s|-d|-f] -m MACHINE_TYPE
       teuthology-queue [-r] -m MACHINE_TYPE
       teuthology-queue -m MACHINE_TYPE -D PATTERN
       teuthology-queue -p SECONDS [-m MACHINE_TYPE]

List Jobs in queue.
If -D is passed, then jobs with PATTERN in the job name are deleted from the
queue.

Arguments:
  -m, --machine_type MACHINE_TYPE [default: multi]
                        Which machine type queue to work on.

optional arguments:
  -h, --help            Show this help message and exit
  -D, --delete PATTERN  Delete Jobs with PATTERN in their name
  -d, --description     Show job descriptions
  -r, --runs            Only show run names
  -f, --full            Print the entire job config. Use with caution.
  -s, --status          Prints the status of the queue
  -p, --pause SECONDS   Pause queues for a number of seconds. A value of 0
                        will unpause. If -m is passed, pause that queue,
                        otherwise pause all queues.
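
As a rough sketch of how the usage lines above combine, the helper below assembles teuthology-queue argument lists using only the flags shown in the help text (-m, -r, -D, -p). The function name build_queue_command, the machine type "example-machine", and the pattern "example-run-name" are hypothetical placeholders. The helper only builds and prints the command; it does not run it.

```python
import shlex


def build_queue_command(machine_type=None, runs_only=False,
                        delete_pattern=None, pause_seconds=None):
    """Build an argument list following the usage lines in the help output."""
    modes = [runs_only, delete_pattern is not None, pause_seconds is not None]
    if sum(modes) > 1:
        raise ValueError(
            "choose at most one of runs_only, delete_pattern, pause_seconds")

    cmd = ["teuthology-queue"]
    if pause_seconds is not None:
        # Usage: teuthology-queue -p SECONDS [-m MACHINE_TYPE]
        cmd += ["-p", str(pause_seconds)]
        if machine_type is not None:
            cmd += ["-m", machine_type]
        return cmd

    if machine_type is None:
        raise ValueError("machine_type is required unless pausing")
    cmd += ["-m", machine_type]
    if runs_only:
        cmd.append("-r")  # only show run names
    if delete_pattern is not None:
        cmd += ["-D", delete_pattern]  # delete jobs whose name contains PATTERN
    return cmd


# Print the commands for review instead of executing them.
for args in (
    build_queue_command("example-machine", runs_only=True),
    build_queue_command("example-machine", delete_pattern="example-run-name"),
):
    print(" ".join(shlex.quote(a) for a in args))
```

Listing run names with -r before deleting with -D is a cautious order to work in, since -D matches any job whose name contains the pattern.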
