Bug #7356

closed

Kill all while loops that will never end....

Added by Sandon Van Ness over 10 years ago. Updated about 6 years ago.

Status: Rejected
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

OK, maybe with one exception: one of my own loops, the one for VPS creation. If the host machine is down, that loop runs forever until someone fixes the host, and I think a hung job is the right outcome in that situation (it forces someone to fix the issue) rather than having runs keep failing while they try to use a VPS that will never come up. This doesn't happen often, but it has on occasion. I just figured I should preface the report with that so that loop doesn't get 'fixed' =)

Example of the current problem:

2014-02-05T15:03:54.747 DEBUG:teuthology.orchestra.run:Running [10.214.138.168]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2014-02-05T15:03:56.789 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 13 pgs down; 13 pgs peering; 13 pgs stuck inactive; 13 pgs stuck unclean; 3 requests are blocked > 32 sec; mds cluster is degraded
2014-02-05T15:03:57.789 DEBUG:teuthology.orchestra.run:Running [10.214.138.168]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2014-02-05T15:04:09.061 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 13 pgs down; 13 pgs peering; 13 pgs stuck inactive; 13 pgs stuck unclean; 3 requests are blocked > 32 sec; mds cluster is degraded
2014-02-05T15:04:10.062 DEBUG:teuthology.orchestra.run:Running [10.214.138.168]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2014-02-05T15:04:17.581 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 13 pgs down; 13 pgs peering; 13 pgs stuck inactive; 13 pgs stuck unclean; 3 requests are blocked > 32 sec; mds cluster is degraded
2014-02-05T15:04:18.581 DEBUG:teuthology.orchestra.run:Running [10.214.138.168]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'
2014-02-05T15:04:37.215 DEBUG:teuthology.misc:Ceph health: HEALTH_WARN 13 pgs down; 13 pgs peering; 13 pgs stuck inactive; 13 pgs stuck unclean; 3 requests are blocked > 32 sec; mds cluster is degraded
2014-02-05T15:04:38.215 DEBUG:teuthology.orchestra.run:Running [10.214.138.168]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph health'

This repeats for hours and hours in the logs because the cluster never recovers. The loop should eventually give up and fail the job.
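
A rough sketch of what "eventually just fail" could look like for a health-polling loop like the one in the log above. This is not the actual teuthology code: `wait_until_healthy`, `HealthTimeout`, the `get_health` callable, and the 900-second default are all made-up names and values for illustration. The idea is only that the loop checks a deadline on every pass and raises with the last status it saw, instead of spinning forever.

```python
import time


class HealthTimeout(Exception):
    """Raised when the cluster does not report HEALTH_OK before the deadline."""


def wait_until_healthy(get_health, timeout=900, interval=1,
                       clock=time.monotonic, sleep=time.sleep):
    """Poll get_health() until it reports HEALTH_OK or `timeout` seconds pass.

    get_health is a hypothetical callable returning the health string,
    e.g. 'HEALTH_WARN 13 pgs down; ...'. Passing timeout=None disables the
    deadline and waits forever (assumed opt-in for cases where hanging is wanted).
    """
    deadline = None if timeout is None else clock() + timeout
    while True:
        status = get_health()
        if status.startswith('HEALTH_OK'):
            return status
        if deadline is not None and clock() >= deadline:
            # Give up and surface the last status so the failure is debuggable.
            raise HealthTimeout(
                'cluster not healthy after %s seconds; last status: %s'
                % (timeout, status))
        sleep(interval)
```

A loop that is supposed to hang until a human steps in, like the VPS creation case mentioned above, could pass `timeout=None` explicitly so the unbounded wait is a visible choice rather than an accident.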

Logs:

/var/lib/teuthworker/archive/teuthology-2014-02-04_02:30:01-upgrade:fs-next-testing-basic-vps/66668
/var/lib/teuthworker/archive/teuthology-2014-02-04_02:30:01-upgrade:fs-next-testing-basic-vps/66737
