Feature #4566
open
add an option to force nuke and restart test when stale jobs are found
0%
Description
Rather than failing the test execution when stale jobs are found, can we add an option to the yaml that instructs teuthology to nuke and then re-attempt the test script?
Updated by Sage Weil about 11 years ago
- Story points set to 3.00
Updated by Ian Colle about 11 years ago
- Target version deleted (v0.62a)
3767 should prevent stale jobs, but we should still do this (at a lower priority) on the off chance any still occur.
Updated by Sam Lang about 11 years ago
I can see this being useful if you're doing teuthology runs on a set of locked nodes over and over, and you don't want to manually run teuthology-nuke if you get a stale jobs error. If you're letting the locker allocate nodes for your run though, automatically cleaning up a previous run is problematic, since we may need to go and investigate that log data.
A bug like this one has come up before (and been rejected), although I can't find it in my searching. It's a bit of a philosophical issue: if a test fails, do we want to allow an automated step to blow away the failure info, or should we always require manual intervention so that the reasons for failure are always verified?
The fixes proposed in #3637 should avoid the need for an option, but we will still need something to go back and clean up old test directories that we have already investigated or really don't care about.
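For illustration only, here is a rough sketch of how the requested opt-in behaviour could look if it were limited to the manually locked case described above. Nothing here is teuthology's actual API: `StaleJobsError`, `RunConfig`, the `nuke_on_stale` and `targets_manually_locked` fields, and the `run_test` and `nuke` callables are all hypothetical names invented for this example. Gating the nuke on manually locked targets is also an assumption, chosen so that failure data on locker-allocated nodes is never removed automatically.

```python
from dataclasses import dataclass
from typing import Callable


class StaleJobsError(Exception):
    """Hypothetical error raised when leftover jobs are found on a target."""


@dataclass
class RunConfig:
    # Hypothetical yaml-backed option; the thread does not name one.
    nuke_on_stale: bool = False
    # Hypothetical flag: True when the user locked the targets by hand.
    targets_manually_locked: bool = False


def run_with_optional_nuke(
    config: RunConfig,
    run_test: Callable[[], str],
    nuke: Callable[[], None],
) -> str:
    """Run the test; on a stale-jobs error, nuke and retry once if allowed."""
    try:
        return run_test()
    except StaleJobsError:
        # Keep the current behaviour (fail) unless the user opted in and
        # owns the nodes, so unexamined failure data is not destroyed.
        if not (config.nuke_on_stale and config.targets_manually_locked):
            raise
        nuke()
        # Single retry only; a second stale-jobs error propagates.
        return run_test()
```

With the defaults, a stale-jobs error still fails the run as it does today; only a config that sets both flags triggers the nuke and a single retry.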