Bug #5463

mds_thrasher: sometimes doesn't stop thrashing

Added by Greg Farnum about 6 years ago. Updated over 5 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
Start date:
06/26/2013
Due date:
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:

Description

This results in hung tasks, so I haven't been able to confirm whether the issue is in the task itself rather than something in the cluster, but I haven't found any indication of cluster trouble in the teuthology log. It looks like, in some cases, the thrasher simply never stops, so the main teuthology thread can never join it.

I've been told this exception shouldn't be a problem, but maybe it's breaking the execution order somehow? Or maybe it's something else entirely.

2013-06-25T22:40:57.061 INFO:teuthology.task.workunit:Deleted dir /home/ubuntu/cephtest/46383/mnt.0/client.0
2013-06-25T22:40:57.061 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: 'rmdir -- /home/ubuntu/cephtest/46383/mnt.0'
2013-06-25T22:40:57.076 INFO:teuthology.orchestra.run.err:rmdir: failed to remove `/home/ubuntu/cephtest/46383/mnt.0': Device or resource busy
2013-06-25T22:40:57.076 DEBUG:teuthology.task.workunit:Caught an execption deleting dir /home/ubuntu/cephtest/46383/mnt.0
2013-06-25T22:40:57.076 DEBUG:teuthology.run_tasks:Unwinding manager <contextlib.GeneratorContextManager object at 0x2448cd0>
2013-06-25T22:40:57.076 INFO:teuthology.task.ceph-fuse:Unmounting ceph-fuse clients...
2013-06-25T22:40:57.077 DEBUG:teuthology.orchestra.run:Running [10.214.132.11]: 'sudo fusermount -u /home/ubuntu/cephtest/46383/mnt.0'
2013-06-25T22:40:57.164 INFO:teuthology.task.ceph-fuse.ceph-fuse.0.err:ceph-fuse[4281]: fuse finished with error 0
2013-06-25T22:40:59.930 INFO:teuthology.task.mds_thrash.mds_thrasher.failure_group.[a, b-s-a]:reviving mds.a

(That's from /a/teuthology-2013-06-25_20:00:50-fs-cuttlefish-testing-basic/46383/teuthology.log)
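The failure mode described above fits the usual stop-flag/join protocol for a thrasher thread. The sketch below is purely illustrative (class and method names are invented, not the actual teuthology mds_thrash code): if the thrashing loop blocks in a long operation without rechecking the stop flag, the join in do_join() never returns and the run hangs exactly as reported.

```python
import threading
import time

class Thrasher(threading.Thread):
    """Illustrative stand-in for a teuthology-style thrasher (not the real code)."""

    def __init__(self):
        super().__init__()
        self.stopping = threading.Event()

    def run(self):
        # Thrash in a loop, rechecking the stop flag frequently. A long
        # blocking call here (e.g. waiting for an MDS to revive) without
        # such a recheck is the kind of thing that would keep do_join()
        # from ever returning.
        while not self.stopping.is_set():
            self.stopping.wait(0.01)  # placeholder for one thrash iteration

    def do_join(self):
        # Signal the loop to exit, then wait for the thread to finish.
        self.stopping.set()
        self.join()

t = Thrasher()
t.start()
time.sleep(0.05)
t.do_join()
print(t.is_alive())  # → False
```

A common hardening for this pattern is to pass a timeout to join() and fail the run loudly if the thread is still alive afterwards, rather than blocking forever.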

History

#1 Updated by Zack Cerza over 5 years ago

  • Status changed from New to Resolved

Workunits have a timeout now, so this ought to be fixed. If not, please reopen or file a new ticket.
