Bug #14257 (closed): test_reconnect_timeout failed

Added by Greg Farnum over 8 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://pulpito.ceph.com/gregf-2015-12-21_23:08:59-fs-master---basic-smithi/1789/

2015-12-22T22:03:27.635 INFO:tasks.cephfs_test_runner:======================================================================
2015-12-22T22:03:27.636 INFO:tasks.cephfs_test_runner:FAIL: test_reconnect_timeout (tasks.cephfs.test_client_recovery.TestClientRecovery)
2015-12-22T22:03:27.636 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2015-12-22T22:03:27.636 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2015-12-22T22:03:27.636 INFO:tasks.cephfs_test_runner:  File "/var/lib/teuthworker/src/ceph-qa-suite_master/tasks/cephfs/test_client_recovery.py", line 160, in test_reconnect_timeout
2015-12-22T22:03:27.636 INFO:tasks.cephfs_test_runner:    self.mds_reconnect_timeout, in_reconnect_for
2015-12-22T22:03:27.636 INFO:tasks.cephfs_test_runner:AssertionError: Should have been in reconnect phase for 45 but only took 14
2015-12-22T22:03:27.637 INFO:tasks.cephfs_test_runner:
Actions #1

Updated by Greg Farnum over 8 years ago

Haven't looked into this at all, but I wonder if it's failing to account for all the clients pinging back early, or the control client having been timed out before the MDS got paused, or something.

Actions #2

Updated by Greg Farnum over 8 years ago

  • Priority changed from Normal to Urgent
Actions #3

Updated by John Spray over 8 years ago

Test bug: Filesystem.wait_for_state counts elapsed time as the number of passes through its polling loop (with a 1s sleep), which is bogus when the "mds dump" calls themselves take a noticeable amount of time (some of these took multiple seconds for some reason). So the test fails on slow clusters. Will update the test.
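The pattern described above can be sketched as follows. This is an illustrative Python sketch, not the actual teuthology code: the function names and the `interval` parameter are hypothetical. The point is that counting loop iterations undercounts elapsed time whenever the poll call itself is slow, while measuring real wall-clock time does not.

```python
import time

def wait_for_state_buggy(poll_fn, target, timeout, interval=1.0):
    # Buggy pattern: "elapsed" is really just the number of sleeps taken,
    # so any time spent inside poll_fn (e.g. a slow "mds dump") is
    # silently ignored and the reported wait time undercounts reality.
    elapsed = 0.0
    while poll_fn() != target:
        if elapsed > timeout:
            raise RuntimeError("timeout waiting for state %r" % target)
        time.sleep(interval)
        elapsed += interval
    return elapsed

def wait_for_state_fixed(poll_fn, target, timeout, interval=1.0):
    # Fixed pattern: measure real wall-clock time with a monotonic clock,
    # so slow poll_fn calls count toward elapsed time as well.
    started = time.monotonic()
    while poll_fn() != target:
        if time.monotonic() - started > timeout:
            raise RuntimeError("timeout waiting for state %r" % target)
        time.sleep(interval)
    return time.monotonic() - started
```

With a poll call that takes, say, two seconds apiece, the buggy version reports only the sleep time, which is exactly the undercounting that made the test think the reconnect phase lasted 14 seconds instead of at least 45.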

Actions #4

Updated by John Spray over 8 years ago

  • Status changed from New to In Progress
Actions #5

Updated by John Spray over 8 years ago

  • Status changed from In Progress to Fix Under Review
Actions #6

Updated by John Spray over 8 years ago

  • Status changed from Fix Under Review to Resolved

Fix merged into ceph-qa-suite master.
