Feature #17309

closed

qa: mon_thrash test for CephFS

Added by John Spray over 7 years ago. Updated almost 5 years ago.

Status: Resolved
Priority: High
Assignee:
Category: Testing
Target version:
% Done: 0%
Source: Development
Tags:
Backport:
Reviewed:
Affected Versions:
Component(FS): qa-suite
Labels (FS): task(easy), task(intern)
Pull request ID:

Description

We currently have no test that thrashes the mons while CephFS is running. It would be useful to run with a mon thrasher and assert that MDS daemons are not incorrectly failed when mon failures and delays occur.

It currently appears that glitches in mon availability can cause MDSMonitor to trip its timeouts and incorrectly take healthy MDS daemons offline (see http://tracker.ceph.com/issues/17308).
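
One way the requested assertion might be expressed: record which daemon holds each MDS rank before mon thrashing begins, record it again afterward, and fail the test if any rank changed hands. The sketch below is a self-contained illustration under stated assumptions, not the code from the eventual pull request. The `rank -> GID` dictionaries and the function name `ranks_that_failed_over` are hypothetical stand-ins for whatever the qa suite actually reads from the cluster's MDS map, and the GID values are made up.

```python
from typing import Dict, List


def ranks_that_failed_over(before: Dict[int, int], after: Dict[int, int]) -> List[int]:
    """Return the MDS ranks whose holder changed between two snapshots.

    Both arguments map rank -> daemon GID (a hypothetical, simplified
    view of the MDS map). A rank counts as failed over if it is missing
    from `after` or is held by a different GID.
    """
    return sorted(rank for rank, gid in before.items() if after.get(rank) != gid)


# Snapshot taken before mon thrashing starts (made-up GIDs).
before = {0: 4101, 1: 4105}

# Healthy outcome: the same daemons still hold the same ranks.
assert ranks_that_failed_over(before, {0: 4101, 1: 4105}) == []

# Unwanted outcome: rank 1 was taken over by a different daemon.
assert ranks_that_failed_over(before, {0: 4101, 1: 4230}) == [1]
```

A real test would presumably take the second snapshot only after the mons have regained quorum, so that a stale map is not mistaken for a healthy one.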


Related issues 2 (0 open, 2 closed)

Related to CephFS - Bug #17308: MDSMonitor should tolerate paxos delays without failing daemons (Was: Unexplained delay forwarding message between mons) - Resolved - John Spray - 09/20/2016

Related to CephFS - Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemons: set(['mds.dir_split'])) - Can't reproduce - 04/20/2017
#1

Updated by Patrick Donnelly about 6 years ago

  • Subject changed from mon_thrash test for CephFS to qa: mon_thrash test for CephFS
  • Assignee set to Patrick Donnelly
  • Target version set to v13.0.0
#2

Updated by Patrick Donnelly about 6 years ago

  • Related to Bug #17308: MDSMonitor should tolerate paxos delays without failing daemons (Was: Unexplained delay forwarding message between mons) added
#3

Updated by Patrick Donnelly about 6 years ago

  • Related to Bug #19706: Laggy mon daemons causing MDS failover (symptom: failed to set counters on mds daemons: set(['mds.dir_split'])) added
#4

Updated by Patrick Donnelly almost 6 years ago

  • Priority changed from Normal to High
  • Target version changed from v13.0.0 to v14.0.0
#5

Updated by Patrick Donnelly about 5 years ago

  • Assignee changed from Patrick Donnelly to Jos Collin
  • Target version changed from v14.0.0 to v15.0.0
  • Start date deleted (09/20/2016)
  • Source changed from other to Development
  • Labels (FS) task(easy), task(intern) added
#6

Updated by Jos Collin about 5 years ago

  • Status changed from New to In Progress
#7

Updated by Jos Collin about 5 years ago

  • Pull request ID set to 27073
#8

Updated by Jos Collin about 5 years ago

  • Status changed from In Progress to Fix Under Review
#9

Updated by Patrick Donnelly almost 5 years ago

  • Status changed from Fix Under Review to Resolved