Bug #18680

closed

multimds: cluster can assign active mds beyond max_mds during failures

Added by Patrick Donnelly about 7 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS): multimds
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

From: http://pulpito.ceph.com/pdonnell-2017-01-25_22:42:21-multimds:thrash-wip-multimds-tests-testing-basic-mira/748363/

The thrasher lowers max_mds from 3 to 1 and then deactivates mds.a and mds.b. Without waiting for mds.a and mds.b to fully stop, the thrasher kills mds.b. Eventually mds.a reactivates and takes the rank of mds.b (with 2 actives when max_mds is 1!).
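The invariant that this run violated can be stated as a small check. This is a toy model only, not Ceph code: the function names, the dict-of-rank-states representation, and the example ranks are all hypothetical and chosen for illustration.

```python
# Toy model, not Ceph code. Names and data layout are assumptions.

def active_ranks(rank_states):
    """Return the sorted ranks whose state is 'active'."""
    return sorted(r for r, s in rank_states.items() if s == "active")


def exceeds_max_mds(rank_states, max_mds):
    """True when more ranks are active than max_mds allows."""
    return len(active_ranks(rank_states)) > max_mds


# Hypothetical snapshot resembling the end of the failed run:
# max_mds was lowered to 1, yet two ranks are active.
observed = {0: "active", 1: "active", 2: "stopped"}
print(exceeds_max_mds(observed, max_mds=1))  # True
```

A test harness could run a check like this after each thrash step to flag the moment the number of active ranks goes beyond max_mds.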

Logs are on teuthology here: /home/pdonnell/748363

(There is a bug in the thrasher causing an infinite loop at the end. It is unrelated to this issue.)

Actions #1

Updated by Zheng Yan about 7 years ago

The commit "mon/MDSMonitor: only allow deactivating the mds with max rank" in https://github.com/ceph/ceph/pull/14550 should fix this.
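The rule named in that commit title can be sketched as follows. This is only an illustration of the idea, assuming the active ranks are available as a set of integers; `can_deactivate` is a hypothetical helper, and the actual check lives in the C++ MDSMonitor code, which is not reproduced here.

```python
# Illustrative sketch, not the MDSMonitor implementation.

def can_deactivate(rank, active):
    """Permit deactivation only for the highest currently active rank."""
    return bool(active) and rank == max(active)


active = {0, 1, 2}
print(can_deactivate(1, active))  # False: rank 2 is still active
print(can_deactivate(2, active))  # True: rank 2 is the highest active rank
```

Under this rule, a request to deactivate a lower rank is rejected while a higher rank is still active, so the set of active ranks can only shrink from the top.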

Actions #2

Updated by Zheng Yan about 7 years ago

  • Status changed from New to Fix Under Review

Actions #3

Updated by Zheng Yan almost 7 years ago

  • Status changed from Fix Under Review to Resolved

Actions #4

Updated by Patrick Donnelly about 5 years ago

  • Category deleted (90)
  • Labels (FS) multimds added