Bug #18680: multimds: cluster can assign active mds beyond max_mds during failures - CephFS - Ceph

Bug #18680

closed

multimds: cluster can assign active mds beyond max_mds during failures

Added by Patrick Donnelly about 7 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Normal
Source: Development
Severity: 3 - minor
Component(FS): MDSMonitor
Labels (FS): multimds

Description

From: http://pulpito.ceph.com/pdonnell-2017-01-25_22:42:21-multimds:thrash-wip-multimds-tests-testing-basic-mira/748363/

The thrasher sets max_mds 3 -> 1 and then deactivates mds.a and mds.b. Without waiting for mds.a and mds.b to fully stop, the thrasher kills mds.b. Eventually mds.a reactivates and takes the rank of mds.b (with 2 actives when max_mds is 1!).
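The condition being violated can be written as a small check. This is a toy model only, not Ceph code: the function names, the rank numbers, and the plain "active"/"stopping" state strings are assumptions made for illustration. It shows that two active ranks exceed the limit once the thrasher has lowered max_mds to 1.

```python
# Toy model only; this is not Ceph code. State names and ranks are illustrative.

def active_ranks(rank_states):
    """Return the sorted ranks whose state is 'active'."""
    return sorted(rank for rank, state in rank_states.items() if state == "active")


def exceeds_max_mds(rank_states, max_mds):
    """True when more ranks are active than max_mds allows."""
    return len(active_ranks(rank_states)) > max_mds


# Hypothetical snapshot of the reported end state: max_mds was lowered to 1,
# yet two ranks are active.
reported = {0: "active", 1: "active"}
print(exceeds_max_mds(reported, max_mds=1))
```

A test harness could run a check of this kind after each thrash step to catch the violation as soon as it appears.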

Logs are on teuthology here: /home/pdonnell/748363

(There is a bug in the thrasher causing an infinite loop at the end. It is unrelated to this issue.)

Updated by Zheng Yan about 7 years ago

commit "mon/MDSMonitor: only allow deactivating the mds with max rank" in https://github.com/ceph/ceph/pull/14550 should fix this
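One way to read the rule in that commit title is sketched below. This is not the MDSMonitor implementation; `can_deactivate` and `shrink_to` are hypothetical helpers that only illustrate the idea that ranks are removed from the top down, one at a time.

```python
# Sketch of the rule "only allow deactivating the mds with max rank".
# Hypothetical helpers for illustration; not taken from the Ceph source.

def can_deactivate(rank, active_ranks):
    """Allow deactivation only for the highest currently active rank."""
    return len(active_ranks) > 0 and rank == max(active_ranks)


def shrink_to(active_ranks, max_mds):
    """Return the order in which ranks would be deactivated under the rule."""
    remaining = set(active_ranks)
    order = []
    while len(remaining) > max_mds:
        top = max(remaining)  # the only rank the rule permits
        remaining.remove(top)
        order.append(top)
    return order


print(can_deactivate(1, {0, 1, 2}))  # rank 1 is not the highest active rank
print(shrink_to({0, 1, 2}, max_mds=1))
```

Under this reading, going from three actives to one removes rank 2 first and then rank 1, so a request to deactivate a lower rank while a higher one is still active is refused.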

Updated by Zheng Yan about 7 years ago

Status changed from New to Fix Under Review

Updated by Zheng Yan almost 7 years ago

Status changed from Fix Under Review to Resolved

https://github.com/ceph/ceph/commit/2c08f58ee8353322a342ce043150aafc8dd9c381

Updated by Patrick Donnelly about 5 years ago

Category deleted (90)
Labels (FS) multimds added
