Bug #23567 (closed)

MDSMonitor: successive changes to max_mds can allow hole in ranks

Added by Patrick Donnelly about 6 years ago. Updated almost 6 years ago.

Status: Resolved
Priority: High
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

With 3 MDS, approximately this sequence:

   ceph fs set cephfs max_mds 2
   watch ceph status
   ceph fs set cephfs max_mds 1
   ceph mds deactivate 1
   ceph fs set cephfs max_mds 3
   watch ceph status
   ceph fs set cephfs max_mds 2

can result in:

  cluster:
    id:     b602e734-2a9a-4a4a-93c0-64018b991968
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum li638-46,li889-56,li432-208
    mgr: li569-87(active), standbys: li227-22
    mds: cephfs-2/2/2 up  {0=li89-229=up:active,2=li896-69=up:active}, 1 up:standby
    osd: 32 osds: 32 up, 32 in

  data:
    pools:   2 pools, 640 pgs
    objects: 57 objects, 6490 bytes
    usage:   6984 MB used, 497 GB / 504 GB avail
    pgs:     640 active+clean

Note the mds line: ranks 0 and 2 are up:active, but rank 1 is missing, leaving a hole in the ranks. We should not resize the cluster while any MDS is stopping.
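
As a stopgap from the command line (illustrative only, and separate from the eventual MDSMonitor fix), one can wait for any up:stopping daemon to finish before issuing the next max_mds change; the grep-based check below assumes the state strings appear verbatim in the ceph fs dump output:

  # Illustrative only: wait until no MDS reports up:stopping, then resize.
  while ceph fs dump 2>/dev/null | grep -q 'up:stopping'; do
      sleep 5
  done
  ceph fs set cephfs max_mds 2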

Actions #1

Updated by Douglas Fuller about 6 years ago

  • Status changed from New to Need More Info
  • Assignee changed from Douglas Fuller to Patrick Donnelly

Was this before or after https://github.com/ceph/ceph/pull/16608?

Actions #2

Updated by Patrick Donnelly about 6 years ago

Doug, I tested with master but I believe it also happened with your PR. I can't remember.

Actions #3

Updated by Patrick Donnelly about 6 years ago

  • Status changed from Need More Info to Fix Under Review
  • Backport deleted (luminous)

Actions #4

Updated by Patrick Donnelly almost 6 years ago

  • Status changed from Fix Under Review to Resolved
  • Start date deleted (04/05/2018)