Feature #20606

closed

mds: improve usability of cluster rank manipulation and setting cluster up/down

Added by Patrick Donnelly almost 7 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Category: Administration/Usability
% Done: 100%
Source: Development
Component(FS): MDSMonitor
Labels (FS): multimds

Description

Right now the procedure for bringing down a cluster is:

ceph fs set cephfs_a cluster_down 1
ceph mds fail 1:1 # rank 1 of 2
ceph mds fail 1:0 # rank 0 of 2
ceph status
  cluster:
    id:     4ef94796-a652-4e0f-ad4e-8f3aaa9b9d18
    health: HEALTH_ERR
            mds ranks 0,1 have failed
            mds cluster is degraded

  services:
    mon: 3 daemons, quorum a,b,c
    mgr: x(active)
    mds: 0/2/2 up, 2 up:standby, 2 failed
    osd: 3 osds: 3 up, 3 in

  data:
    pools:   2 pools, 16 pgs
    objects: 39 objects, 3558 bytes
    usage:   3265 MB used, 27646 MB / 30911 MB avail
    pgs:     16 active+clean

This leaves the journal unflushed and client sessions half-open. In addition, `ceph status` shows alarming notices about "failed" MDS daemons, along with unhelpful health warnings.
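
As a rough illustration of the manual procedure described above, the following shell sketch prints (rather than runs) the same command sequence for a file system with an arbitrary number of ranks. The function name `print_fs_down_cmds` and its arguments are hypothetical. It assumes the `1:` prefix in `1:0` identifies the file system and that ranks are failed from highest to lowest, as in the example.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the manual bring-down sequence
# from this issue. print_fs_down_cmds and its arguments are hypothetical
# names used for illustration only.
print_fs_down_cmds() {
    fs_name=$1   # file system name, e.g. cephfs_a
    fs_id=$2     # assumed: the prefix before the colon in "1:0"
    ranks=$3     # number of ranks to fail, e.g. 2

    # Step 1: mark the cluster down, as in the description.
    echo "ceph fs set $fs_name cluster_down 1"

    # Step 2: fail each rank, highest first, as in the description.
    rank=$((ranks - 1))
    while [ "$rank" -ge 0 ]; do
        echo "ceph mds fail $fs_id:$rank"
        rank=$((rank - 1))
    done
}
```

Calling `print_fs_down_cmds cephfs_a 1 2` prints the three commands listed at the top of this description. The number of commands grows with the number of ranks, which is part of the usability problem this issue raises.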

I recommend several changes, outlined in this issue's subtasks.


Subtasks 5 (0 open, 5 closed)

Feature #20607: MDSMonitor: change "mds deactivate" to clearer "mds rejoin" (Rejected, Douglas Fuller, 07/12/2017)
Feature #20608: MDSMonitor: rename `ceph fs set <fs_name> cluster_down` to `ceph fs set <fs_name> joinable` (Resolved, Douglas Fuller, 07/12/2017)
Feature #20609: MDSMonitor: add new command `ceph fs set <fs_name> down` to bring the cluster down (Resolved, Douglas Fuller, 07/12/2017)
Feature #20610: MDSMonitor: add new command to shrink the cluster in an automated way (Resolved, Douglas Fuller, 07/12/2017)
Subtask #20864: kill allow_multimds (Resolved, Douglas Fuller, 07/31/2017)
