Feature #20606

Updated by Patrick Donnelly almost 7 years ago

Right now the procedure for bringing down a cluster is: 

 <pre> 
 ceph fs set cephfs_a cluster_down 1 
 ceph mds fail 1:1 # rank 1 of 2 
 ceph mds fail 1:0 # rank 0 of 2 
 ceph status 
   cluster: 
     id:       4ef94796-a652-4e0f-ad4e-8f3aaa9b9d18 
     health: HEALTH_ERR 
             mds ranks 0,1 have failed 
             mds cluster is degraded 
 
   services: 
     mon: 3 daemons, quorum a,b,c 
     mgr: x(active) 
     mds: 0/2/2 up, 2 up:standby, 2 failed 
     osd: 3 osds: 3 up, 3 in 
 
   data: 
     pools:     2 pools, 16 pgs 
     objects: 39 objects, 3558 bytes 
     usage:     3265 MB used, 27646 MB / 30911 MB avail 
     pgs:       16 active+clean 
 </pre> 

 This leaves the journal unflushed and client sessions half-open. In addition, `ceph status` shows alarming notices about "failed" MDSs along with unhelpful health warnings. 

 I would recommend several changes: 

 * `ceph mds deactivate` is renamed to `ceph mds rejoin` to make it clear that an MDS is leaving and then rejoining the metadata cluster. It is not "deactivating", which to me means shutting down gently and not coming back. Running rejoin on a rank when doing so will not shrink the cluster (because max_mds has not changed) is now okay. 
 * `ceph fs set <fs_name> cluster_down` renamed to `ceph fs set <fs_name> joinable` (behavior stays the same). 
 * New `ceph fs set <fs_name> down true/false` to bring the cluster down. This will cause the MDSMonitor to start stopping ranks (what deactivate does) beginning at the highest rank. Only one rank is stopped at a time. This implicitly means the cluster is not joinable. 
 * Deprecate `ceph fs set <fs_name> max_mds <max_mds>`. New `ceph fs set <fs_name> ranks <max_mds>` sets max_mds and begins stopping ranks with the same logic as `ceph fs set <fs_name> down`. This will rely on a new internal MDSMap bool field "shrink_to_fit". (Probably setting max_mds through the deprecated command unsets shrink_to_fit. Setting <ranks> will turn it on. The user cannot change this field explicitly.)
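
To make the proposed semantics concrete, here is a rough toy model of how `down`, `ranks`, and "shrink_to_fit" might interact. It is only a sketch of the proposal, not MDSMonitor code: `FsModel`, `set_down`, `set_ranks`, `set_max_mds`, and `tick` are hypothetical names invented for illustration, and the model assumes the monitor stops at most one rank per pass, highest rank first.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class FsModel:
    """Toy stand-in for the MDSMap fields discussed above (hypothetical, not the real MDSMap)."""
    max_mds: int = 2
    active_ranks: List[int] = field(default_factory=lambda: [0, 1])
    joinable: bool = True
    down: bool = False
    shrink_to_fit: bool = False


def set_down(fs: FsModel, value: bool) -> None:
    """Models the proposed `ceph fs set <fs_name> down true/false`."""
    fs.down = value
    if value:
        # Per the proposal, a down cluster is implicitly not joinable.
        fs.joinable = False
    # Assumption: setting down=false leaves joinable untouched; the proposal does not say.


def set_ranks(fs: FsModel, n: int) -> None:
    """Models the proposed `ceph fs set <fs_name> ranks <max_mds>`."""
    fs.max_mds = n
    fs.shrink_to_fit = True


def set_max_mds(fs: FsModel, n: int) -> None:
    """Models the deprecated `max_mds` setter, which (probably) unsets shrink_to_fit."""
    fs.max_mds = n
    fs.shrink_to_fit = False


def tick(fs: FsModel) -> Optional[int]:
    """One monitor pass: stop at most one rank, highest first. Returns the stopped rank or None."""
    if fs.down:
        target = 0
    elif fs.shrink_to_fit:
        target = fs.max_mds
    else:
        return None
    if len(fs.active_ranks) > target:
        rank = max(fs.active_ranks)
        fs.active_ranks.remove(rank)
        return rank
    return None
```

In this model, calling `set_down(fs, True)` on a two-rank file system and then calling `tick(fs)` repeatedly stops rank 1, then rank 0, and then does nothing further, whereas changing max_mds through the deprecated setter never stops a rank on its own.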
