Bug #23567 (closed): MDSMonitor: successive changes to max_mds can allow hole in ranks
Status: Resolved
Priority: High
Assignee:
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
With 3 MDS daemons, approximately this sequence:
ceph fs set cephfs max_mds 2
watch ceph status
ceph fs set cephfs max_mds 1
ceph mds deactivate 1
ceph fs set cephfs max_mds 3
watch ceph status
ceph fs set cephfs max_mds 2
can result in:
  cluster:
    id:     b602e734-2a9a-4a4a-93c0-64018b991968
    health: HEALTH_OK

  services:
    mon: 3 daemons, quorum li638-46,li889-56,li432-208
    mgr: li569-87(active), standbys: li227-22
    mds: cephfs-2/2/2 up {0=li89-229=up:active,2=li896-69=up:active}, 1 up:standby
    osd: 32 osds: 32 up, 32 in

  data:
    pools:   2 pools, 640 pgs
    objects: 57 objects, 6490 bytes
    usage:   6984 MB used, 497 GB / 504 GB avail
    pgs:     640 active+clean
Note the hole in the ranks: ranks 0 and 2 are active while rank 1 is gone. We should not resize the MDS cluster while any rank is stopping.
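As a rough sketch of that guard (illustrative only; the types and function below are hypothetical stand-ins, not the actual MDSMonitor code), a max_mds change would be rejected while any rank is still in the stopping state:

// Hypothetical illustration -- not Ceph's MDSMonitor types or API.
#include <cstdint>
#include <iostream>
#include <map>
#include <string>

enum class RankState { Active, Stopping, Standby };

struct FsState {
  std::map<uint32_t, RankState> ranks;  // rank number -> current state
  uint32_t max_mds = 1;
};

// Reject a max_mds change while any rank is still stopping, so the cluster
// is never resized around a rank that has not finished shutting down.
bool set_max_mds(FsState& fs, uint32_t new_max, std::string& err) {
  for (const auto& [rank, state] : fs.ranks) {
    if (state == RankState::Stopping) {
      err = "cannot change max_mds: rank " + std::to_string(rank) +
            " is still stopping";
      return false;
    }
  }
  fs.max_mds = new_max;
  return true;
}

int main() {
  FsState fs;
  fs.max_mds = 2;
  fs.ranks = {{0, RankState::Active}, {1, RankState::Stopping}};

  std::string err;
  if (!set_max_mds(fs, 3, err))
    std::cout << err << '\n';  // the resize is refused while rank 1 is stopping
}

With a check along these lines, the max_mds 3 request in the reproduction above would fail until rank 1 finishes stopping, so no hole in the ranks can be created.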
Updated by Douglas Fuller about 6 years ago
- Status changed from New to Need More Info
- Assignee changed from Douglas Fuller to Patrick Donnelly
Was this before or after https://github.com/ceph/ceph/pull/16608 ?
Updated by Patrick Donnelly about 6 years ago
Doug, I tested with master but I believe it also happened with your PR. I can't remember.
Updated by Patrick Donnelly about 6 years ago
- Status changed from Need More Info to Fix Under Review
- Backport deleted (luminous)
QA and fix here: https://github.com/ceph/ceph/pull/16608
Updated by Patrick Donnelly almost 6 years ago
- Status changed from Fix Under Review to Resolved
- Start date deleted (04/05/2018)