Bug #49720

mon/MDSMonitor: do not pointlessly kill standbys that are incompatible with current CompatSet

Added by Patrick Donnelly 11 months ago. Updated 5 months ago.

Status: Resolved
Priority: Urgent
Category: Administration/Usability
Target version:
% Done: 0%
Source: Development
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

During a rolling upgrade, standbys may suicide once the CompatSet for the FSMap is updated. This needlessly complicates the rolling upgrade process, because all standby daemons must be stopped before rank 0 is upgraded. We do not need to worry about an incompatible standby taking over a file system: it will still do its compatibility check when promoted to up:replay. That check covers the case where a higher-version MDS is promoted, updates the compatset, and then fails, allowing an older MDS to take over.
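
To make the intended behaviour concrete, here is a minimal sketch in Python. It is a toy model, not Ceph code: a compatset is reduced to a set of incompat feature names, and `can_serve`, `standbys_after_compat_update`, `promote`, and the daemon and feature names are all hypothetical. It only illustrates that a compatset update leaves standbys alone, while the compatibility check still gates promotion to up:replay.

```python
# Toy model only: a "compatset" is a frozenset of incompat feature names.
# None of these functions exist in Ceph; they are hypothetical illustrations.

def can_serve(daemon_features: frozenset, fs_incompat: frozenset) -> bool:
    """A daemon can serve a file system only if it understands every
    incompat feature that the file system requires."""
    return fs_incompat <= daemon_features


def standbys_after_compat_update(standbys: dict, fs_incompat: frozenset) -> dict:
    """Behaviour asked for in this ticket: updating the compatset removes
    no standby, whether or not it is compatible."""
    return dict(standbys)


def promote(daemon_features: frozenset, fs_incompat: frozenset) -> str:
    """The compatibility check is applied at promotion time instead."""
    return "up:replay" if can_serve(daemon_features, fs_incompat) else "refused"


# The file system now requires a feature that the older standby lacks.
fs_incompat = frozenset({"base", "new_feature"})
standbys = {
    "mds.a": frozenset({"base"}),                 # not yet upgraded
    "mds.b": frozenset({"base", "new_feature"}),  # upgraded
}

survivors = standbys_after_compat_update(standbys, fs_incompat)
results = {name: promote(feats, fs_incompat) for name, feats in survivors.items()}
```

In this model both standbys survive the update, and only the older one is turned away if it is ever picked for promotion.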

Also, the compatset of each file system is updated whenever any MDS reports a new compatset. This further complicates the rolling upgrade, because upgrading any MDS will kill rank 0 for all file systems. Instead, only upgrade the compatset of an MDSMap when one of its ranks upgrades.
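
The rule in the last sentence can be sketched the same way. Again this is a hypothetical model, not the MDSMonitor implementation: `next_fs_compat` and its parameters are invented for illustration, and treating an upgrade as a set union of feature names is an assumption.

```python
def next_fs_compat(fs_incompat: frozenset,
                   reporter_features: frozenset,
                   reporter_holds_rank: bool) -> frozenset:
    """Return the file system's compatset after an MDS reports its own.

    Assumption: an "upgrade" is modelled as the union of feature names.
    """
    if not reporter_holds_rank:
        # A standby (or an MDS of another file system) reporting a newer
        # compatset must not change this file system's compatset.
        return fs_incompat
    # Only an MDS holding a rank in this file system upgrades it.
    return fs_incompat | reporter_features


old = frozenset({"base"})
new = frozenset({"base", "new_feature"})

after_standby_report = next_fs_compat(old, new, reporter_holds_rank=False)
after_rank_report = next_fs_compat(old, new, reporter_holds_rank=True)
```

With this rule, upgrading a standby leaves the compatset of every file system unchanged, so rank 0 keeps running until it is upgraded itself.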


Related issues

Blocks CephFS - Feature #41566: mds: support rolling upgrades (In Progress)
Copied to CephFS - Backport #51983: pacific: mon/MDSMonitor: do not pointlessly kill standbys that are incompatible with current CompatSet (Resolved)

History

#1 Updated by Patrick Donnelly 10 months ago

  • Category set to Administration/Usability
  • Status changed from In Progress to Fix Under Review
  • Source set to Development
  • Pull request ID set to 40511

#2 Updated by Patrick Donnelly 10 months ago

#3 Updated by Patrick Donnelly 6 months ago

  • Status changed from Fix Under Review to Pending Backport

#4 Updated by Backport Bot 6 months ago

  • Copied to Backport #51983: pacific: mon/MDSMonitor: do not pointlessly kill standbys that are incompatible with current CompatSet added

#5 Updated by Loïc Dachary 5 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
