Bug #52565

closed

MDSMonitor: handle damaged state from standby-replay

Added by Patrick Donnelly over 2 years ago. Updated over 2 years ago.

Status: Resolved
Priority: Urgent
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport: pacific
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS): crash
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After the addition of join_fscid, the state change validation in this code:

https://github.com/ceph/ceph/blob/ac460ffdd57fcb3e0df3ae638d12b15cfaa9cb48/src/mon/MDSMonitor.cc#L477-L482

may not run. As a result, a standby-replay daemon can mark the rank damaged, and the code does not handle that case properly. The result is an assertion failure in the sanity checks when encoding the pending map: the rank is in both "up" and "damaged".
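
To make the failing invariant concrete, here is a minimal sketch of the "up" versus "damaged" conflict. It is not the Ceph code and not the fix from the pull request: `ToyMdsMap`, `ranks_consistent`, `sanity`, `mark_damaged_naive` and `mark_damaged` are hypothetical names invented for illustration, and the real map tracks far more state.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>

// Hypothetical, heavily simplified stand-in for the per-filesystem MDS map.
struct ToyMdsMap {
  std::map<int, std::string> up;  // rank -> name of the daemon holding it
  std::set<int> damaged;          // ranks that have been marked damaged
};

// True when no rank appears in both "up" and "damaged".
bool ranks_consistent(const ToyMdsMap& m) {
  for (int rank : m.damaged) {
    if (m.up.count(rank) > 0) {
      return false;
    }
  }
  return true;
}

// Sanity check in the spirit of the one described above: it aborts
// if a rank is both up and damaged.
void sanity(const ToyMdsMap& m) {
  assert(ranks_consistent(m));
}

// Problematic path (assumed shape): the rank is recorded as damaged,
// but the daemon that still holds the rank is left in "up".
void mark_damaged_naive(ToyMdsMap& m, int rank) {
  m.damaged.insert(rank);
}

// Consistent path (assumed shape): the rank leaves "up" before it is
// recorded as damaged, so the invariant holds.
void mark_damaged(ToyMdsMap& m, int rank) {
  m.up.erase(rank);
  m.damaged.insert(rank);
}
```

In this sketch, calling `mark_damaged_naive` on a rank that is still in `up` makes a later `sanity` call abort, which mirrors the assertion described in this report. `mark_damaged` keeps the two sets disjoint.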


Related issues 1 (0 open, 1 closed)

Copied to CephFS - Backport #52639: pacific: MDSMonitor: handle damaged state from standby-replay (Resolved)

#1

Updated by Patrick Donnelly over 2 years ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 43122

#2

Updated by Patrick Donnelly over 2 years ago

  • Status changed from Fix Under Review to Pending Backport

#3

Updated by Backport Bot over 2 years ago

  • Copied to Backport #52639: pacific: MDSMonitor: handle damaged state from standby-replay added

#4

Updated by Loïc Dachary over 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
