Bug #56666

mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon

Added by Patrick Donnelly 4 months ago. Updated 3 months ago.

Status: Pending Backport
Priority: Urgent
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags: backport_processed
Backport: quincy,pacific
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If a standby-replay daemon's beacon reaches MDSMonitor::prepare_beacon (which happens rarely), the monitors automatically remove the daemon:

2022-07-21T20:10:11.114+0000 7fdd8d195700  7 mon.a@0(leader).mds e10 prepare_update mdsbeacon(4232/d up:standby-replay seq=30 v10) v8
2022-07-21T20:10:11.114+0000 7fdd8d195700 10 mon.a@0(leader).mds e10 MDS health message (mds.?): HEALTH_ERR Metadata damage detected
2022-07-21T20:10:11.114+0000 7fdd8d195700  4 mon.a@0(leader).mds e10 mds_beacon MDS can't go back into standby after taking rank: held rank 0 while requesting state up:standby-replay
2022-07-21T20:10:11.114+0000 7fdd8d195700  1 mon.a@0(leader).mds e10 fail_mds_gid 4232 mds.d role 0

This log was captured with a synthetic health warning injected into the beacon.

The broken code is:

https://github.com/ceph/ceph/blob/44e4999bf19a5fd0b2e80490bc74b6bdfa857655/src/mon/MDSMonitor.cc#L701-L702

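A standby-replay daemon always holds a rank, so this check is wrong: it treats every up:standby-replay beacon from such a daemon as an attempt to go back into standby after taking a rank, and the monitor fails the daemon.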
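
To make the failure mode concrete, here is a minimal sketch of the logic as described in this report. It is a simplified, hypothetical model and not the code in MDSMonitor.cc: the State enum, DaemonInfo, kRankNone, and both function names are invented for illustration, and the second predicate is only one possible way to express the intent, not necessarily what the fix does.

```cpp
#include <cstdint>

// Hypothetical, simplified model for illustration only.
// None of these names are Ceph's MDSMap/MDSMonitor API.
enum class State { Standby, StandbyReplay, Active };

constexpr std::int32_t kRankNone = -1;  // stand-in for "holds no rank"

struct DaemonInfo {
  std::int32_t rank;  // rank currently recorded for the daemon
  State state;        // state currently recorded for the daemon
};

// Behaviour described in the report: any daemon that holds a rank and sends
// a beacon in a standby-like state is rejected. A standby-replay daemon
// always holds a rank, so its ordinary beacons are always rejected.
constexpr bool rejects_beacon_as_reported(const DaemonInfo& info, State requested) {
  return info.rank != kRankNone &&
         (requested == State::Standby || requested == State::StandbyReplay);
}

// One possible correction (an assumption, not the merged change): reject only
// a real transition back into a standby-like state, and accept a beacon that
// merely repeats the daemon's current up:standby-replay state.
constexpr bool rejects_beacon_sketch(const DaemonInfo& info, State requested) {
  return info.rank != kRankNone &&
         (requested == State::Standby ||
          (requested == State::StandbyReplay &&
           info.state != State::StandbyReplay));
}

// The scenario from the log: rank 0, already standby-replay, beacon repeats it.
static_assert(rejects_beacon_as_reported(DaemonInfo{0, State::StandbyReplay},
                                         State::StandbyReplay),
              "reported behaviour: standby-replay beacon is rejected");
static_assert(!rejects_beacon_sketch(DaemonInfo{0, State::StandbyReplay},
                                     State::StandbyReplay),
              "sketch: standby-replay beacon is accepted");
```

In this model an active daemon with a rank that asks for up:standby or up:standby-replay is still rejected by both predicates, which is the case the original check appears to be guarding against.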


Related issues

Copied to CephFS - Backport #56712: pacific: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon (Resolved)
Copied to CephFS - Backport #56713: quincy: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon (Resolved)

History

#1 Updated by Patrick Donnelly 4 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 47218

#2 Updated by Patrick Donnelly 4 months ago

  • Status changed from Fix Under Review to Pending Backport

#3 Updated by Backport Bot 4 months ago

  • Copied to Backport #56712: pacific: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon added

#4 Updated by Backport Bot 4 months ago

  • Copied to Backport #56713: quincy: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon added

#5 Updated by Backport Bot 4 months ago

  • Tags set to backport_processed
