Bug #56666

closed

mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon

Added by Patrick Donnelly over 1 year ago. Updated about 1 year ago.

Status: Resolved
Priority: Urgent
Category: Correctness/Safety
Target version:
% Done: 0%
Source: Development
Tags: backport_processed
Backport: quincy,pacific
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If a standby-replay daemon's beacon reaches MDSMonitor::prepare_beacon (which happens rarely), the monitors automatically remove the daemon:

2022-07-21T20:10:11.114+0000 7fdd8d195700  7 mon.a@0(leader).mds e10 prepare_update mdsbeacon(4232/d up:standby-replay seq=30 v10) v8
2022-07-21T20:10:11.114+0000 7fdd8d195700 10 mon.a@0(leader).mds e10 MDS health message (mds.?): HEALTH_ERR Metadata damage detected
2022-07-21T20:10:11.114+0000 7fdd8d195700  4 mon.a@0(leader).mds e10 mds_beacon MDS can't go back into standby after taking rank: held rank 0 while requesting state up:standby-replay
2022-07-21T20:10:11.114+0000 7fdd8d195700  1 mon.a@0(leader).mds e10 fail_mds_gid 4232 mds.d role 0

This log was captured with a synthetic health warning injected into the beacon.

The broken code is:

https://github.com/ceph/ceph/blob/44e4999bf19a5fd0b2e80490bc74b6bdfa857655/src/mon/MDSMonitor.cc#L701-L702

A standby-replay daemon always holds a rank, so this check is wrong.
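
The failure mode can be illustrated with a small standalone model. This is only a sketch, not the code in MDSMonitor.cc: the types `State` and `DaemonInfo`, the constant `RANK_NONE`, and both functions are hypothetical names made up for illustration. The first function models the behaviour implied by the log message ("held rank 0 while requesting state up:standby-replay"); the second shows one possible adjustment, which is an assumption and not necessarily what the merged fix does.

```cpp
#include <cstdint>

// Hypothetical, simplified stand-ins for the monitor's view of an MDS daemon.
enum class State { Standby, StandbyReplay, Active };

constexpr std::int32_t RANK_NONE = -1;  // hypothetical "no rank held" marker

struct DaemonInfo {
  std::int32_t rank;  // RANK_NONE when the daemon holds no rank
  State state;        // state currently recorded for the daemon
};

// Models the check as the log message describes it: any daemon that holds a
// rank and sends a beacon requesting a standby-type state is failed.
bool should_fail_as_reported(const DaemonInfo& info, State requested) {
  const bool requests_standby =
      requested == State::Standby || requested == State::StandbyReplay;
  return info.rank != RANK_NONE && requests_standby;
}

// One possible adjustment (an assumption): a daemon that is already in
// standby-replay legitimately holds a rank, so a beacon that merely repeats
// that state is not treated as "going back into standby".
bool should_fail_adjusted(const DaemonInfo& info, State requested) {
  if (info.rank == RANK_NONE) {
    return false;  // no rank held, nothing to give up
  }
  if (requested == State::Standby) {
    return true;  // plain standby never holds a rank
  }
  return requested == State::StandbyReplay &&
         info.state != State::StandbyReplay;
}
```

With a daemon modelled as `DaemonInfo{0, State::StandbyReplay}`, the first function reports a failure for a `State::StandbyReplay` beacon, which matches the `fail_mds_gid` line in the log, while the adjusted version lets the beacon through.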


Related issues 2 (0 open, 2 closed)

Copied to CephFS - Backport #56712: pacific: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon (Resolved, Patrick Donnelly)
Copied to CephFS - Backport #56713: quincy: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon (Resolved, Patrick Donnelly)

#1

Updated by Patrick Donnelly over 1 year ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 47218

#2

Updated by Patrick Donnelly over 1 year ago

  • Status changed from Fix Under Review to Pending Backport

#3

Updated by Backport Bot over 1 year ago

  • Copied to Backport #56712: pacific: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon added

#4

Updated by Backport Bot over 1 year ago

  • Copied to Backport #56713: quincy: mds: standby-replay daemon always removed in MDSMonitor::prepare_beacon added

#5

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed

#7

Updated by Patrick Donnelly about 1 year ago

  • Status changed from Pending Backport to Resolved