Bug #53811
standby-replay mds is removed from MDSMap unexpectedly
Status:
Pending Backport
Priority:
Normal
Assignee:
-
Category:
Correctness/Safety
Target version:
% Done:
0%
Source:
Tags:
backport_processed
Backport:
quincy,pacific
Regression:
Yes
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
MDSMonitor
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
In `MDSMonitor::prepare_beacon`:
...
} else if ((state == MDSMap::STATE_STANDBY || state == MDSMap::STATE_STANDBY_REPLAY)
           && info.rank != MDS_RANK_NONE)
{
  dout(4) << "mds_beacon MDS can't go back into standby after taking rank: "
             "held rank " << info.rank << " while requesting state "
          << ceph_mds_state_name(state) << dendl;
  goto evict;
}
This would evict a standby-replay MDS unexpectedly, since a standby-replay daemon also holds a rank.
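To make the reasoning concrete, here is a minimal, self-contained model of the quoted condition. It is only a sketch: `State`, `kRankNone`, `evicts_as_quoted`, and `evicts_if_narrowed` are hypothetical stand-ins rather than Ceph identifiers, and the narrowed predicate is one possible way to avoid the problem, not necessarily what the linked pull request does.

```cpp
// Simplified model of the beacon check quoted above; NOT the actual Ceph code.
#include <cstdint>

namespace sketch {

// Hypothetical stand-ins for the MDSMap state constants and MDS_RANK_NONE.
enum class State { Standby, StandbyReplay, Active };
constexpr std::int32_t kRankNone = -1;

// Mirrors the quoted condition: a daemon that holds a rank and reports
// standby OR standby-replay is sent to the evict path.
constexpr bool evicts_as_quoted(State s, std::int32_t rank) {
  return (s == State::Standby || s == State::StandbyReplay) && rank != kRankNone;
}

// One possible narrowing (an assumption for illustration): only plain
// standby is treated as incompatible with holding a rank.
constexpr bool evicts_if_narrowed(State s, std::int32_t rank) {
  return s == State::Standby && rank != kRankNone;
}

// A standby-replay daemon following a rank (rank 1 chosen for illustration).
static_assert(evicts_as_quoted(State::StandbyReplay, 1),
              "quoted check evicts a standby-replay daemon that holds a rank");
static_assert(!evicts_if_narrowed(State::StandbyReplay, 1),
              "narrowed check leaves standby-replay alone");
static_assert(evicts_if_narrowed(State::Standby, 1),
              "narrowed check still rejects standby with a rank");

}  // namespace sketch
```

The compile-time assertions show the asymmetry: with the quoted condition, a standby-replay daemon with a rank reaches the evict path, while a check restricted to plain standby would not touch it.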
Updated by Venky Shankar over 2 years ago
- Category set to Correctness/Safety
- Status changed from New to Fix Under Review
- Target version set to v17.0.0
- Backport set to pacific,octopus
- Pull request ID set to 44501
Updated by Patrick Donnelly over 2 years ago
I think you probably found this when the standby-replay daemon was "laggy" and then came back, yes?
Updated by 玮文 胡 over 2 years ago
Patrick Donnelly wrote:
I think you probably found this when the standby-replay daemon was "laggy" and then came back, yes?
No, I read the following log line from monitor:
mon.gpu024@0(leader).mds e86079 fail_mds_gid 8198787 mds.cephfs.gpu023.aetiph role 1
And this from the removed MDS:
mds.cephfs.gpu023.aetiph Updating MDS map to version 86080 from mon.2
mds.cephfs.gpu023.aetiph Map removed me [mds.cephfs.gpu023.aetiph{1:8198787} state up:standby-replay seq 1 join_fscid=2 addr [REMOVED IPs] compat {c=[1],r=[1],i=[7ff]}] from cluster; respawning! See cluster/monitor logs for details.
Updated by Venky Shankar over 1 year ago
- Status changed from Fix Under Review to Pending Backport
- Target version changed from v17.0.0 to v18.0.0
Updated by Backport Bot over 1 year ago
- Copied to Backport #57261: pacific: standby-replay mds is removed from MDSMap unexpectedly added
Updated by Backport Bot over 1 year ago
- Copied to Backport #57262: octopus: standby-replay mds is removed from MDSMap unexpectedly added
Updated by Patrick Donnelly over 1 year ago
- Tags deleted (backport_processed)
- Backport changed from pacific,octopus to quincy,pacific
Updated by Backport Bot over 1 year ago
- Copied to Backport #57370: quincy: standby-replay mds is removed from MDSMap unexpectedly added