Bug #52874
closed
Monitor might crash after upgrade from ceph to 16.2.6
Description
The following assertion might be triggered:
void FSMap::sanity() const
{
  ...
  if (info.state != MDSMap::STATE_STANDBY_REPLAY) {
    ...
  } else {
    ceph_assert(fs->mds_map.allows_standby_replay());
  }
  ...
}
This happens when the allow_standby_replay flag is set to false but some MDS daemons are still running in standby-replay mode.
Prior to Pacific, setting the flag to false did not force MDS daemons out of standby-replay mode.
Hence one might put the cluster (and the relevant MDS map) into an inconsistent state, which triggers the monitor assertion on upgrade.
Nor does the upgrade manual require disabling standby-replay MDS daemons PRIOR to the monitor upgrade. According to the manual, monitors are upgraded at step 2, while the actions on MDS daemons come at step 5:
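To make the inconsistent state concrete, here is a minimal, self-contained sketch of the condition the assertion rejects. The types `MdsState`, `MdsInfo`, `Filesystem` and the helper `standby_replay_violations` are hypothetical simplifications for illustration and are not the actual Ceph classes; only the logic (a daemon in standby-replay while the file system's flag is cleared) is taken from the snippet above.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the monitor's data structures;
// these are NOT the real Ceph types.
enum class MdsState { Active, Standby, StandbyReplay };

struct MdsInfo {
  std::string name;
  MdsState state;
};

struct Filesystem {
  std::string name;
  bool allows_standby_replay;  // models the allow_standby_replay flag
  std::vector<MdsInfo> daemons;
};

// Returns "<fs>/<daemon>" for every daemon that is in standby-replay while
// its file system has the flag cleared, i.e. the state the assertion rejects.
std::vector<std::string> standby_replay_violations(
    const std::vector<Filesystem>& fsmap) {
  std::vector<std::string> bad;
  for (const auto& fs : fsmap) {
    for (const auto& info : fs.daemons) {
      if (info.state == MdsState::StandbyReplay && !fs.allows_standby_replay) {
        bad.push_back(fs.name + "/" + info.name);
      }
    }
  }
  return bad;
}

// Mirrors the shape of the check: abort if any violation exists.
void sanity(const std::vector<Filesystem>& fsmap) {
  assert(standby_replay_violations(fsmap).empty());
}
```

In this model, a file system with the flag set to false and one daemon still in `StandbyReplay` yields a single violation, and calling `sanity` on it would abort, which corresponds to the monitor crash described here.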
2. Upgrade monitors by installing the new packages and restarting the monitor daemons. For example, on each monitor host:
...
5. Upgrade all CephFS MDS daemons. For each CephFS file system,
1. Disable standby_replay: