Fix #11590
closed
MDSMonitor: handle MDSBeacon messages properly
100%
Description
We discovered while investigating #11481 that the MDSMonitor simply does not handle MDSBeacon messages appropriately. It's supposed to send back an MMDSBeacon message in response to every one it receives, but in fact it only sends back responses to those that are ignored! (This is in preprocess_beacon)
The other paths generally stick the incoming MMDSBeacon inside of a C_Updated context that waits until a map commit has happened, but this context doesn't do anything useful with them. :(
This is made significantly worse because beacon messages have to go to the leader, and so are often forwarded. On an election, the peon will then forward the message to the leader again, and if the leader accepts it then we can do some horrible warps back in time. (That said, these outdated beacons ought to be rejected based on the seq and info.state_seq values, but it seems they aren't always, as in the referenced ticket.)
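To make the reply asymmetry described above concrete, here is a toy Python model. It is not Ceph code: `ToyMonitor`, `handle_beacon`, `commit`, and the `reply_after_commit` flag are names invented for this sketch. It assumes just two paths, an "ignored" path that replies immediately and an "update" path that parks the beacon until a map commit, and shows that in the buggy variant the sender of a state-changing beacon never hears back.

```python
from dataclasses import dataclass, field


@dataclass
class Beacon:
    gid: int
    seq: int
    state: str


@dataclass
class ToyMonitor:
    """Toy stand-in for a monitor; hypothetical, not the real MDSMonitor."""
    reply_after_commit: bool
    known_state: dict = field(default_factory=dict)  # gid -> last recorded state
    waiting: list = field(default_factory=list)      # beacons parked behind a commit
    replies: list = field(default_factory=list)      # (gid, seq) replies sent

    def handle_beacon(self, b: Beacon) -> None:
        if self.known_state.get(b.gid) == b.state:
            # Nothing to change: the "ignored" path, which replies at once.
            self.replies.append((b.gid, b.seq))
        else:
            # State change: record it and park the beacon until the commit.
            self.known_state[b.gid] = b.state
            self.waiting.append(b)

    def commit(self) -> None:
        # Plays the role of the completion context run after a map commit.
        for b in self.waiting:
            if self.reply_after_commit:
                self.replies.append((b.gid, b.seq))
        self.waiting.clear()


def run(reply_after_commit: bool) -> list:
    mon = ToyMonitor(reply_after_commit)
    mon.handle_beacon(Beacon(4111, 1, "up:boot"))  # changes state, waits on commit
    mon.commit()
    mon.handle_beacon(Beacon(4111, 2, "up:boot"))  # no change, ignored path
    return mon.replies
```

With `run(False)` only the second, ignored beacon gets a reply; with `run(True)` both do, which is the behavior the description says the monitor is supposed to have.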
Updated by Greg Farnum almost 9 years ago
This might also be the cause of the very small number of leaked messages I think Sam or Joao mentioned to me.
Updated by Kefu Chai almost 9 years ago
> That said, these outdated beacons ought to be rejected based on the seq and info.state_seq values, but it seems they aren't always, as in the referenced ticket.
After the old MDS died, a new standby MDS replaced it, so MDSMonitor erased all the history related to the old gid (e.g. pending_mdsmap.mds_info, last_beacon) when it did its housekeeping. That's why, after 4123 replaced 4111, MDSMonitor forgot everything about 4111, including the last message seq# from it. Meanwhile the peon monitor kept tirelessly resending the MDSBeacon message (mdsbeacon(4111/a-s up:boot seq 1 v0)) from the dead MDS. This outdated message misled the lead mon and eventually brought down the innocent new MDS.
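A rough Python sketch of why the seq check stops helping once the history is gone. This is a toy model only: `ToyLeader`, `accept`, and `forget` are hypothetical names, and the single `last_seq` table stands in loosely for the per-gid bookkeeping mentioned above. It assumes a beacon is rejected when its seq is not newer than the last one seen for that gid.

```python
class ToyLeader:
    """Toy model of per-gid beacon bookkeeping; not the real MDSMonitor."""

    def __init__(self):
        self.last_seq = {}  # gid -> highest beacon seq accepted so far

    def accept(self, gid: int, seq: int) -> bool:
        # Reject anything that is not newer than what we already saw.
        if seq <= self.last_seq.get(gid, 0):
            return False
        self.last_seq[gid] = seq
        return True

    def forget(self, gid: int) -> None:
        # Housekeeping after the daemon is gone: drop all history for the gid.
        self.last_seq.pop(gid, None)


def replay_after_cleanup() -> tuple:
    leader = ToyLeader()
    leader.accept(4111, 1)           # original boot beacon
    leader.accept(4111, 2)           # a later beacon from the same gid
    before = leader.accept(4111, 1)  # replayed copy while history still exists
    leader.forget(4111)              # 4123 takes over, 4111 is erased
    after = leader.accept(4111, 1)   # same replayed copy, now looks brand new
    return before, after
```

In this model `replay_after_cleanup()` returns `(False, True)`: the replayed seq 1 beacon is rejected while the history exists and accepted once it has been erased.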
Updated by Greg Farnum almost 9 years ago
D'oh, well spotted. We could maybe check for matching entity_inst_t when handling beacons, but that gets complicated with some of our naming and takeover logic.
For now just replying to beacons and stopping the retransmission is fine.
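As a sketch of why a reply stops the retransmission, here is a toy peon in Python. It is hypothetical throughout (`ToyPeon`, `forward`, `on_reply`, and `on_election` are made-up names, not the monitor's forwarding code) and assumes the peon keeps every forwarded beacon until a reply arrives, then resends whatever is left after an election.

```python
class ToyPeon:
    """Toy peon that forwards beacons to the leader; not real Ceph code."""

    def __init__(self):
        # (gid, seq) -> beacon text for forwards that have no reply yet
        self.pending = {}

    def forward(self, gid: int, seq: int, text: str) -> None:
        self.pending[(gid, seq)] = text

    def on_reply(self, gid: int, seq: int) -> None:
        # A reply from the leader means this forward is finished.
        self.pending.pop((gid, seq), None)

    def on_election(self) -> list:
        # After an election, resend whatever is still unanswered.
        return list(self.pending.values())


def resent_after_election(leader_replies: bool) -> list:
    peon = ToyPeon()
    peon.forward(4111, 1, "mdsbeacon(4111/a-s up:boot seq 1 v0)")
    if leader_replies:
        peon.on_reply(4111, 1)
    return peon.on_election()
```

If the leader replies, nothing is left to resend after the election; if it never replies, the old boot beacon goes out again each time.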
Updated by Kefu Chai almost 9 years ago
- Status changed from New to Fix Under Review
Updated by Kefu Chai almost 9 years ago
See the discussion on https://github.com/ceph/ceph/pull/4702
Per Greg:
> So we need some kind of state message to tell peons to drop forwarded messages because they've been received and aren't getting a response.
So we cannot reply to MDSBeacons that are ignored by the leader.
Updated by Kefu Chai almost 9 years ago
- % Done changed from 0 to 60
the 2nd pull request for this issue
Updated by Greg Farnum almost 9 years ago
- Status changed from Fix Under Review to Resolved
Updated by Kefu Chai almost 9 years ago
- Status changed from Resolved to Pending Backport
- Backport set to firefly, hammer
Updated by Nathan Cutler over 8 years ago
- Status changed from Pending Backport to Resolved
- % Done changed from 60 to 100