MDS incarnation gets lost after removing a filesystem
If we remove a filesystem and then create a new filesystem on the old data/metadata pools, OSDs may drop requests from the new filesystem's MDS because they treat those requests as duplicates.
A related issue is that MDSes in different filesystems can have the same client ID. This is a little weird.
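To make the failure mode concrete, here is a minimal toy model of duplicate detection. It is only a sketch: `ReqId`, `ToyOsd`, and `new_fs_request_is_dropped` are hypothetical names invented for illustration, not Ceph's actual types. It also assumes that a request is identified by the sender's name, its incarnation, and a per-sender transaction id, and that the new filesystem's MDS restarts both counters.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// Toy request id (hypothetical): sender name, incarnation, transaction id.
struct ReqId {
  std::string name;
  int32_t inc;
  uint64_t tid;
  bool operator<(const ReqId& o) const {
    return std::tie(name, inc, tid) < std::tie(o.name, o.inc, o.tid);
  }
};

// Toy OSD (hypothetical): remembers every request id already applied to a pool.
class ToyOsd {
 public:
  // Returns true if the request is applied, false if it is dropped as a duplicate.
  bool submit(const ReqId& id) { return seen_.insert(id).second; }

 private:
  std::set<ReqId> seen_;
};

// Assumed scenario: the old filesystem's MDS writes to the pool, the
// filesystem is removed, and a new filesystem reuses the same pool with an
// MDS whose incarnation and tid start over from the same values.
inline bool new_fs_request_is_dropped() {
  ToyOsd osd;
  bool applied = osd.submit({"mds.0", 1, 1});  // old filesystem's MDS
  assert(applied);
  return !osd.submit({"mds.0", 1, 1});         // new filesystem's MDS, same id
}
```

In this model the second request is dropped even though it comes from a different filesystem, because nothing in the id distinguishes the two senders. A request carrying a different incarnation would be accepted.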
#1 Updated by John Spray almost 3 years ago
Here's a reproducer for the incarnation issue:
I note that when OSDs construct their objecters they just use the OSDMap epoch as the incarnation. I wonder why the MDS has a per-rank incarnation counter at all? Perhaps we can just remove it and use the MDSMap epoch instead.
#2 Updated by Greg Farnum almost 3 years ago
- MDSes A and B come up during the same epoch
- A becomes active and B becomes standby
- A fails
- B starts replaying operations
Then B needs to have an incarnation which differs from A's. The OSDs are each a distinct daemon entity (unlike the MDSes).
There may be ways to simplify it, though!
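The scenario above can be sketched with a toy model. This assumes a hypothetical policy where each daemon takes the map epoch at which it came up as its incarnation. `ToyMds`, `incarnation_from_boot_epoch`, and `incarnations_clash` are illustrative names, not Ceph code.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Toy MDS daemon (hypothetical): its name and the map epoch at which it came up.
struct ToyMds {
  std::string daemon;
  uint32_t boot_epoch;
};

// Assumed policy under discussion: incarnation = epoch at which the daemon came up.
inline uint32_t incarnation_from_boot_epoch(const ToyMds& m) {
  return m.boot_epoch;
}

// Both daemons would act as the same rank, so equal incarnations make the
// successor's requests indistinguishable from the failed daemon's.
inline bool incarnations_clash(const ToyMds& a, const ToyMds& b) {
  return incarnation_from_boot_epoch(a) == incarnation_from_boot_epoch(b);
}

// A and B come up during the same epoch; B later takes over A's rank.
inline void demo_same_epoch_boot() {
  ToyMds a{"A", 10};
  ToyMds b{"B", 10};
  assert(incarnations_clash(a, b));
}
```

Under that policy A and B end up with the same incarnation, which is the case a per-rank counter avoids.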
#4 Updated by John Spray almost 3 years ago
We only call objecter->set_client_incarnation(incarnation); in MDSRank::init (after we've been assigned an active role).
So the epoch should be sufficient (it's always incremented when a rank assignment changes), as long as we remember to set it when a standby-replay MDS is promoted as well as when MDSRank is initialized.
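A rough sketch of that proposal follows. It assumes a toy map whose epoch is bumped on every rank assignment. `ToyMdsMap`, `ToyObjecter`, and `successor_gets_new_incarnation` are hypothetical stand-ins; only the method name `set_client_incarnation` is borrowed from the call mentioned above, and this is not the real Objecter.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Toy MDS map (hypothetical): every change of a rank's holder bumps the epoch.
class ToyMdsMap {
 public:
  // Assigns `rank` to `daemon` and returns the epoch of that assignment.
  uint32_t assign_rank(int rank, const std::string& daemon) {
    holder_[rank] = daemon;
    return ++epoch_;
  }
  uint32_t epoch() const { return epoch_; }

 private:
  uint32_t epoch_ = 0;
  std::map<int, std::string> holder_;
};

// Toy stand-in for the objecter: only stores the incarnation it was given.
struct ToyObjecter {
  uint32_t incarnation = 0;
  void set_client_incarnation(uint32_t inc) { incarnation = inc; }
};

// A is initialized as rank 0, fails, and standby-replay B is promoted.
// The incarnation is set at both points, as the proposal requires.
inline bool successor_gets_new_incarnation() {
  ToyMdsMap mdsmap;
  ToyObjecter a, b;
  a.set_client_incarnation(mdsmap.assign_rank(0, "A"));  // rank initialized
  b.set_client_incarnation(mdsmap.assign_rank(0, "B"));  // standby promoted
  assert(b.incarnation > a.incarnation);
  return a.incarnation != b.incarnation;
}
```

In this model the two holders of rank 0 get different incarnations even if both daemons came up during the same epoch, because the value is taken at assignment time. If the promotion path skipped the `set_client_incarnation` call, B would keep whatever value it had before.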