Bug #15399
MDS incarnation get lost after remove filesystem
Description
If we remove a filesystem and then create a new filesystem on the old data/metadata pools, the OSDs may drop requests from the new filesystem's MDS because they think the requests are duplicates.
A related issue is that MDSes in different filesystems can have the same client ID, which is a little weird.
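To make the failure mode concrete, here is a minimal sketch of duplicate detection keyed on a request identity. It assumes, as a simplification, that an OSD recognises a request by the tuple (client name, incarnation, transaction id) and remembers the identities it has already applied. `ReqId`, `DupTracker`, and `submit` are hypothetical names used for illustration; this is not Ceph's actual code.

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// Hypothetical, simplified request identity: who sent the request, which
// incarnation of the sender it came from, and the sender's own counter.
struct ReqId {
  std::string client;        // e.g. "mds.0"
  std::int32_t incarnation;  // expected to differ between sender lifetimes
  std::uint64_t tid;         // per-sender transaction id

  bool operator<(const ReqId& o) const {
    return std::tie(client, incarnation, tid) <
           std::tie(o.client, o.incarnation, o.tid);
  }
};

// Hypothetical stand-in for the "already applied" state that survives in
// the pools when a filesystem is removed and recreated on top of them.
class DupTracker {
 public:
  // Returns true if the request is applied, false if it is dropped
  // because the same identity has been seen before.
  bool submit(const ReqId& id) { return seen_.insert(id).second; }

 private:
  std::set<ReqId> seen_;
};
```

In this model, an MDS of the old filesystem submitting `{"mds.0", 1, 1}` is applied. If the new filesystem's MDS starts over with the same client name and incarnation, its first request is also `{"mds.0", 1, 1}` and `submit` returns false, although it is a different request. A request with a fresh incarnation, such as `{"mds.0", 2, 1}`, is accepted.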
Updated by John Spray about 8 years ago
Here's a reproducer for the incarnation issue:
https://github.com/ceph/ceph-qa-suite/tree/wip-15399
I note that when OSDs construct their objecters they just use the OSDMap epoch as the incarnation. I wonder why the MDS has a per-rank incarnation counter at all? Perhaps we can just remove it and use the MDSMap epoch instead.
Updated by Greg Farnum about 8 years ago
- MDSes A and B come up during the same epoch
- A becomes active and B becomes standby
- A fails
- B starts replaying operations
Then B needs to have an incarnation which differs from A's. The OSDs are each a distinct daemon entity (unlike the MDSes).
There may be ways to simplify it, though!
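The scenario in this comment can be sketched as a small model. It assumes, purely for illustration, that a daemon's incarnation is taken from the map epoch in which the daemon came up; `Daemon` and `incarnation_from_boot_epoch` are hypothetical names, not Ceph code.

```cpp
#include <cstdint>

using epoch_t = std::uint32_t;  // assumed stand-in for a map epoch counter

// Hypothetical record of one MDS daemon.
struct Daemon {
  epoch_t boot_epoch;  // map epoch in which the daemon first appeared
};

// Naive policy (an assumption for this sketch): reuse the boot epoch as
// the incarnation the daemon presents once it holds a rank.
inline epoch_t incarnation_from_boot_epoch(const Daemon& d) {
  return d.boot_epoch;
}

// Two successive holders of the same rank are distinguishable only if
// their incarnations differ.
inline bool distinguishable(const Daemon& a, const Daemon& b) {
  return incarnation_from_boot_epoch(a) != incarnation_from_boot_epoch(b);
}
```

With A and B both booting in epoch 10, `distinguishable(a, b)` is false, so B taking over A's rank would present the same incarnation that A used. If the two daemons had booted in different epochs, the naive policy would happen to work.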
Updated by Greg Farnum about 8 years ago
So we could probably reset our network connections with an incarnation based on the last MDSMap where our role changed... I think that should work; maybe it's what you meant.
Updated by John Spray about 8 years ago
We only call objecter->set_client_incarnation(incarnation); in MDSRank::init (after we've been assigned an active role).
So the epoch should be sufficient (it's always incremented when a rank assignment changes), as long as we remember to set it when a standby-replay MDS is promoted (as well as when MDSRank is initialized).
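A minimal sketch of that idea follows: the incarnation is refreshed from the map epoch whenever the daemon's role changes, which covers both the first activation and a standby-replay promotion. `Role`, `RoleTracker`, and `handle_map` are hypothetical names for illustration; the real change would go through objecter->set_client_incarnation and is not reproduced here.

```cpp
#include <cstdint>

using epoch_t = std::uint32_t;  // assumed stand-in for the MDSMap epoch

enum class Role { Standby, StandbyReplay, Active };

// Hypothetical model of a daemon that derives its incarnation from the
// epoch of the map in which its role last changed.
class RoleTracker {
 public:
  // Called for every new map; `role` is this daemon's role in that map.
  void handle_map(epoch_t epoch, Role role) {
    if (role != role_) {
      role_ = role;
      incarnation_ = epoch;  // same path for activation and promotion
    }
  }

  epoch_t incarnation() const { return incarnation_; }

 private:
  Role role_ = Role::Standby;
  epoch_t incarnation_ = 0;
};
```

Replaying the earlier scenario: A becomes active in epoch 10 and keeps incarnation 10. B follows as standby-replay and is promoted in epoch 12 after A fails, so B writes with incarnation 12. A later map in which B's role is unchanged leaves its incarnation at 12. The sketch relies on the statement above that the epoch is always incremented when a rank assignment changes.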
Updated by Zheng Yan about 8 years ago
I think using the MDSMap epoch as the incarnation is a good idea.
Updated by Greg Farnum about 8 years ago
- Status changed from New to In Progress
- Assignee set to John Spray
For Jewel: https://github.com/ceph/ceph/pull/8484
But a more comprehensive one (that works with pools shared between FSes) is still in progress.
Updated by John Spray almost 8 years ago
- Status changed from In Progress to Fix Under Review
Updated by John Spray almost 8 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler almost 8 years ago
- Copied to Backport #15732: jewel: MDS incarnation get lost after remove filesystem added
Updated by Greg Farnum almost 8 years ago
- Status changed from Pending Backport to Resolved