Bug #15399

closed

MDS incarnation get lost after remove filesystem

Added by Zheng Yan about 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If we remove a filesystem and then create a new filesystem on the old data/metadata pools, the OSDs may drop requests from the MDS of the new filesystem, because they treat those requests as duplicates.

A related issue is that MDSes in different filesystems can have the same client ID. This is a little weird.
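
To illustrate why the requests get dropped, here is a minimal sketch of duplicate detection keyed on the requester's identity. It assumes that a request is identified by the triple (entity name, incarnation, transaction id); `ReqId`, `ToyOsd`, and `demo_reused_pools` are hypothetical names made up for this example and are not the actual OSD code.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <tuple>

// Hypothetical request identity: (entity name, incarnation, transaction id).
struct ReqId {
    std::string entity;        // e.g. "mds.0"
    std::int32_t incarnation;  // meant to change whenever the sender restarts
    std::uint64_t tid;         // per-sender transaction counter

    bool operator<(const ReqId& o) const {
        return std::tie(entity, incarnation, tid) <
               std::tie(o.entity, o.incarnation, o.tid);
    }
};

// Toy stand-in for an OSD that remembers which requests it already applied.
class ToyOsd {
public:
    // Returns true if the request is applied, false if it is dropped
    // because an identical request id was seen before.
    bool submit(const ReqId& r) { return seen_.insert(r).second; }

private:
    std::set<ReqId> seen_;
};

// The old filesystem's MDS and the new filesystem's MDS write to the same
// pools. If the incarnation counter is lost and starts over, the ids collide.
inline void demo_reused_pools() {
    ToyOsd osd;
    assert(osd.submit({"mds.0", 1, 1}));   // old filesystem: applied
    assert(!osd.submit({"mds.0", 1, 1}));  // new filesystem, same id: dropped
    assert(osd.submit({"mds.0", 2, 1}));   // a fresh incarnation is accepted
}
```

In this model the only thing that separates the two filesystems' requests is the incarnation, so losing it when the filesystem is removed is enough to make new writes look like replays.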


Related issues 1 (0 open, 1 closed)

Copied to CephFS - Backport #15732: jewel: MDS incarnation get lost after remove filesystem (Resolved, Abhishek Varshney)
Actions #1

Updated by John Spray about 8 years ago

Here's a reproducer for the incarnation issue:
https://github.com/ceph/ceph-qa-suite/tree/wip-15399

I note that when OSDs construct their objecters they just use the OSDMap epoch as the incarnation. I wonder why the MDS has a per-rank incarnation counter at all? Perhaps we can just remove it and use the MDSMap epoch instead.

Actions #2

Updated by Greg Farnum about 8 years ago

If:
  • MDSes A and B come up during the same epoch
  • A becomes active and B becomes standby
  • A fails
  • B starts replaying operations

Then B needs to have an incarnation which differs from A's. The OSDs are each a distinct daemon entity (unlike the MDSes).

There may be ways to simplify it, though!

Actions #3

Updated by Greg Farnum about 8 years ago

So we could probably reset our network connections with an incarnation based on the last MDSMap where our role changed. I think that should work; maybe it's what you meant.
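
A small sketch of the failover scenario and of the suggestion above. It assumes each daemon tracks two epochs, the one in which it came up and the one in which its role last changed; `ToyMds`, both helper functions, and the epoch numbers are hypothetical and only serve to show the collision.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = std::uint32_t;

// Hypothetical per-daemon bookkeeping for this example.
struct ToyMds {
    epoch_t boot_epoch;  // map epoch in which the daemon came up
    epoch_t role_epoch;  // map epoch in which its role last changed
};

// Naive choice: incarnation is the epoch at which the daemon came up.
inline epoch_t incarnation_from_boot(const ToyMds& m) { return m.boot_epoch; }

// Alternative: incarnation is the last epoch where the role changed.
inline epoch_t incarnation_from_role_change(const ToyMds& m) {
    return m.role_epoch;
}

inline void demo_failover() {
    ToyMds a{10, 10};  // A comes up at epoch 10 and becomes active
    ToyMds b{10, 10};  // B comes up at epoch 10 and becomes standby

    // A fails; B takes over the rank in a later map, say epoch 12.
    b.role_epoch = 12;

    // Boot epochs collide, so B would reuse A's incarnation.
    assert(incarnation_from_boot(a) == incarnation_from_boot(b));
    // Role-change epochs differ, so B gets a distinct incarnation.
    assert(incarnation_from_role_change(a) != incarnation_from_role_change(b));
}
```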

Actions #4

Updated by John Spray about 8 years ago

We only call objecter->set_client_incarnation(incarnation); in MDSRank::init (after we've been assigned an active role).

So the epoch should be sufficient (it is always incremented when a rank assignment changes), as long as we remember to set it when a standby-replay MDS is promoted as well as when MDSRank is initialized.
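
A rough sketch of that idea, setting the incarnation from the map epoch both at init and on promotion. `ToyObjecter`, `ToyRank`, `promote`, and the epoch values are hypothetical stand-ins for this example; only the name `set_client_incarnation` is taken from the comment above, and the real class layout may well differ.

```cpp
#include <cassert>
#include <cstdint>

using epoch_t = std::uint32_t;

// Hypothetical stand-in for the objecter; it only stores the incarnation.
struct ToyObjecter {
    epoch_t client_incarnation = 0;
    void set_client_incarnation(epoch_t inc) { client_incarnation = inc; }
};

enum class State { StandbyReplay, Active };

// Hypothetical stand-in for a rank holder.
struct ToyRank {
    ToyObjecter objecter;
    State state = State::StandbyReplay;

    // Called once when the rank is set up; map_epoch is the epoch of the
    // map that assigned the role.
    void init(epoch_t map_epoch, State s) {
        state = s;
        objecter.set_client_incarnation(map_epoch);
    }

    // Called when a standby-replay daemon is promoted to active. Without
    // this second call the incarnation would stay at the init-time epoch.
    void promote(epoch_t map_epoch) {
        state = State::Active;
        objecter.set_client_incarnation(map_epoch);
    }
};

inline void demo_promotion() {
    ToyRank r;
    r.init(20, State::StandbyReplay);
    assert(r.objecter.client_incarnation == 20);

    r.promote(23);  // the map epoch advanced when the rank assignment changed
    assert(r.state == State::Active);
    assert(r.objecter.client_incarnation == 23);
}
```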

Actions #5

Updated by Zheng Yan about 8 years ago

I think using the MDSMap epoch as the incarnation is a good idea.

Actions #6

Updated by Greg Farnum about 8 years ago

  • Status changed from New to In Progress
  • Assignee set to John Spray

For Jewel: https://github.com/ceph/ceph/pull/8484

A more comprehensive fix (one that works with pools shared between filesystems) is still in progress.

Actions #7

Updated by John Spray almost 8 years ago

  • Status changed from In Progress to Fix Under Review
Actions #8

Updated by John Spray almost 8 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #9

Updated by Nathan Cutler almost 8 years ago

  • Backport set to jewel
Actions #10

Updated by Nathan Cutler almost 8 years ago

  • Copied to Backport #15732: jewel: MDS incarnation get lost after remove filesystem added
Actions #11

Updated by Greg Farnum almost 8 years ago

  • Status changed from Pending Backport to Resolved