Bug #1302

closed

mds: mds_caps_wanted vs migration

Added by Sage Weil almost 13 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%

Actions #1

Updated by Greg Farnum almost 13 years ago

More detail, please?

Actions #2

Updated by Sage Weil almost 13 years ago

When a client opens a file via an MDS replica, the replica sends the auth MDS a message telling it which caps are wanted (this is recorded in map<int,int> CInode::mds_caps_wanted). When that set of wanted caps changes on the replica (for example, the client says it no longer wants caps), the replica sends another message.

This breaks when we race with migration:

- client opens file via replica mds0
- mds0 sends message to mds1 [auth] to update wanted
- mds1 exports inode to mds2
- mds1 receives update, ignores it
-> mds2 doesn't learn of mds0's wanted update
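
The sequence above can be replayed in a small toy model. This is only a sketch under stated assumptions: ToyMds, handle_wanted_update and export_inode_to are hypothetical names invented for illustration, not Ceph's real classes or handlers. The only thing taken from the report is that the auth keeps a map from replica rank to wanted caps, and that an MDS which is no longer auth ignores the update.

```cpp
#include <cassert>
#include <map>

// Hypothetical, heavily simplified model of one inode's state on an MDS.
// None of these types or method names are Ceph's real ones.
struct ToyMds {
  int rank;
  bool is_auth;
  std::map<int, int> mds_caps_wanted;  // replica rank -> wanted caps bitmask

  ToyMds(int r, bool auth) : rank(r), is_auth(auth) {}

  // Handler for a replica's "wanted caps" update. An MDS that is not
  // (or is no longer) auth ignores the message, as described in the report.
  bool handle_wanted_update(int from, int wanted) {
    if (!is_auth)
      return false;  // update is dropped
    mds_caps_wanted[from] = wanted;
    return true;
  }

  // Export: hand authority and the current wanted map to the importer.
  void export_inode_to(ToyMds& importer) {
    importer.mds_caps_wanted = mds_caps_wanted;
    importer.is_auth = true;
    mds_caps_wanted.clear();
    is_auth = false;
  }
};

// Replays the race. Returns true if the new auth (mds2) ends up knowing
// which caps mds0 wants.
bool new_auth_learns_wanted() {
  ToyMds mds1(1, true), mds2(2, false);
  const int wanted = 0x3;  // arbitrary bitmask, for illustration only

  // mds0's update is in flight to mds1 when the export happens.
  mds1.export_inode_to(mds2);

  // The update now arrives at the old auth and is ignored.
  bool applied = mds1.handle_wanted_update(0, wanted);
  assert(!applied);

  return mds2.mds_caps_wanted.count(0) > 0;
}
```

In this model new_auth_learns_wanted() returns false: the exported map was copied before the update arrived, and nothing forwards or retries the dropped message.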

We either need
- some (semi-)intelligent retry when we race (probably similar to the cache expire? send to both old and new auth, wait if ambigauth)
- resend on any migration (high overhead!)
- ack wanted updates (high overhead)

Pretty sure we should mimic whatever cache expire is doing. That works very well, and the migration protocol already does a bunch of work (in the form of notify messages) to facilitate it.

Actions #3

Updated by Sage Weil almost 13 years ago

  • Category deleted (1)
Actions #4

Updated by Sage Weil almost 13 years ago

  • Assignee deleted (Sage Weil)
Actions #5

Updated by Greg Farnum almost 13 years ago

  • Assignee set to Greg Farnum
Actions #6

Updated by Greg Farnum almost 13 years ago

  • Category set to 1
  • Status changed from New to Resolved

Okay, so this is actually already implemented: the replica will put the message on a waiter if the inode has an ambiguous auth. There's a minor mismatch in that the replica will send the message while the auth is in rejoin and the auth will drop it until it's past rejoin, but that's apparently not what happened here.
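
As a rough illustration of the waiter approach described above, the replica can park the send while the auth is ambiguous and re-run it once the auth is known. The names here (ToyReplicaInode, request_wanted, auth_resolved) are hypothetical and are not Ceph's actual waiter machinery; this is a sketch of the idea only.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical replica-side view of one inode; not Ceph's real CInode.
struct ToyReplicaInode {
  bool ambiguous_auth = false;
  int auth_rank = 1;
  std::vector<std::function<void()>> waiters;
  std::vector<std::pair<int, int>> sent;  // (target rank, wanted caps)

  // Send the wanted caps to the auth, or park the request on a waiter
  // while a migration leaves the auth ambiguous.
  void request_wanted(int wanted) {
    if (ambiguous_auth) {
      waiters.push_back([this, wanted] { request_wanted(wanted); });
      return;
    }
    sent.emplace_back(auth_rank, wanted);
  }

  // Called once the migration settles: record the new auth and retry
  // everything that was parked.
  void auth_resolved(int new_auth) {
    auth_rank = new_auth;
    ambiguous_auth = false;
    auto pending = std::move(waiters);
    waiters.clear();
    for (auto& w : pending)
      w();
  }
};

// Returns the rank that finally receives the update, or -1 if the number
// of messages sent is not exactly one.
int wanted_target_after_migration() {
  ToyReplicaInode in;
  in.ambiguous_auth = true;  // migration in progress
  in.request_wanted(0x3);    // parked, nothing sent yet
  assert(in.sent.empty());
  in.auth_resolved(2);       // mds2 is the new auth
  return in.sent.size() == 1 ? in.sent[0].first : -1;
}
```

Here the parked request is delivered to rank 2, the new auth, rather than to the old auth that would ignore it.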

However, there were a few problems in the sender function itself that I've resolved:
1) The test to drop all caps seems to have been backwards ever since it was written; it would only drop all caps if the time-to-drop was AFTER the current time.
2) The replica would record itself as having sent a new set of wanted caps even if it didn't actually send the message (because the auth wasn't into or past the REJOIN state yet).

Both of these have been fixed and pushed as of commit:a2c761e62acdb3cff941867c224ae295cf6337b3.
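
For readers without the commit at hand, the shape of the two fixes can be sketched roughly as follows. This is not the code from the commit: ToyReplicaCaps, drop_at, last_sent and the ToyAuthState ordering are all assumptions made for illustration, and the exact meaning of the time-to-drop field in Ceph may differ.

```cpp
#include <cassert>

// Hypothetical ordering of auth states; only "before REJOIN" versus
// "in or past REJOIN" matters for this sketch.
enum class ToyAuthState { Resolve = 0, Rejoin = 1, Active = 2 };

// Hypothetical replica-side bookkeeping for one inode.
struct ToyReplicaCaps {
  int wanted = 0;         // caps the replica's clients currently want
  int last_sent = 0;      // what the replica believes it last told the auth
  long drop_at = 0;       // time-to-drop; 0 means "no deadline" (assumed)
  int messages_sent = 0;  // counts messages actually sent

  void maybe_send_wanted(long now, ToyAuthState auth_state) {
    int to_send = wanted;

    // Fix 1: drop all caps once the deadline has passed. The buggy test
    // was the reverse, i.e. it dropped only while drop_at > now.
    if (drop_at != 0 && drop_at <= now)
      to_send = 0;

    if (to_send == last_sent)
      return;  // nothing new to tell the auth

    // Fix 2: if the auth is not yet in or past REJOIN, no message is
    // sent, so last_sent must not be updated either.
    if (auth_state < ToyAuthState::Rejoin)
      return;

    ++messages_sent;      // stands in for actually sending the message
    last_sent = to_send;  // recorded only after a real send
  }
};

// Walks through both fixes and returns how many messages were sent.
int count_wanted_messages() {
  ToyReplicaCaps c;
  c.wanted = 0x3;
  c.maybe_send_wanted(10, ToyAuthState::Resolve);  // auth not ready
  assert(c.last_sent == 0 && c.messages_sent == 0);
  c.maybe_send_wanted(11, ToyAuthState::Active);   // sent: wanted = 0x3
  c.drop_at = 20;
  c.maybe_send_wanted(15, ToyAuthState::Active);   // deadline not reached
  c.maybe_send_wanted(25, ToyAuthState::Active);   // deadline passed: drop
  return c.messages_sent;
}
```

In this walk-through two messages go out: the initial wanted set once the auth is ready, and the drop once the deadline has passed. The early call made while the auth is before REJOIN leaves last_sent untouched, so the update is retried later instead of being silently lost.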

Actions #7

Updated by Sage Weil almost 13 years ago

  • Story points set to 5
Actions #8

Updated by John Spray over 7 years ago

  • Project changed from Ceph to CephFS
  • Category deleted (1)
  • Target version deleted (v0.32)

Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.
