Bug #1302
Closed
mds: mds_caps_wanted vs migration
Updated by Sage Weil almost 13 years ago
When a client opens a file via an MDS replica, the replica sends the auth MDS a message telling it which caps are wanted (this goes into map<int,int> CInode::mds_caps_wanted). When that set of wanted caps changes on the replica (for example, the client says it no longer wants caps), the replica sends another message.
This breaks when we race with migration:
- client opens file via replica mds0
- mds0 sends message to mds1 [auth] to update wanted
- mds1 exports inode to mds2
- mds1 receives update, ignores it
-> mds2 doesn't learn of mds0's wanted update
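The sequence above can be reproduced in a toy model. This is only an illustrative sketch: ToyMDS, handle_wanted_update and export_to are hypothetical names, not Ceph's, and the only thing borrowed from the real code is the idea of a per-replica map like mds_caps_wanted.

```python
# Toy model of the race; class and method names are hypothetical, not Ceph's.
class ToyMDS:
    def __init__(self, rank):
        self.rank = rank
        self.is_auth = False
        # replica rank -> wanted caps bitmask, loosely mirroring mds_caps_wanted
        self.mds_caps_wanted = {}

    def handle_wanted_update(self, from_rank, wanted):
        # A non-auth MDS ignores the update, as in the sequence above.
        if not self.is_auth:
            return False
        self.mds_caps_wanted[from_rank] = wanted
        return True

    def export_to(self, new_auth):
        # Migration hands over authority plus whatever state is known right now.
        new_auth.mds_caps_wanted = dict(self.mds_caps_wanted)
        new_auth.is_auth = True
        self.is_auth = False
        self.mds_caps_wanted = {}


mds1, mds2 = ToyMDS(1), ToyMDS(2)
mds1.is_auth = True

# mds0 (rank 0) has sent "wanted = 0x3", but the message is still in flight.
in_flight = (0, 0x3)

mds1.export_to(mds2)                               # mds1 exports inode to mds2
accepted = mds1.handle_wanted_update(*in_flight)   # update arrives too late

print(accepted, mds2.mds_caps_wanted)
```

In this model the update is dropped by the old auth and the new auth's map stays empty, which is the lost-update symptom described above.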
We either need
- some (semi-)intelligent retry when we race (probably similar to the cache expire? send to both old and new auth, wait if ambigauth)
- resend on any migration (high overhead!)
- ack wanted updates (high overhead)
Pretty sure we should mimic whatever cache expire is doing. That works very well, and the migration protocol already does a bunch of work (in the form of notify messages) to facilitate it.
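As a rough illustration of the "wait if the auth is ambiguous" idea, here is a minimal sketch of a replica that parks the wanted update on a waiter list during migration and resends it once the new auth is known. All names (ToyReplicaInode, start_migration, finish_migration) are assumptions made for the example; it does not show how cache expire or the notify messages are actually implemented, and it does not cover an update that was sent before the replica learned of the migration.

```python
# Hypothetical sketch: defer wanted updates while the auth is ambiguous.
class ToyReplicaInode:
    def __init__(self, auth_rank):
        self.auth_rank = auth_rank
        self.ambiguous_auth = False
        self.waiters = []   # callbacks to run once the auth is unambiguous
        self.sent = []      # log of (destination rank, wanted) "messages"

    def send_wanted(self, wanted):
        if self.ambiguous_auth:
            # Do not guess the destination; retry after migration finishes.
            self.waiters.append(lambda: self.send_wanted(wanted))
            return
        self.sent.append((self.auth_rank, wanted))

    def start_migration(self):
        # Stands in for the replica being notified that an export has begun.
        self.ambiguous_auth = True

    def finish_migration(self, new_auth_rank):
        self.auth_rank = new_auth_rank
        self.ambiguous_auth = False
        waiters, self.waiters = self.waiters, []
        for retry in waiters:
            retry()


inode = ToyReplicaInode(auth_rank=1)
inode.start_migration()
inode.send_wanted(0x3)           # parked, nothing is sent yet
sent_during_migration = list(inode.sent)
inode.finish_migration(new_auth_rank=2)
print(sent_during_migration, inode.sent)
```

The deferred update ends up addressed to the new auth (rank 2) rather than the old one, so no ack or blanket resend is needed in this simplified case.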
Updated by Greg Farnum almost 13 years ago
- Category set to 1
- Status changed from New to Resolved
Okay, so this is actually already implemented: The replica will put the message on a waiter if the inode has an ambiguous auth. There's a minor mismatch in that the replica will send the message while the auth is in rejoin and the auth will drop it until it's past rejoin, but that's apparently not what happened here.
However, there were a few problems in the sender function itself that I've resolved:
1) The test to drop all caps seems to have been backwards ever since it was written; it would only drop all caps if the time-to-drop was AFTER the current time.
2) The replica would record itself as having sent a new set of wanted caps even if it didn't actually send the message (because the auth wasn't in or past the REJOIN state yet).
Both of these have been fixed and pushed as of commit:a2c761e62acdb3cff941867c224ae295cf6337b3.
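For readers without the commit at hand, the two fixes can be pictured with a simplified sender. This is a sketch under assumptions, not the code from the commit: ToySender, request_wanted, drop_at and auth_ready are hypothetical names, and whether the real comparison is inclusive is a guess.

```python
# Hypothetical sketch of the corrected sender logic; not the actual Ceph code.
class ToySender:
    def __init__(self):
        self.last_sent_wanted = None   # what we believe the auth has been told
        self.outbox = []               # "messages" actually sent to the auth

    def request_wanted(self, wanted, now, drop_at, auth_ready):
        # Fix 1: drop all caps once the deadline has passed (now >= drop_at),
        # not while the deadline is still in the future.
        if drop_at is not None and now >= drop_at:
            wanted = 0
        if wanted == self.last_sent_wanted:
            return False               # nothing new to tell the auth
        if not auth_ready:
            # Fix 2: the message is not sent, so do not record it as sent.
            return False
        self.outbox.append(wanted)
        self.last_sent_wanted = wanted
        return True


s = ToySender()
first = s.request_wanted(0x3, now=10, drop_at=None, auth_ready=False)
recorded_after_skip = s.last_sent_wanted
second = s.request_wanted(0x3, now=11, drop_at=None, auth_ready=True)
third = s.request_wanted(0x3, now=25, drop_at=20, auth_ready=True)
print(first, recorded_after_skip, second, third, s.outbox)
```

Because the skipped send leaves last_sent_wanted untouched, the later call with a ready auth still goes out instead of being suppressed as a duplicate.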
Updated by Sage Weil almost 13 years ago
- Story points set to 5
Updated by John Spray over 7 years ago
- Project changed from Ceph to CephFS
- Category deleted (1)
- Target version deleted (v0.32)
Bulk updating project=ceph category=mds bugs so that I can remove the MDS category from the Ceph project to avoid confusion.