Bug #11226

closed

client: drops Fx cap on directories when revoking other caps

Added by Greg Farnum about 9 years ago. Updated almost 8 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

An oddity was reported in the thread "MDS has inconsistent performance" on ceph-devel, starting at http://www.spinics.net/lists/ceph-devel/msg21987.html. The user noticed that when clients were running a file create workload, each in its own directory, they would start out sending only creates to the MDS, but eventually switched to doing a lookup before every create. This was because the client was losing its I_COMPLETE flag on the directory.

Investigation revealed that I_COMPLETE was getting cleared because the client did a create while holding the Fs cap (but not Fx) on the directory, so it dropped the Fs cap as part of the create request, and then when the cap was re-issued in the response it had to clear the flag. This was odd since the client had previously held Fsx on the directory.

The drop of Fx turned out to be the result of Client::check_caps(). check_caps() was invoked when the MDS revoked some unrelated caps, and check_caps() calls send_cap(), which drops all caps that aren't included in a "retain" parameter. check_caps() fills out retain with the combined output of Inode::caps_wanted() and the caps used on the inode.
Unfortunately, caps_wanted() only returns anything at all if the inode is opened, based on the mode in use. Directories are basically never opened, and their caps aren't used much (ever?) either.
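
To make the failure mode concrete, here is a minimal toy model of the retain computation described above. It is a sketch, not the actual Client code: the bit values, the ToyInode struct, and caps_after_revoke() are hypothetical names invented for illustration. It also assumes that shared caps are always retained, which is consistent with the report that Fs survived while Fx was lost but is not stated in this ticket.

```cpp
#include <cstdint>

using Caps = std::uint32_t;

// Hypothetical bit values for illustration; not Ceph's real cap encoding.
constexpr Caps CAP_FS    = 1u << 0;  // Fs: shared file cap
constexpr Caps CAP_FX    = 1u << 1;  // Fx: exclusive file cap
constexpr Caps CAP_OTHER = 1u << 2;  // stand-in for an unrelated cap

// Assumption (not from the ticket): shared caps are kept regardless of
// what is wanted or used.
constexpr Caps ALWAYS_RETAINED = CAP_FS;

struct ToyInode {
  int open_count = 0;  // directories are basically never opened
  Caps issued = 0;     // caps currently held by the client
  Caps used = 0;       // caps in active use

  // Mirrors the described behavior: nothing is wanted unless the inode is open.
  Caps caps_wanted() const {
    return open_count > 0 ? (CAP_FS | CAP_FX) : 0u;
  }
};

// Toy version of check_caps() + send_cap(): after a revoke, keep only the
// caps that are still issued and appear in the retain mask.
Caps caps_after_revoke(const ToyInode& in, Caps revoked) {
  const Caps retain = in.caps_wanted() | in.used | ALWAYS_RETAINED;
  return in.issued & ~revoked & retain;
}

// A directory holding Fsx that was never opened: revoking an unrelated cap
// also drops Fx, because neither caps_wanted() nor used covers it.
Caps unopened_dir_example() {
  ToyInode dir;
  dir.issued = CAP_FS | CAP_FX | CAP_OTHER;
  return caps_after_revoke(dir, CAP_OTHER);
}
```

In this model unopened_dir_example() leaves only Fs, while the same call on an inode with open_count > 0 keeps Fsx.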

Actions #1

Updated by Greg Farnum about 9 years ago

The naive solution is to try to retain the Fx cap on directories at all times, but that has implications for how we flush stuff out.

I do have a heuristic that I think works, though: retain Fx if the directory is complete. That lets us drop the cap if it won't be helpful, and whenever we're under cache pressure we'll have to flush out the directory's children first anyway.
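
A rough sketch of that heuristic, under the same caveats as any toy model: ToyDir, the bit values, and retain_mask() are hypothetical names for illustration, and this is not the merged patch.

```cpp
#include <cstdint>

using Caps = std::uint32_t;

// Hypothetical bit values for illustration; not Ceph's real cap encoding.
constexpr Caps CAP_FS = 1u << 0;  // Fs: shared file cap
constexpr Caps CAP_FX = 1u << 1;  // Fx: exclusive file cap

struct ToyDir {
  bool complete = false;  // stand-in for the I_COMPLETE flag
  Caps wanted = 0;        // what caps_wanted() would report
  Caps used = 0;          // caps in active use
};

// Heuristic: start from wanted | used as before, and additionally hold on
// to Fx while the directory is complete. An incomplete directory gains
// nothing from Fx, so it can still be dropped there.
Caps retain_mask(const ToyDir& dir) {
  Caps retain = dir.wanted | dir.used;
  if (dir.complete) {
    retain |= CAP_FX;
  }
  return retain;
}
```

With this, a complete directory keeps Fx in its retain mask even when wanted and used are both empty, and the mask goes back to the old behavior as soon as the complete flag is cleared.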

Actions #2

Updated by Greg Farnum about 9 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Greg Farnum about 9 years ago

  • Status changed from Fix Under Review to 7

Also more bugfixes around this from Zheng at https://github.com/ceph/ceph/pull/4177, in greg-fs-testing and queued up for a run right now!

Actions #4

Updated by Greg Farnum about 9 years ago

Merged Zheng's patches to master in commit:413da564d4fd479954da7feb5b27b5d8fade1100, and mine in commit:7721b224c4b0d9bb9cb81e9683f226cb5af624f4.

Actions #5

Updated by Greg Farnum about 9 years ago

  • Status changed from 7 to Resolved
Actions #6

Updated by Greg Farnum almost 8 years ago

  • Component(FS) Client added