Project

General

Profile

Actions

Bug #3559

closed

mds: not issuing RDCACHE to exclusive client for some files

Added by Sage Weil over 11 years ago. Updated over 11 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

reproduced by kernel tar compile task and ceph-fuse. one part of the trigger for #3490.


Files

client_mds_sorted.a.log (6.32 MB) client_mds_sorted.a.log Sam Lang, 12/04/2012 11:24 AM
Actions #1

Updated by Sam Lang over 11 years ago

I was able to reproduce this with a libcephfs test that creates, opens and closes files, suspends the client for 20 seconds, and then tries to open the files again with mode=RDONLY. The kernel build test must have been a heavy enough workload that the client was unable to schedule the tick within 5 seconds. We might want to increase that (hold_caps_until), at the least make it a config param.

On the client, because of the suspend and the hold_caps_until timeout of 5 seconds, the client drops all caps but shared. When the mds gets open(mode=RDONLY), it sets wanted caps to crs from ceph_caps_for_mode(), so the mds is transitioning the file lock from excl to mix (Locker.cc:4020).

Now even thought the client is the loner, the gcaps_allowed for CAP_LONER in state excl->mix is rwl (mds/locks.c:112), so the client can't get the cache cap.

I think the fix for this is to set the loner cap for LOCK_EXCL_MIX to CEPH_CAP_GCACHE|CEPH_CAP_GSHARED. This makes the allowed caps for open request crswl, so the client will be able to cache.

I've pushed the test, and proposed fix to wip-3559. Also attached is the client+mds log, sorted by timestamp. The inode in question is 10000000004.

Actions #2

Updated by Sage Weil over 11 years ago

I don't think we can alloc CACHE caps in the excl->mix state because caching isn't allowed in mix. Generally speaking, and A->B state shouldn't allow anything that is not allowed in B or else we can wait indefinitely; the point of the inbetween state is to wait for a revocation or lock state or something like that.

I think the problem here is that it went to MIX even though nobody wanted to write. I pushed a patch to wip-3559b that only calls scatter_mix() if somebody wants WR; if they only want RD (as is the case here) we should move to the SYNC state because that lets everyone read (and cache).

Actions #3

Updated by Sam Lang over 11 years ago

Yeah your right - the fix I proposed doesn't work because it ends up in mix state eventually. Testing your patch.

Actions #4

Updated by Sam Lang over 11 years ago

  • Status changed from In Progress to Fix Under Review

That fixed it. With your fix it now passes my test. I cherry-picked your fix into wip-3559.

Actions #5

Updated by Sage Weil over 11 years ago

Sweet.

I'm a bit leary about exposing the caps via libcephfs.. that's an implementation detail that users shouldn't rely on. Maybe we can stick debug in the function name or something? ceph_debug_get_fd_caps()? Something along those lines?

Actions #6

Updated by Sam Lang over 11 years ago

Noah has created a libcephtools library for the admin socket stuff. Should we add it there, or a separate libcephadmin library? Alternatively, we could add a ceph_ioctl() call and bury it in there (along with the other ceph specific calls like ceph_get_local_osd()), so that we have some code in place if we want to support ioctls in fuse. Or I can just call it debug. :-)

Actions #7

Updated by Sage Weil over 11 years ago

libcephtools is wrapping the monitor interaction stuff; i think this belongs in libcephfs, we just need to advertise that it is not something users should use or rely on. putting debug in the name still seems like the simplest route to me, and is cleaner and easier than an awkward ioctl interface.

Actions #8

Updated by Sam Lang over 11 years ago

I changed those calls to ceph_debug_get_* and update the wip-3559 branch.

Actions #9

Updated by Sage Weil over 11 years ago

  • Status changed from Fix Under Review to Resolved
Actions #10

Updated by Greg Farnum over 11 years ago

  • Status changed from Resolved to In Progress

After discussion this apparently needs a bit more thought.

Actions #11

Updated by Sage Weil over 11 years ago

  • Status changed from In Progress to Resolved

Thought more about it, and I think it's right. Committed something to master that describes the logic in a big comment.

Actions

Also available in: Atom PDF