Bug #4746

client: invalidate callback can deadlock

Added by Greg Farnum almost 11 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Low
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I saw this when testing the fix for #3637. We appear to be (correctly) safe against deadlocks on our own locks, but we didn't account for deadlocks on the VFS locks when doing invalidates. That means that if we try to invalidate at the same time as the VFS is asking us to read, we deadlock.

We may need to introduce a second locking layer to deal with this, that covers draining out all VFS requests before we do our own invals...blech.
(This is probably not for Cuttlefish; I'm testing without the invalidate callback and will probably turn it off for now, sadly.)

History

#1 Updated by Sam Lang almost 11 years ago

"We may need to introduce a second locking layer to deal with this, that covers draining out all VFS requests before we do our own invals...blech."

By then isn't it too late? By the time we drain and start our invalidate, another read may have already hit the VFS layer (acquiring the lock). I must be missing something. How is this handled by other FUSE modules?

#2 Updated by Greg Farnum almost 11 years ago

Maybe; we didn't think this through much beyond going "yep, that's broken".

However, I think we can queue up the invalidate, but not block on it completing, then satisfy any reads that come in from the VFS until the invalidate completes. That satisfies our requirements, and while a malicious user could prevent us from invalidating that might be the way it has to be to satisfy all the invariants involved.
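
A minimal user-space sketch of the "queue it, don't block on it" idea, in case it helps the discussion. This is not the ceph-fuse code: `AsyncInvalidator`, `queue_invalidate`, `drain`, and `demo_reads_not_blocked` are hypothetical names, and the upcall is just a callback that may block for as long as it likes. The only point is that the thread serving reads never waits for the upcall to return.

```cpp
#include <atomic>
#include <condition_variable>
#include <cstdint>
#include <deque>
#include <functional>
#include <future>
#include <mutex>
#include <thread>
#include <utility>

// Hypothetical helper: runs invalidate upcalls on a worker thread so the
// caller never blocks on them.
class AsyncInvalidator {
public:
    explicit AsyncInvalidator(std::function<void(uint64_t)> upcall)
        : upcall_(std::move(upcall)), worker_([this] { run(); }) {}

    ~AsyncInvalidator() {
        {
            std::lock_guard<std::mutex> l(mutex_);
            stopping_ = true;
        }
        cv_.notify_all();
        worker_.join();
    }

    // Returns immediately; the upcall happens later on the worker thread.
    void queue_invalidate(uint64_t ino) {
        {
            std::lock_guard<std::mutex> l(mutex_);
            queue_.push_back(ino);
        }
        cv_.notify_one();
    }

    // Waits until every queued invalidate has completed (demo/shutdown only).
    void drain() {
        std::unique_lock<std::mutex> l(mutex_);
        idle_cv_.wait(l, [this] { return queue_.empty() && !busy_; });
    }

private:
    void run() {
        std::unique_lock<std::mutex> l(mutex_);
        for (;;) {
            cv_.wait(l, [this] { return stopping_ || !queue_.empty(); });
            if (queue_.empty()) return;  // stopping and nothing left to do
            uint64_t ino = queue_.front();
            queue_.pop_front();
            busy_ = true;
            l.unlock();
            upcall_(ino);  // may block (e.g. on page locks); our mutex is not held
            l.lock();
            busy_ = false;
            idle_cv_.notify_all();
        }
    }

    std::function<void(uint64_t)> upcall_;
    std::mutex mutex_;
    std::condition_variable cv_;
    std::condition_variable idle_cv_;
    std::deque<uint64_t> queue_;
    bool stopping_ = false;
    bool busy_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};

// Demo: the upcall is held back by a gate while three "reads" are served.
bool demo_reads_not_blocked() {
    std::promise<void> release;
    std::shared_future<void> gate = release.get_future().share();
    std::atomic<bool> upcall_done{false};
    int reads_served = 0;
    bool reads_finished_first = false;
    {
        AsyncInvalidator inv([&](uint64_t) { gate.wait(); upcall_done = true; });
        inv.queue_invalidate(0x1234);                  // does not block
        for (int i = 0; i < 3; ++i) ++reads_served;    // stand-in for serving reads
        reads_finished_first = !upcall_done.load();
        release.set_value();                           // now let the upcall finish
        inv.drain();
    }
    return reads_served == 3 && reads_finished_first && upcall_done.load();
}
```

As noted above, nothing here stops a steady stream of reads from delaying the invalidate indefinitely; the sketch only shows that reads keep being served while the upcall is outstanding.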

#3 Updated by Sage Weil almost 11 years ago

Hmm, you're right, this is a more fundamental problem.

#4 Updated by Sam Lang almost 11 years ago

The invalidate is queued in a separate thread, and when we call the invalidate, we don't have the client lock held. So other requests should be getting processed while the invalidate callback is outstanding.

That said, we haven't tested that functionality significantly yet - although I'm surprised we haven't seen a deadlock show up in the nightlies, since they've been running with the invalidate callback enabled for a while now.

#5 Updated by Greg Farnum almost 11 years ago

It's not any of our internal locks that are getting stuck; it's the VFS inode mutexes in combination with us. If I understand the order correctly:
1) VFS starts reading, locking pages as it goes
2) we have to invalidate and do the upcall
3) upcall to invalidate hits locked pages and waits
4) VFS read needs to call down into ceph-fuse to retrieve more pages (and doesn't drop the page locks, because it's still serving the read)
5) ceph-fuse won't serve up pages because we're trying to drop caps.
6) we are stuck.
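
The ordering above can be written down as a wait-for graph, which makes the cycle explicit. This is only a model, not anything from the client code: the node names (`vfs_read`, `ceph_fuse_read`, `invalidate_upcall`) are hypothetical labels for the three actors, and an edge A -> B means "A cannot make progress until B does".

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Wait-for graph: an edge A -> B means "A cannot make progress until B does".
using WaitForGraph = std::map<std::string, std::vector<std::string>>;

// Depth-first search; returns true if a cycle is reachable from `node`.
static bool reaches_cycle(const WaitForGraph& g, const std::string& node,
                          std::set<std::string>& on_path,
                          std::set<std::string>& done) {
    if (on_path.count(node)) return true;   // back edge: we found a cycle
    if (done.count(node)) return false;     // already explored, no cycle there
    on_path.insert(node);
    auto it = g.find(node);
    if (it != g.end()) {
        for (const auto& next : it->second) {
            if (reaches_cycle(g, next, on_path, done)) return true;
        }
    }
    on_path.erase(node);
    done.insert(node);
    return false;
}

bool has_deadlock(const WaitForGraph& g) {
    std::set<std::string> on_path, done;
    for (const auto& entry : g) {
        if (reaches_cycle(g, entry.first, on_path, done)) return true;
    }
    return false;
}

// The blocking case described in steps 1-6.
WaitForGraph blocking_invalidate_graph() {
    WaitForGraph g;
    g["invalidate_upcall"].push_back("vfs_read");        // step 3: waits on locked pages
    g["vfs_read"].push_back("ceph_fuse_read");           // step 4: needs more pages
    g["ceph_fuse_read"].push_back("invalidate_upcall");  // step 5: waits for the cap drop
    return g;
}

// Same actors, assuming ceph-fuse serves reads without waiting on the upcall.
WaitForGraph async_invalidate_graph() {
    WaitForGraph g;
    g["invalidate_upcall"].push_back("vfs_read");
    g["vfs_read"].push_back("ceph_fuse_read");
    g["ceph_fuse_read"];  // no outgoing edge: the read is served right away
    return g;
}
```

With these definitions, `has_deadlock(blocking_invalidate_graph())` is true and `has_deadlock(async_invalidate_graph())` is false: removing the edge from step 5 is what breaks the cycle.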

We haven't seen this in our nightlies because we aren't generally doing invalidates — the only reason they'll happen on a recently-active file is if there are multiple writers, which none of our regular tests do. (Thus trying to fix up the ior things so we can run them regularly!)

Sage is discussing this on fuse-devel and at LSF/MM (did I get that acronym right?).

#6 Updated by Sage Weil almost 11 years ago

The suggestion from Maxim is to modify fuse to serialize reads and invalidates via a mutex. That ought to do the trick; I will need to play with it.
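
For illustration only, here is a user-space analogy of "serialize reads and invalidates via a mutex". The real change would live in the kernel fuse module, which this does not attempt to model; `SerializedInode`, `read_pages`, `invalidate_pages`, and `run_readers_and_invalidators` are hypothetical names. The sketch only demonstrates the mutual-exclusion property: a read and an invalidate on the same inode are never in flight at the same time.

```cpp
#include <atomic>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical per-inode object: both paths take the same mutex.
class SerializedInode {
public:
    void read_pages() {
        std::lock_guard<std::mutex> l(io_mutex_);
        critical_section();
    }

    void invalidate_pages() {
        std::lock_guard<std::mutex> l(io_mutex_);
        critical_section();
    }

    // Highest number of operations ever observed inside at once.
    int max_concurrent() const { return max_inside_.load(); }

private:
    // Stand-in for the real work; tracks how many callers are inside.
    void critical_section() {
        int now = ++inside_;
        int prev = max_inside_.load();
        while (now > prev && !max_inside_.compare_exchange_weak(prev, now)) {
        }
        std::this_thread::yield();  // give other threads a chance to interleave
        --inside_;
    }

    std::mutex io_mutex_;
    std::atomic<int> inside_{0};
    std::atomic<int> max_inside_{0};
};

// Runs readers and invalidators concurrently and reports the peak overlap.
int run_readers_and_invalidators(int threads_per_kind, int iterations) {
    SerializedInode inode;
    std::vector<std::thread> threads;
    for (int t = 0; t < threads_per_kind; ++t) {
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) inode.read_pages();
        });
        threads.emplace_back([&] {
            for (int i = 0; i < iterations; ++i) inode.invalidate_pages();
        });
    }
    for (auto& th : threads) th.join();
    return inode.max_concurrent();
}
```

Because every entry goes through `io_mutex_`, the reported peak is 1 no matter how many threads are started.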

There is also a bunch of fuse writeback (!) stuff that is going to land soonish, so the picture may get a bit more complicated then!

#7 Updated by Sage Weil almost 11 years ago

pushed wip-fuse to ceph-client.git

#8 Updated by Greg Farnum about 10 years ago

  • Priority changed from High to Low

Demoted due to ceph-fuse and FUSE interface work.

#9 Updated by Zheng Yan over 9 years ago

  • Status changed from New to Resolved

client does async invalidate now
