Bug #11835

FuseMount.umount_wait can hang

Added by John Spray almost 9 years ago. Updated over 8 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
-
Category:
Testing
Target version:
-
% Done:
0%

Source:
other
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Currently, the code in FuseMount.umount assumes that the write to /sys/fs/fuse/connections/X/abort causes the ceph-fuse process to terminate, and that the following "umount -lf" then cleans up the mount. FuseMount.umount_wait subsequently does a blocking wait on the fuse_daemon RemoteProcess.

There exist cases where the write to abort does not cause the process to terminate, and in these cases FuseMount.umount_wait can hang forever.

To catch these cases promptly, apply a short (under one minute) timeout to the wait on the fuse_daemon process. If the process fails to die, emit a terrifying exception to call attention to the fact that something went wrong internally in ceph-fuse.
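
A minimal sketch of the proposed behaviour, not the actual change: it uses a local subprocess.Popen as a stand-in for the fuse_daemon RemoteProcess, and the names wait_for_daemon_exit and DaemonDidNotExitError are hypothetical, chosen only for illustration. The 30 second default is likewise an assumption that merely satisfies the "under one minute" suggestion.

```python
import subprocess
import sys


class DaemonDidNotExitError(RuntimeError):
    """Hypothetical error: the daemon was still alive after the timeout."""


def wait_for_daemon_exit(proc, timeout=30.0):
    """Wait for proc to exit, but never longer than timeout seconds.

    proc is a subprocess.Popen here; the real fuse_daemon object is a
    remote process handle with a different interface (assumption).
    Returns the exit code, or raises DaemonDidNotExitError.
    """
    try:
        return proc.wait(timeout=timeout)
    except subprocess.TimeoutExpired:
        # Do not retry or kill silently: a daemon that survives the abort
        # indicates an internal problem, so surface it loudly.
        raise DaemonDidNotExitError(
            "process %d still running after %.1fs" % (proc.pid, timeout)
        )


if __name__ == "__main__":
    # Stand-in for a daemon that ignores the abort and keeps running.
    stuck = subprocess.Popen(
        [sys.executable, "-c", "import time; time.sleep(60)"]
    )
    try:
        wait_for_daemon_exit(stuck, timeout=1.0)
    except DaemonDidNotExitError as exc:
        print("caught:", exc)
    finally:
        stuck.kill()
        stuck.wait()
```

Raising instead of falling back to another kill attempt keeps the failure visible in test results, which is the point of the proposal: a hang becomes a prompt, explicit error.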

Associated revisions

Revision 07eb03ac (diff)
Added by John Spray almost 9 years ago

tasks/cephfs: time out on ceph-fuses that don't die

For cases where we have e.g. poked the fuse abort
file for a process, but it's still not dying. Because
this is a special class of error (unlike e.g. when
we force umount something because the network is gone)
raise the error instead of trying again to kill
the client.

Fixes: #11835
Signed-off-by: John Spray <>

History

#1 Updated by John Spray almost 9 years ago

  • Status changed from New to Fix Under Review

#2 Updated by Greg Farnum over 8 years ago

  • Status changed from Fix Under Review to Resolved
