Bug #529

closed

Cfuse: Software caused connection abort

Added by Ed Burnette over 13 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%

Description

After using ceph for a few minutes it gets into a state where I can no longer access the cfuse mount point. It also seems to corrupt the file system so I have to recreate it.

I don't have a specific reproducible sequence, but it has happened several times. This morning I was testing Ceph by copying a few hundred files into a ceph directory mounted with cfuse. Several copies worked fine. I did some mv and chmod operations with no problem. Then I cd'd into the directory I had just chmod'd (chmod 777) and tried to run an ls command:

> ls
ls: reading directory .: Software caused connection abort

> ls /mnt/ceph
ls: /mnt/ceph: Transport endpoint is not connected

On another machine, I can ls any directory under /mnt/ceph except the one I was using above:

> ls /mnt/ceph/rtolap
20090906
> ls /mnt/ceph/rtolap/2009*/nosuchfile
ls: /mnt/ceph/rtolap/2009*/nosuchfile: No such file or directory
> ls /mnt/ceph/rtolap/2009*
ls: reading directory /mnt/ceph/rtolap/20090906: Software caused connection abort
> ls /mnt/ceph/rtolap
ls: /mnt/ceph/rtolap: Transport endpoint is not connected
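
The two messages in the transcripts above are the standard Linux error strings for the errno values ECONNABORTED ("Software caused connection abort") and ENOTCONN ("Transport endpoint is not connected"). As a minimal sketch, assuming Python 3 is available on the client, a mount point can be checked for either state as follows. The helper names `classify_oserror` and `probe_mount` are hypothetical and are not part of Ceph or cfuse.

```python
import errno
import os

# errno values behind the two messages seen in the transcripts above
DEAD_MOUNT_ERRNOS = {
    errno.ECONNABORTED,  # "Software caused connection abort"
    errno.ENOTCONN,      # "Transport endpoint is not connected"
}


def classify_oserror(exc):
    """Return 'dead-mount' if exc carries one of the errnos above, else 'other'."""
    return "dead-mount" if exc.errno in DEAD_MOUNT_ERRNOS else "other"


def probe_mount(path):
    """List path once and report 'ok', 'dead-mount', or 'other'."""
    try:
        os.listdir(path)
    except OSError as exc:
        return classify_oserror(exc)
    return "ok"
```

Calling `probe_mount("/mnt/ceph")` on an affected client would be expected to report `dead-mount` once the mount is in the state shown above. Note that the probe itself lists the directory, so running it against the bad directory may trigger the same failure.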

At this point, even if I restart all servers, cfuse crashes any time somebody accesses that directory. I can't even remove the bad directory without crashing cfuse. The only way to recover is to recreate the file system and clobber all data.

I'm using RHEL5, Linux 2.6.18, an ext3 file system (not using xattr), and ceph-0.22.1.
