Bug #529
Status: closed
Cfuse: Software caused connection abort
Description
After using ceph for a few minutes it gets into a state where I can no longer access the cfuse mount point. It also seems to corrupt the file system so I have to recreate it.
I don't have a specific reproducible sequence, but it has happened several times. This morning I was testing Ceph, copying a few hundred files into a Ceph directory mounted with cfuse. Several copies worked fine. I did some mv's and chmods with no problem. Then I cd'd to the directory I had just chmod'd (chmod 777) and tried to run an ls command:
> ls
ls: reading directory .: Software caused connection abort
> ls /mnt/ceph
ls: /mnt/ceph: Transport endpoint is not connected
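For anyone trying to reproduce this, the sequence above could be scripted roughly as follows. This is only a sketch: the function name `exercise_mount`, the `repro_dir` directory name, the file count, the 1 KiB file size, and the choice to rename every tenth file are all assumptions, not the exact steps I ran, and since I have no reliable reproduction it may well not trigger the failure.

```python
import os
import shutil
import tempfile


def exercise_mount(target_root, n_files=300):
    """Copy files into a directory, rename some, chmod it to 777, then list it.

    target_root is assumed to be a directory on the cfuse mount
    (for example somewhere under /mnt/ceph). All names are hypothetical.
    """
    src = tempfile.mkdtemp(prefix="cfuse_src_")      # local staging directory
    work = os.path.join(target_root, "repro_dir")    # hypothetical name
    os.makedirs(work, exist_ok=True)
    try:
        # Copy a few hundred small files onto the mount.
        for i in range(n_files):
            path = os.path.join(src, "file%04d.dat" % i)
            with open(path, "wb") as f:
                f.write(os.urandom(1024))
            shutil.copy(path, work)

        # Rename every tenth file (stand-in for the mv's).
        for i in range(0, n_files, 10):
            old = os.path.join(work, "file%04d.dat" % i)
            os.rename(old, old + ".moved")

        # chmod 777 on the directory, then list it, as in the report.
        os.chmod(work, 0o777)
        return sorted(os.listdir(work))
    finally:
        shutil.rmtree(src, ignore_errors=True)
```

Calling `exercise_mount("/mnt/ceph")` on an affected client would raise `OSError` at the final `os.listdir` if the failure occurs; on a healthy mount it returns the sorted directory listing.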
On another machine, I can ls any directory under /mnt/ceph except the one I was using above:
> ls /mnt/ceph/rtolap
20090906
> ls /mnt/ceph/rtolap/2009*/nosuchfile
ls: /mnt/ceph/rtolap/2009*/nosuchfile: No such file or directory
> ls /mnt/ceph/rtolap/2009*
ls: reading directory /mnt/ceph/rtolap/20090906: Software caused connection abort
> ls /mnt/ceph/rtolap
ls: /mnt/ceph/rtolap: Transport endpoint is not connected
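The two messages in these transcripts are the standard Linux (glibc) strings for ECONNABORTED ("Software caused connection abort") and ENOTCONN ("Transport endpoint is not connected"). A small probe like the one below could record which error a given directory returns without parsing ls output. It is a sketch only: `probe_dir` is a hypothetical helper name, and note that, per the behavior described here, listing the bad directory is itself what takes the mount down, so running it against that path will likely leave the mount in the ENOTCONN state.

```python
import errno
import os


def probe_dir(path):
    """Try to list a directory; return "OK" or the symbolic errno name."""
    try:
        os.listdir(path)
    except OSError as e:
        # Map the numeric errno to its symbolic name, e.g. "ECONNABORTED".
        return errno.errorcode.get(e.errno, str(e.errno))
    return "OK"
```

On an affected client, `probe_dir("/mnt/ceph/rtolap/20090906")` would be expected to return `"ECONNABORTED"` on the first call and `probe_dir("/mnt/ceph")` to return `"ENOTCONN"` afterwards, matching the transcripts; I have not run this against a broken mount.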
At this point, even if I restart all servers, any time somebody accesses that directory it crashes cfuse. I can't even remove the bad directory without crashing cfuse. The only way to recover is to recreate the file system and clobber all data.
I'm using RHEL5, Linux 2.6.18, an ext3 file system (not using xattr), and ceph-0.22.1.