Bug #13714

closed

Segmentation fault accessing file using fuse mount

Added by Eric Eastman over 8 years ago. Updated about 8 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport: infernalis
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On a test Ceph cluster running Ceph v9.1.0 with the 4.3.0 kernel on Trusty, using the Ceph File System with snapshots enabled, I was attempting to read a file within a snapshot when I received a read error, followed by a "Transport endpoint is not connected" error, as shown:

root: /cephfs/.snap/snapshot.2015-11-06_14_17_01-1446837421/top/dfgw02/2015-11-06_14_08_38/2/2009/month_4/day_3/hour_2# md5sum -c data_file.md5 
md5sum: data_file.md5: read error
root:/cephfs/.snap/snapshot.2015-11-06_14_17_01-1446837421/top/dfgw02/2015-11-06_14_08_38/2/2009/month_4/day_3/hour_2# ls -lrta
ls: cannot open directory .: Transport endpoint is not connected
root:/cephfs/.snap/snapshot.2015-11-06_14_17_01-1446837421/top/dfgw02/2015-11-06_14_08_38/2/2009/month_4/day_3/hour_2# df
df: ‘/cephfs’: Transport endpoint is not connected
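The "Transport endpoint is not connected" message is the standard Linux text for the ENOTCONN errno, which is what a FUSE mount point typically returns once its userspace daemon (here ceph-fuse) has died. As a minimal sketch, and not part of the original report, a check like the following could detect that state from a script. The function name `mount_is_connected` is hypothetical.

```python
import errno
import os


def mount_is_connected(path):
    """Return False if statvfs() on path fails with ENOTCONN, True otherwise.

    Hypothetical helper: ENOTCONN is the errno behind the
    "Transport endpoint is not connected" message on Linux.
    Any other OSError is re-raised unchanged.
    """
    try:
        os.statvfs(path)
    except OSError as exc:
        if exc.errno == errno.ENOTCONN:
            return False
        raise
    return True


if __name__ == "__main__":
    # "/cephfs" is the mount point from the report; adjust as needed.
    for mount_point in ("/", "/cephfs"):
        try:
            print(mount_point, mount_is_connected(mount_point))
        except OSError as exc:
            print(mount_point, "other error:", exc)
```

A dead mount usually has to be unmounted (for example with `fusermount -u`) and mounted again before it becomes usable.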

In ceph-client.cephfs.log.1:

2015-11-06 17:03:36.713548 7f96857fa700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f96857fa700

 ceph version 9.1.0 (3be81ae6cf17fcf689cd6f187c4615249fea4f61)
 1: (()+0x25aa1a) [0x55fcb870ca1a]
 2: (()+0x10340) [0x7f96ae901340]
 3: (Client::check_pool_perm(Inode*, int)+0x335) [0x55fcb864bac5]
 4: (Client::get_caps(Inode*, int, int, int*, long)+0x2f) [0x55fcb864cc1f]
 5: (Client::_read(Fh*, long, unsigned long, ceph::buffer::list*)+0x205) [0x55fcb8663f25]
 6: (Client::ll_read(Fh*, long, long, ceph::buffer::list*)+0x8f) [0x55fcb86648ff]
 7: (()+0x17455b) [0x55fcb862655b]
 8: (()+0x1481e) [0x7f96aef8381e]
 9: (()+0x1522b) [0x7f96aef8422b]
 10: (()+0x11e49) [0x7f96aef80e49]
 11: (()+0x8182) [0x7f96ae8f9182]
 12: (clone()+0x6d) [0x7f96ad48347d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- begin dump of recent events ---
-10000> 2015-11-06 17:03:11.936372 7f96867fc700  3 client.5076 ll_forget 1000003028f 1
 -9999> 2015-11-06 17:03:11.936376 7f96867fc700  3 client.5076 ll_getattr 10000030291.14
 -9998> 2015-11-06 17:03:11.936378 7f96867fc700  3 client.5076 ll_getattr 10000030291.14 = 0
 -9997> 2015-11-06 17:03:11.936382 7f96867fc700  3 client.5076 ll_forget 10000030291 1
 -9996> 2015-11-06 17:03:11.936385 7f96867fc700  3 client.5076 ll_lookup 0x7f96940f2d20 2009
 -9995> 2015-11-06 17:03:11.936387 7f96867fc700  3 client.5076 ll_lookup 0x7f96940f2d20 2009 -> 0 (10000030294)
...
   -13> 2015-11-06 17:03:36.712522 7f96867fc700  3 client.5076 ll_getattr 10000032da8.14 = 0
   -12> 2015-11-06 17:03:36.712525 7f96867fc700  3 client.5076 ll_forget 10000032da8 1
   -11> 2015-11-06 17:03:36.712531 7f96867fc700  3 client.5076 ll_open 10000032da8.14 32768
   -10> 2015-11-06 17:03:36.712540 7f96867fc700  5 client.5076 open success, fh is 0x7f9678053aa0 combined IMMUTABLE SNAP caps pAsLsXsFscr
    -9> 2015-11-06 17:03:36.712580 7f96867fc700  3 client.5076 ll_open 10000032da8.14 32768 = 0 (0x7f9678053aa0)
    -8> 2015-11-06 17:03:36.712587 7f96867fc700  3 client.5076 ll_forget 10000032da8 1
    -7> 2015-11-06 17:03:36.712599 7f9685ffb700  3 client.5076 ll_getattr 10000032da8.14
    -6> 2015-11-06 17:03:36.712605 7f9685ffb700  3 client.5076 ll_getattr 10000032da8.14 = 0
    -5> 2015-11-06 17:03:36.712609 7f9685ffb700  3 client.5076 ll_forget 10000032da8 1
    -4> 2015-11-06 17:03:36.712624 7f9684ff9700  3 client.5076 ll_getattr 10000032da8.14
    -3> 2015-11-06 17:03:36.712633 7f9684ff9700  3 client.5076 ll_getattr 10000032da8.14 = 0
    -2> 2015-11-06 17:03:36.712638 7f9684ff9700  3 client.5076 ll_forget 10000032da8 1
    -1> 2015-11-06 17:03:36.712650 7f96857fa700  3 client.5076 ll_read 0x7f9678053aa0 10000032da8  0~4096
     0> 2015-11-06 17:03:36.713548 7f96857fa700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f96857fa700

 ceph version 9.1.0 (3be81ae6cf17fcf689cd6f187c4615249fea4f61)
 1: (()+0x25aa1a) [0x55fcb870ca1a]
 2: (()+0x10340) [0x7f96ae901340]
 3: (Client::check_pool_perm(Inode*, int)+0x335) [0x55fcb864bac5]
 4: (Client::get_caps(Inode*, int, int, int*, long)+0x2f) [0x55fcb864cc1f]
 5: (Client::_read(Fh*, long, unsigned long, ceph::buffer::list*)+0x205) [0x55fcb8663f25]
 6: (Client::ll_read(Fh*, long, long, ceph::buffer::list*)+0x8f) [0x55fcb86648ff]
 7: (()+0x17455b) [0x55fcb862655b]
 8: (()+0x1481e) [0x7f96aef8381e]
 9: (()+0x1522b) [0x7f96aef8422b]
 10: (()+0x11e49) [0x7f96aef80e49]
 11: (()+0x8182) [0x7f96ae8f9182]
 12: (clone()+0x6d) [0x7f96ad48347d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
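For anyone triaging similar logs, the backtrace lines above follow a regular `N: (symbol+0xoffset) [0xaddress]` layout, so the frames with resolved symbols can be pulled out mechanically. The following is a rough sketch, assuming that layout holds for the whole log. The names `FRAME_RE` and `parse_frames` are made up for this example and are not part of Ceph.

```python
import re

# Matches lines such as:
#  3: (Client::check_pool_perm(Inode*, int)+0x335) [0x55fcb864bac5]
# Unresolved frames show "()" in place of a symbol name.
FRAME_RE = re.compile(
    r"^\s*(?P<index>\d+): "
    r"\((?P<symbol>.*)\+0x(?P<offset>[0-9a-f]+)\) "
    r"\[0x(?P<address>[0-9a-f]+)\]\s*$"
)


def parse_frames(log_text):
    """Return a list of dicts, one per backtrace frame found in log_text."""
    frames = []
    for line in log_text.splitlines():
        match = FRAME_RE.match(line)
        if match is None:
            continue  # not a frame line (timestamps, NOTE, events, ...)
        symbol = match.group("symbol")
        frames.append({
            "index": int(match.group("index")),
            "symbol": None if symbol == "()" else symbol,
            "offset": int(match.group("offset"), 16),
            "address": int(match.group("address"), 16),
        })
    return frames


# A few lines copied from the trace in this report.
SAMPLE = """\
 1: (()+0x25aa1a) [0x55fcb870ca1a]
 2: (()+0x10340) [0x7f96ae901340]
 3: (Client::check_pool_perm(Inode*, int)+0x335) [0x55fcb864bac5]
 4: (Client::get_caps(Inode*, int, int, int*, long)+0x2f) [0x55fcb864cc1f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
"""

if __name__ == "__main__":
    named = [f for f in parse_frames(SAMPLE) if f["symbol"] is not None]
    # First frame with a resolved symbol, i.e. the innermost named function.
    print(named[0]["symbol"], hex(named[0]["offset"]))
```

On the sample above, the first named frame is `Client::check_pool_perm(Inode*, int)` at offset `0x335`, which matches frame 3 of the trace. Frames with no symbol still need the executable and `objdump`, as the NOTE line says.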

System info

ceph -v
ceph version 9.1.0 (3be81ae6cf17fcf689cd6f187c4615249fea4f61)

dpkg -l | grep fuse
ii  ceph-fuse                             9.1.0-1trusty                    amd64        FUSE-based client for the Ceph distributed file system
ii  fuse                                  2.9.2-4ubuntu4.14.04.1           amd64        Filesystem in Userspace
ii  libfuse2:amd64                        2.9.2-4ubuntu4.14.04.1           amd64        Filesystem in Userspace (library)

uname -a
Linux dfadm01 4.3.0-040300-generic #201511020949 SMP Mon Nov 2 14:50:44 UTC 2015 x86_64 x86_64 x86_64 GNU/Linux

grep ceph /etc/fstab
id=cephfs,keyring=/etc/ceph/client.cephfs.keyring /cephfs fuse.ceph noatime,_netdev,noauto 0 0

I have attached the whole log file.


Files

ceph-client.cephfs.log.1.gz (88.4 KB) Eric Eastman, 11/06/2015 11:23 PM
client-snapc.patch (1.05 KB) Zheng Yan, 11/09/2015 02:40 AM

Related issues 1 (0 open, 1 closed)

Copied to CephFS - Backport #13889: infernalis: Segmentation fault accessing file using fuse mount - Resolved - Abhishek Varshney
