Bug #17184

closed

"Segmentation fault" in samba-jewel---basic-mira

Added by Yuri Weinstein over 7 years ago. Updated about 5 years ago.

Status:
Rejected
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
samba
Component(FS):
Labels (FS):
Samba/CIFS
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This is for the jewel 10.2.3 release.
It seems to be confirmed by the last several runs.

Runs:
http://pulpito.ceph.com/teuthology-2016-08-31_02:35:02-samba-jewel---basic-mira/
http://pulpito.ceph.com/teuthology-2016-08-28_02:35:02-samba-jewel---basic-mira/
Job: 393919
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-08-31_02:35:02-samba-jewel---basic-mira/393919/teuthology.log

  yuriw@teuthology ~ [15:44:24]> zgrep -i '^ ceph version' /a/teuthology-2016-08-31_02:35:02-samba-jewel---basic-mira/393919/remote/mira048/log/* -b40 -a30
0-2016-08-31 02:52:42.006379 7f46f17caec0  0 ceph version 10.2.2-508-g9bfc0cf (9bfc0cf178dc21b0fe33e0ce3b90a18858abaf1b), process ceph-fuse, pid 31746
149-2016-08-31 02:52:42.008716 7f46f17caec0 -1 init, newargv = 0x55bf6ee9ac60 newargc=11
234-2016-08-31 03:25:17.683959 7f46f17caec0  0 client.4119  destroyed lost open file 0x55bf6eec4f00 on 10000000091.head(faked_ino=0 ref=3675 ll_ref=81918 cap_refs={} open={1=3673} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
589-2016-08-31 03:25:17.685061 7f46f17caec0 -1 *** Caught signal (Segmentation fault) **
674- in thread 7f46f17caec0 thread_name:ceph-fuse
720-
721: ceph version 10.2.2-508-g9bfc0cf (9bfc0cf178dc21b0fe33e0ce3b90a18858abaf1b)
798- 1: (()+0x29d742) [0x55bf63a34742]
833- 2: (()+0x10330) [0x7f46f0af5330]
867- 3: (Client::_release_fh(Fh*)+0x1b) [0x55bf6398ff4b]
920- 4: (Client::unmount()+0x4c7) [0x55bf6399ebc7]
967- 5: (main()+0x5d5) [0x55bf6392ba85]
1003- 6: (__libc_start_main()+0xf5) [0x7f46ef383f45]
1051- 7: (()+0x198f17) [0x55bf6392ff17]
1086- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
1179-
1180---- begin dump of recent events ---
1216--10000> 2016-08-31 03:25:17.283870 7f46dce22700  5 client.4119 _release_fh 0x55bf6ff85200 mode 1 on 10000000091.head(faked_ino=0 ref=8664 ll_ref=81918 cap_refs={} open={1=8662} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
1572- -9999> 2016-08-31 03:25:17.283898 7f46dc621700  3 client.4119 ll_release (fh)0x55bf6ff8dc80 10000000091 
1678- -9998> 2016-08-31 03:25:17.283932 7f46dc621700  5 client.4119 _release_fh 0x55bf6ff8dc80 mode 1 on 10000000091.head(faked_ino=0 ref=8663 ll_ref=81918 cap_refs={} open={1=8661} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2034- -9997> 2016-08-31 03:25:17.283960 7f46e1e2c700  3 client.4119 ll_release (fh)0x55bf6f4eba80 10000000091 
2140- -9996> 2016-08-31 03:25:17.283993 7f46e1e2c700  5 client.4119 _release_fh 0x55bf6f4eba80 mode 1 on 10000000091.head(faked_ino=0 ref=8662 ll_ref=81918 cap_refs={} open={1=8660} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2496- -9995> 2016-08-31 03:25:17.284018 7f46e3232700  3 client.4119 ll_release (fh)0x55bf6ff97000 10000000091 
2602- -9994> 2016-08-31 03:25:17.284043 7f46e3232700  5 client.4119 _release_fh 0x55bf6ff97000 mode 1 on 10000000091.head(faked_ino=0 ref=8661 ll_ref=81918 cap_refs={} open={1=8659} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2958- -9993> 2016-08-31 03:25:17.284070 7f46dfe28700  3 client.4119 ll_release (fh)0x55bf6ff6c180 10000000091 
3064- -9992> 2016-08-31 03:25:17.284110 7f46dfe28700  5 client.4119 _release_fh 0x55bf6ff6c180 mode 1 on 10000000091.head(faked_ino=0 ref=8660 ll_ref=81918 cap_refs={} open={1=8658} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
3420- -9991> 2016-08-31 03:25:17.284137 7f46e2a31700  3 client.4119 ll_release (fh)0x55bf6ff8de00 10000000091 
3526- -9990> 2016-08-31 03:25:17.284168 7f46e2a31700  5 client.4119 _release_fh 0x55bf6ff8de00 mode 1 on 10000000091.head(faked_ino=0 ref=8659 ll_ref=81918 cap_refs={} open={1=8657} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
3882- -9989> 2016-08-31 03:25:17.284197 7f46e0e2a700  3 client.4119 ll_release (fh)0x55bf6ff7d200 10000000091 
3988- -9988> 2016-08-31 03:25:17.284230 7f46e0e2a700  5 client.4119 _release_fh 0x55bf6ff7d200 mode 1 on 10000000091.head(faked_ino=0 ref=8658 ll_ref=81918 cap_refs={} open={1=8656} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
4344- -9987> 2016-08-31 03:25:17.284257 7f46dbe20700  3 client.4119 ll_release (fh)0x55bf6ff78a80 10000000091 
4450- -9986> 2016-08-31 03:25:17.284287 7f46dbe20700  5 client.4119 _release_fh 0x55bf6ff78a80 mode 1 on 10000000091.head(faked_ino=0 ref=8657 ll_ref=81918 cap_refs={} open={1=8655} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
4806- -9985> 2016-08-31 03:25:17.284315 7f46dd623700  3 client.4119 ll_release (fh)0x55bf6ff70a80 10000000091 
4912- -9984> 2016-08-31 03:25:17.284352 7f46dd623700  5 client.4119 _release_fh 0x55bf6ff70a80 mode 1 on 10000000091.head(faked_ino=0 ref=8656 ll_ref=81918 cap_refs={} open={1=8654} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
5268- -9983> 2016-08-31 03:25:17.284378 7f46df627700  3 client.4119 ll_release (fh)0x55bf6f4ebc00 10000000091 
5374- -9982> 2016-08-31 03:25:17.284411 7f46df627700  5 client.4119 _release_fh 0x55bf6f4ebc00 mode 1 on 10000000091.head(faked_ino=0 ref=8655 ll_ref=81918 cap_refs={} open={1=8653} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
5730- -9981> 2016-08-31 03:25:17.284441 7f46e4234700  3 client.4119 ll_release (fh)0x55bf6ff85380 10000000091 
--
2302980-   -27> 2016-08-31 03:25:17.681775 7f46dd623700  5 client.4119 _release_fh 0x55bf6fc86880 mode 1 on 10000000091.head(faked_ino=0 ref=3686 ll_ref=81918 cap_refs={} open={1=3684} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2303336-   -26> 2016-08-31 03:25:17.681879 7f46dee26700  3 client.4119 ll_release (fh)0x55bf6fcc0180 10000000091 
2303442-   -25> 2016-08-31 03:25:17.681949 7f46dee26700  5 client.4119 _release_fh 0x55bf6fcc0180 mode 1 on 10000000091.head(faked_ino=0 ref=3685 ll_ref=81918 cap_refs={} open={1=3683} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2303798-   -24> 2016-08-31 03:25:17.682014 7f46e4234700  3 client.4119 ll_release (fh)0x55bf6fc87a80 10000000091 
2303904-   -23> 2016-08-31 03:25:17.682098 7f46e4234700  5 client.4119 _release_fh 0x55bf6fc87a80 mode 1 on 10000000091.head(faked_ino=0 ref=3684 ll_ref=81918 cap_refs={} open={1=3682} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2304260-   -22> 2016-08-31 03:25:17.682160 7f46dce22700  3 client.4119 ll_release (fh)0x55bf6fcc8480 10000000091 
2304366-   -21> 2016-08-31 03:25:17.682235 7f46dce22700  5 client.4119 _release_fh 0x55bf6fcc8480 mode 1 on 10000000091.head(faked_ino=0 ref=3683 ll_ref=81918 cap_refs={} open={1=3681} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2304722-   -20> 2016-08-31 03:25:17.682309 7f46df627700  3 client.4119 ll_release (fh)0x55bf6fcde100 10000000091 
2304828-   -19> 2016-08-31 03:25:17.682376 7f46df627700  5 client.4119 _release_fh 0x55bf6fcde100 mode 1 on 10000000091.head(faked_ino=0 ref=3682 ll_ref=81918 cap_refs={} open={1=3680} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2305184-   -18> 2016-08-31 03:25:17.682466 7f46dbe20700  3 client.4119 ll_release (fh)0x55bf6fcc4300 10000000091 
2305290-   -17> 2016-08-31 03:25:17.682482 7f46dbe20700  5 client.4119 _release_fh 0x55bf6fcc4300 mode 1 on 10000000091.head(faked_ino=0 ref=3681 ll_ref=81918 cap_refs={} open={1=3679} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2305646-   -16> 2016-08-31 03:25:17.682579 7f46e0e2a700  3 client.4119 ll_release (fh)0x55bf6fce6b80 10000000091 
2305752-   -15> 2016-08-31 03:25:17.682638 7f46e0e2a700  5 client.4119 _release_fh 0x55bf6fce6b80 mode 1 on 10000000091.head(faked_ino=0 ref=3680 ll_ref=81918 cap_refs={} open={1=3678} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2306108-   -14> 2016-08-31 03:25:17.682742 7f46e1e2c700  3 client.4119 ll_forget 10000000092 2
2306195-   -13> 2016-08-31 03:25:17.682905 7f46dfe28700  3 client.4119 ll_release (fh)0x55bf6fcd5380 10000000091 
2306301-   -12> 2016-08-31 03:25:17.682916 7f46dfe28700  5 client.4119 _release_fh 0x55bf6fcd5380 mode 1 on 10000000091.head(faked_ino=0 ref=3679 ll_ref=81918 cap_refs={} open={1=3677} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2306657-   -11> 2016-08-31 03:25:17.682951 7f46e0629700  3 client.4119 ll_forget 1000000005e 21
2306745-   -10> 2016-08-31 03:25:17.683072 7f46e3a33700  3 client.4119 ll_forget 1000000000c 48
2306833-    -9> 2016-08-31 03:25:17.683087 7f46e3a33700  3 client.4119 ll_forget 1 2
2306910-    -8> 2016-08-31 03:25:17.683102 7f46da61d700  3 client.4119 ll_release (fh)0x55bf6fcd8f00 10000000091 
2307016-    -7> 2016-08-31 03:25:17.683184 7f46da61d700  5 client.4119 _release_fh 0x55bf6fcd8f00 mode 1 on 10000000091.head(faked_ino=0 ref=3678 ll_ref=81918 cap_refs={} open={1=3676} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2307372-    -6> 2016-08-31 03:25:17.683237 7f46e162b700  3 client.4119 ll_release (fh)0x55bf6fcd5200 10000000091 
2307478-    -5> 2016-08-31 03:25:17.683306 7f46e162b700  5 client.4119 _release_fh 0x55bf6fcd5200 mode 1 on 10000000091.head(faked_ino=0 ref=3677 ll_ref=81918 cap_refs={} open={1=3675} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2307834-    -4> 2016-08-31 03:25:17.683399 7f46e3232700  3 client.4119 ll_release (fh)0x55bf6fce6d00 10000000091 
2307940-    -3> 2016-08-31 03:25:17.683479 7f46e3232700  5 client.4119 _release_fh 0x55bf6fce6d00 mode 1 on 10000000091.head(faked_ino=0 ref=3676 ll_ref=81918 cap_refs={} open={1=3674} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2308296-    -2> 2016-08-31 03:25:17.683943 7f46f17caec0  2 client.4119 unmounting
2308370-    -1> 2016-08-31 03:25:17.683959 7f46f17caec0  0 client.4119  destroyed lost open file 0x55bf6eec4f00 on 10000000091.head(faked_ino=0 ref=3675 ll_ref=81918 cap_refs={} open={1=3673} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
2308733-     0> 2016-08-31 03:25:17.685061 7f46f17caec0 -1 *** Caught signal (Segmentation fault) **
2308826- in thread 7f46f17caec0 thread_name:ceph-fuse
2308872-
2308873: ceph version 10.2.2-508-g9bfc0cf (9bfc0cf178dc21b0fe33e0ce3b90a18858abaf1b)
2308950- 1: (()+0x29d742) [0x55bf63a34742]
2308985- 2: (()+0x10330) [0x7f46f0af5330]
2309019- 3: (Client::_release_fh(Fh*)+0x1b) [0x55bf6398ff4b]
2309072- 4: (Client::unmount()+0x4c7) [0x55bf6399ebc7]
2309119- 5: (main()+0x5d5) [0x55bf6392ba85]
2309155- 6: (__libc_start_main()+0xf5) [0x7f46ef383f45]
2309203- 7: (()+0x198f17) [0x55bf6392ff17]
2309238- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Actions #1

Updated by Greg Farnum over 7 years ago

  • Project changed from Ceph to CephFS
  • Category set to 43
  • Priority changed from Urgent to High
Actions #2

Updated by Greg Farnum over 7 years ago

Okay, there's also one at http://pulpito.ceph.com/teuthology-2016-08-14_02:35:02-samba-jewel---basic-mira/

That seems to be the first. Going through the commits, there's nothing clearly responsible for this, especially since the failure is intermittent. I pushed jewel-samba-1 as a first guess for bisecting and will run it through several times once built, though. :/

Actions #3

Updated by Greg Farnum over 7 years ago

  • Status changed from New to 12

Well, I ran the test in question (samba-basic, btrfs, install, fuse, smbtorture) 9 times and got 4 failures. There aren't any distinguishing differences I can see (both passes and failures are on ubuntu, etc.). This is going to be tedious, but I'm scheduling more runs and building another branch to try and do a useful bisect. :(

Actions #4

Updated by Greg Farnum over 7 years ago

Ran 6 times against commit:e400999a2cb0972919e35dd8510f8d85f48ceace (jewel-samba-1) and got zero failures. That's one prior to the last backport to client code that coincided with the failures starting to show up (I've no idea why commit:f0146196ccfbcfd923191f63d93e4e81219523b1, merging in "client: fstat should take CEPH_STAT_CAP_INODE_ALL", would break anything, but based on timing it's my first guess), which the gitbuilders are (re)building now for more tests. If the failures show up there, we'll have a culprit; otherwise more difficult bisecting will commence.
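
As a rough check on how much weight six clean runs carry: assuming the runs are independent and the 4-in-9 failure rate from the earlier batch still applied, the chance of six passes in a row would be (5/9)^6 = 15625/531441, roughly 3%. The helper below is only a sketch of that arithmetic; `prob_all_pass` is a hypothetical name, and the independence assumption may not hold for an intermittent failure like this one.

```cpp
#include <cassert>
#include <cmath>

// Probability of seeing zero failures in `runs` independent runs when each
// run fails with probability `fail_rate`. Independence is an assumption.
double prob_all_pass(double fail_rate, int runs) {
    assert(fail_rate >= 0.0 && fail_rate <= 1.0 && runs >= 0);
    return std::pow(1.0 - fail_rate, runs);
}
```

Calling `prob_all_pass(4.0 / 9.0, 6)` gives the figure above, so zero failures in six runs is suggestive but not conclusive on its own.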

Actions #5

Updated by Zheng Yan over 7 years ago

  while (!ll_unclosed_fh_set.empty()) {
    set<Fh*>::iterator it = ll_unclosed_fh_set.begin();
    ll_unclosed_fh_set.erase(*it);
    ldout(cct, 0) << " destroyed lost open file " << *it << " on " << *((*it)->inode) << dendl;
    _release_fh(*it);
                ^^^ the iterator is invalid because it was deleted above
  }
Actions #6

Updated by Greg Farnum over 7 years ago

  • Assignee set to Zheng Yan

Can you fix that up and make sure it was just a backport error, Zheng? We aren't seeing this in master at all so I presume it's working there...

Actions #7

Updated by chuan jiang over 7 years ago

Maybe this is a solved problem

  while (!ll_unclosed_fh_set.empty()) {
    set<Fh*>::iterator it = ll_unclosed_fh_set.begin();
    Fh *fh = *it;
    ll_unclosed_fh_set.erase(fh);
    ldout(cct, 0) << " destroyed lost open file " << fh << " on " << *(fh->inode) << dendl;
    _release_fh(fh);
  }
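
For anyone wanting to try the pattern outside of Ceph, here is a minimal standalone sketch of the same idea: copy the pointer out of the set before erasing, then use only the copy. `Handle` and `drain_handles` are hypothetical stand-ins, not the client's `Fh` or `_release_fh`, and `delete` merely takes the place of the real release call.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for a file handle; not Ceph's Fh.
struct Handle {
    std::string name;
};

// Drain the set and release every handle. The pointer is copied out of the
// set before the element is erased, so nothing is read through the
// iterator after erase() has invalidated it.
std::vector<std::string> drain_handles(std::set<Handle*>& open_handles) {
    std::vector<std::string> released;
    while (!open_handles.empty()) {
        std::set<Handle*>::iterator it = open_handles.begin();
        Handle* h = *it;          // copy the value while the iterator is valid
        assert(h != nullptr);
        open_handles.erase(it);   // 'it' must not be dereferenced after this
        released.push_back(h->name);
        delete h;                 // stands in for the real release call
    }
    return released;
}
```

Erasing by key, as in the snippet above, or by iterator, as here, both invalidate the iterator to the removed element. What matters is that the loop body only touches the saved pointer afterwards.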
Actions #8

Updated by John Spray over 7 years ago

The area of code mentioned above was fixed in master for http://tracker.ceph.com/issues/16764

That wasn't marked for backport because the issue had only come up in a somewhat obscure case, but on reflection we should backport it.

Actions #9

Updated by Patrick Donnelly about 5 years ago

  • Status changed from 12 to Rejected
Actions #10

Updated by Patrick Donnelly about 5 years ago

  • Category deleted (43)
  • Labels (FS) Samba/CIFS added