Bug #17184
closed
"Segmentation fault" in samba-jewel---basic-mira
Description
This is for the jewel 10.2.3 release.
It seems to be confirmed by the last several runs.
Runs:
http://pulpito.ceph.com/teuthology-2016-08-31_02:35:02-samba-jewel---basic-mira/
http://pulpito.ceph.com/teuthology-2016-08-28_02:35:02-samba-jewel---basic-mira/
Job: 393919
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-08-31_02:35:02-samba-jewel---basic-mira/393919/teuthology.log
yuriw@teuthology ~ [15:44:24]> zgrep -i '^ ceph version' /a/teuthology-2016-08-31_02:35:02-samba-jewel---basic-mira/393919/remote/mira048/log/* -b40 -a30
0-2016-08-31 02:52:42.006379 7f46f17caec0 0 ceph version 10.2.2-508-g9bfc0cf (9bfc0cf178dc21b0fe33e0ce3b90a18858abaf1b), process ceph-fuse, pid 31746
149-2016-08-31 02:52:42.008716 7f46f17caec0 -1 init, newargv = 0x55bf6ee9ac60 newargc=11
234-2016-08-31 03:25:17.683959 7f46f17caec0 0 client.4119 destroyed lost open file 0x55bf6eec4f00 on 10000000091.head(faked_ino=0 ref=3675 ll_ref=81918 cap_refs={} open={1=3673} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00)
589-2016-08-31 03:25:17.685061 7f46f17caec0 -1 *** Caught signal (Segmentation fault) **
674- in thread 7f46f17caec0 thread_name:ceph-fuse
720-
721: ceph version 10.2.2-508-g9bfc0cf (9bfc0cf178dc21b0fe33e0ce3b90a18858abaf1b)
798- 1: (()+0x29d742) [0x55bf63a34742]
833- 2: (()+0x10330) [0x7f46f0af5330]
867- 3: (Client::_release_fh(Fh*)+0x1b) [0x55bf6398ff4b]
920- 4: (Client::unmount()+0x4c7) [0x55bf6399ebc7]
967- 5: (main()+0x5d5) [0x55bf6392ba85]
1003- 6: (__libc_start_main()+0xf5) [0x7f46ef383f45]
1051- 7: (()+0x198f17) [0x55bf6392ff17]
1086- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
1179- 1180---- begin dump of recent events --- 1216--10000> 2016-08-31 03:25:17.283870 7f46dce22700 5 client.4119 _release_fh 0x55bf6ff85200 mode 1 on 10000000091.head(faked_ino=0 ref=8664 ll_ref=81918 cap_refs={} open={1=8662} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 1572- -9999> 2016-08-31 03:25:17.283898 7f46dc621700 3 client.4119 ll_release (fh)0x55bf6ff8dc80 10000000091 1678- -9998> 2016-08-31 03:25:17.283932 7f46dc621700 5 client.4119 _release_fh 0x55bf6ff8dc80 mode 1 on 10000000091.head(faked_ino=0 ref=8663 ll_ref=81918 cap_refs={} open={1=8661} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2034- -9997> 2016-08-31 03:25:17.283960 7f46e1e2c700 3 client.4119 ll_release (fh)0x55bf6f4eba80 10000000091 2140- -9996> 2016-08-31 03:25:17.283993 7f46e1e2c700 5 client.4119 _release_fh 0x55bf6f4eba80 mode 1 on 10000000091.head(faked_ino=0 ref=8662 ll_ref=81918 cap_refs={} open={1=8660} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2496- -9995> 2016-08-31 03:25:17.284018 7f46e3232700 3 client.4119 ll_release (fh)0x55bf6ff97000 10000000091 2602- -9994> 2016-08-31 03:25:17.284043 7f46e3232700 5 client.4119 _release_fh 0x55bf6ff97000 mode 1 on 10000000091.head(faked_ino=0 ref=8661 ll_ref=81918 cap_refs={} open={1=8659} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2958- -9993> 2016-08-31 03:25:17.284070 7f46dfe28700 3 client.4119 ll_release (fh)0x55bf6ff6c180 10000000091 3064- -9992> 2016-08-31 03:25:17.284110 7f46dfe28700 5 client.4119 
_release_fh 0x55bf6ff6c180 mode 1 on 10000000091.head(faked_ino=0 ref=8660 ll_ref=81918 cap_refs={} open={1=8658} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 3420- -9991> 2016-08-31 03:25:17.284137 7f46e2a31700 3 client.4119 ll_release (fh)0x55bf6ff8de00 10000000091 3526- -9990> 2016-08-31 03:25:17.284168 7f46e2a31700 5 client.4119 _release_fh 0x55bf6ff8de00 mode 1 on 10000000091.head(faked_ino=0 ref=8659 ll_ref=81918 cap_refs={} open={1=8657} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 3882- -9989> 2016-08-31 03:25:17.284197 7f46e0e2a700 3 client.4119 ll_release (fh)0x55bf6ff7d200 10000000091 3988- -9988> 2016-08-31 03:25:17.284230 7f46e0e2a700 5 client.4119 _release_fh 0x55bf6ff7d200 mode 1 on 10000000091.head(faked_ino=0 ref=8658 ll_ref=81918 cap_refs={} open={1=8656} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 4344- -9987> 2016-08-31 03:25:17.284257 7f46dbe20700 3 client.4119 ll_release (fh)0x55bf6ff78a80 10000000091 4450- -9986> 2016-08-31 03:25:17.284287 7f46dbe20700 5 client.4119 _release_fh 0x55bf6ff78a80 mode 1 on 10000000091.head(faked_ino=0 ref=8657 ll_ref=81918 cap_refs={} open={1=8655} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 4806- -9985> 2016-08-31 03:25:17.284315 7f46dd623700 3 client.4119 ll_release (fh)0x55bf6ff70a80 10000000091 4912- -9984> 2016-08-31 03:25:17.284352 7f46dd623700 5 client.4119 _release_fh 0x55bf6ff70a80 mode 1 on 10000000091.head(faked_ino=0 ref=8656 ll_ref=81918 cap_refs={} open={1=8654} 
mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 5268- -9983> 2016-08-31 03:25:17.284378 7f46df627700 3 client.4119 ll_release (fh)0x55bf6f4ebc00 10000000091 5374- -9982> 2016-08-31 03:25:17.284411 7f46df627700 5 client.4119 _release_fh 0x55bf6f4ebc00 mode 1 on 10000000091.head(faked_ino=0 ref=8655 ll_ref=81918 cap_refs={} open={1=8653} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 5730- -9981> 2016-08-31 03:25:17.284441 7f46e4234700 3 client.4119 ll_release (fh)0x55bf6ff85380 10000000091 -- 2302980- -27> 2016-08-31 03:25:17.681775 7f46dd623700 5 client.4119 _release_fh 0x55bf6fc86880 mode 1 on 10000000091.head(faked_ino=0 ref=3686 ll_ref=81918 cap_refs={} open={1=3684} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2303336- -26> 2016-08-31 03:25:17.681879 7f46dee26700 3 client.4119 ll_release (fh)0x55bf6fcc0180 10000000091 2303442- -25> 2016-08-31 03:25:17.681949 7f46dee26700 5 client.4119 _release_fh 0x55bf6fcc0180 mode 1 on 10000000091.head(faked_ino=0 ref=3685 ll_ref=81918 cap_refs={} open={1=3683} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2303798- -24> 2016-08-31 03:25:17.682014 7f46e4234700 3 client.4119 ll_release (fh)0x55bf6fc87a80 10000000091 2303904- -23> 2016-08-31 03:25:17.682098 7f46e4234700 5 client.4119 _release_fh 0x55bf6fc87a80 mode 1 on 10000000091.head(faked_ino=0 ref=3684 ll_ref=81918 cap_refs={} open={1=3682} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 
objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2304260- -22> 2016-08-31 03:25:17.682160 7f46dce22700 3 client.4119 ll_release (fh)0x55bf6fcc8480 10000000091 2304366- -21> 2016-08-31 03:25:17.682235 7f46dce22700 5 client.4119 _release_fh 0x55bf6fcc8480 mode 1 on 10000000091.head(faked_ino=0 ref=3683 ll_ref=81918 cap_refs={} open={1=3681} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2304722- -20> 2016-08-31 03:25:17.682309 7f46df627700 3 client.4119 ll_release (fh)0x55bf6fcde100 10000000091 2304828- -19> 2016-08-31 03:25:17.682376 7f46df627700 5 client.4119 _release_fh 0x55bf6fcde100 mode 1 on 10000000091.head(faked_ino=0 ref=3682 ll_ref=81918 cap_refs={} open={1=3680} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2305184- -18> 2016-08-31 03:25:17.682466 7f46dbe20700 3 client.4119 ll_release (fh)0x55bf6fcc4300 10000000091 2305290- -17> 2016-08-31 03:25:17.682482 7f46dbe20700 5 client.4119 _release_fh 0x55bf6fcc4300 mode 1 on 10000000091.head(faked_ino=0 ref=3681 ll_ref=81918 cap_refs={} open={1=3679} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2305646- -16> 2016-08-31 03:25:17.682579 7f46e0e2a700 3 client.4119 ll_release (fh)0x55bf6fce6b80 10000000091 2305752- -15> 2016-08-31 03:25:17.682638 7f46e0e2a700 5 client.4119 _release_fh 0x55bf6fce6b80 mode 1 on 10000000091.head(faked_ino=0 ref=3680 ll_ref=81918 cap_refs={} open={1=3678} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2306108- -14> 2016-08-31 03:25:17.682742 
7f46e1e2c700 3 client.4119 ll_forget 10000000092 2 2306195- -13> 2016-08-31 03:25:17.682905 7f46dfe28700 3 client.4119 ll_release (fh)0x55bf6fcd5380 10000000091 2306301- -12> 2016-08-31 03:25:17.682916 7f46dfe28700 5 client.4119 _release_fh 0x55bf6fcd5380 mode 1 on 10000000091.head(faked_ino=0 ref=3679 ll_ref=81918 cap_refs={} open={1=3677} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2306657- -11> 2016-08-31 03:25:17.682951 7f46e0629700 3 client.4119 ll_forget 1000000005e 21 2306745- -10> 2016-08-31 03:25:17.683072 7f46e3a33700 3 client.4119 ll_forget 1000000000c 48 2306833- -9> 2016-08-31 03:25:17.683087 7f46e3a33700 3 client.4119 ll_forget 1 2 2306910- -8> 2016-08-31 03:25:17.683102 7f46da61d700 3 client.4119 ll_release (fh)0x55bf6fcd8f00 10000000091 2307016- -7> 2016-08-31 03:25:17.683184 7f46da61d700 5 client.4119 _release_fh 0x55bf6fcd8f00 mode 1 on 10000000091.head(faked_ino=0 ref=3678 ll_ref=81918 cap_refs={} open={1=3676} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2307372- -6> 2016-08-31 03:25:17.683237 7f46e162b700 3 client.4119 ll_release (fh)0x55bf6fcd5200 10000000091 2307478- -5> 2016-08-31 03:25:17.683306 7f46e162b700 5 client.4119 _release_fh 0x55bf6fcd5200 mode 1 on 10000000091.head(faked_ino=0 ref=3677 ll_ref=81918 cap_refs={} open={1=3675} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2307834- -4> 2016-08-31 03:25:17.683399 7f46e3232700 3 client.4119 ll_release (fh)0x55bf6fce6d00 10000000091 2307940- -3> 2016-08-31 03:25:17.683479 7f46e3232700 5 client.4119 _release_fh 0x55bf6fce6d00 mode 1 on 10000000091.head(faked_ino=0 ref=3676 ll_ref=81918 
cap_refs={} open={1=3674} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2308296- -2> 2016-08-31 03:25:17.683943 7f46f17caec0 2 client.4119 unmounting 2308370- -1> 2016-08-31 03:25:17.683959 7f46f17caec0 0 client.4119 destroyed lost open file 0x55bf6eec4f00 on 10000000091.head(faked_ino=0 ref=3675 ll_ref=81918 cap_refs={} open={1=3673} mode=100744 size=0/0 mtime=2016-08-31 03:19:48.568078 caps=pAsLsXsFscr(0=pAsLsXsFscr) objectset[10000000091 ts 0/0 objects 0 dirty_or_tx 0] parents=0x55bf6f65ff00 0x55bf6f201e00) 2308733- 0> 2016-08-31 03:25:17.685061 7f46f17caec0 -1 *** Caught signal (Segmentation fault) ** 2308826- in thread 7f46f17caec0 thread_name:ceph-fuse 2308872- 2308873: ceph version 10.2.2-508-g9bfc0cf (9bfc0cf178dc21b0fe33e0ce3b90a18858abaf1b) 2308950- 1: (()+0x29d742) [0x55bf63a34742] 2308985- 2: (()+0x10330) [0x7f46f0af5330] 2309019- 3: (Client::_release_fh(Fh*)+0x1b) [0x55bf6398ff4b] 2309072- 4: (Client::unmount()+0x4c7) [0x55bf6399ebc7] 2309119- 5: (main()+0x5d5) [0x55bf6392ba85] 2309155- 6: (__libc_start_main()+0xf5) [0x7f46ef383f45] 2309203- 7: (()+0x198f17) [0x55bf6392ff17] 2309238- NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Updated by Greg Farnum over 7 years ago
- Project changed from Ceph to CephFS
- Category set to 43
- Priority changed from Urgent to High
I'm not seeing this at all on master (just from browsing http://pulpito.ceph.com/?suite=samba#)
Jewel has a core dump at least as far back as http://pulpito.ceph.com/teuthology-2016-08-21_02:35:02-samba-jewel---basic-mira/, but not in http://pulpito.ceph.com/teuthology-2016-08-24_02:35:02-samba-jewel---basic-mira/
Updated by Greg Farnum over 7 years ago
Okay, there's also one at http://pulpito.ceph.com/teuthology-2016-08-14_02:35:02-samba-jewel---basic-mira/
That seems to be the first. Going through the commits, there's nothing clearly responsible for this, especially since the failure is intermittent. I pushed jewel-samba-1 as a first guess for bisecting and will run it through several times once built, though. :/
Updated by Greg Farnum over 7 years ago
- Status changed from New to 12
Well, I ran the test in question (samba-basic, btrfs, install, fuse, smbtorture) 9 times and got 4 failures. There aren't any distinguishing differences I can see (both passes and failures are on ubuntu, etc.). This is going to be tedious, but I'm scheduling more runs and building another branch to try to do a useful bisect. :(
Updated by Greg Farnum over 7 years ago
Ran 6 times against commit:e400999a2cb0972919e35dd8510f8d85f48ceace (jewel-samba-1) and got zero failures. That's one commit prior to the last client-code backport that coincided with the failures starting to show up. I've no idea why commit:f0146196ccfbcfd923191f63d93e4e81219523b1, merging in "client: fstat should take CEPH_STAT_CAP_INODE_ALL", would break anything, but based on timing it's my first guess, and the gitbuilders are (re)building it now for more tests. If the failures show up there, we'll have a culprit; otherwise more difficult bisecting will commence.
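As a rough sanity check on how much weight six clean runs carry, the sketch below computes the chance of seeing zero failures purely by luck. It assumes the runs are independent and that a bad build fails at the 4-in-9 rate observed earlier; both are assumptions, not something established in this ticket, and `prob_all_pass` is a hypothetical helper name.

```cpp
#include <cassert>
#include <cmath>

// Probability that `runs` independent runs all pass, given a per-run
// failure probability `p_fail`. Independence is an assumption here.
double prob_all_pass(double p_fail, int runs) {
    assert(p_fail >= 0.0 && p_fail <= 1.0);
    assert(runs >= 0);
    return std::pow(1.0 - p_fail, runs);
}
```

With `p_fail = 4.0 / 9.0` and `runs = 6` this comes to roughly 3%, so six clean runs are suggestive but not conclusive for an intermittent failure.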
Updated by Zheng Yan over 7 years ago
while (!ll_unclosed_fh_set.empty()) {
  set<Fh*>::iterator it = ll_unclosed_fh_set.begin();
  ll_unclosed_fh_set.erase(*it);
  ldout(cct, 0) << " destroyed lost open file " << *it << " on " << *((*it)->inode) << dendl;
  _release_fh(*it);
  ^^^ the iterator is invalid because it was deleted above
}
Updated by Greg Farnum over 7 years ago
- Assignee set to Zheng Yan
Can you fix that up and make sure it was just a backport error, Zheng? We aren't seeing this in master at all so I presume it's working there...
Updated by chuan jiang over 7 years ago
Maybe this is a solved problem
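while (!ll_unclosed_fh_set.empty()) {
  // Take a fresh iterator on every pass; the previous one is invalid after erase().
  set<Fh*>::iterator it = ll_unclosed_fh_set.begin();
  // Copy the pointer out before erasing so nothing dereferences the iterator afterwards.
  Fh *fh = *it;
  ll_unclosed_fh_set.erase(fh);
  ldout(cct, 0) << " destroyed lost open file " << fh << " on " << *(fh->inode) << dendl;
  _release_fh(fh);
}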
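The pattern discussed above (copy the element out of the set before erasing it, then work only with the copy) can be tried in isolation with a minimal sketch. The `Fh` struct, `release_fh`, and `drain_unclosed` below are hypothetical stand-ins for illustration, not the Ceph client code.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical stand-in for Ceph's Fh; only an id is modeled.
struct Fh {
    int id;
};

// Stand-in for Client::_release_fh: records the id, then frees the handle.
void release_fh(Fh* fh, std::vector<int>& released) {
    released.push_back(fh->id);
    delete fh;
}

// Drain the set, copying each pointer out before erasing its element.
std::vector<int> drain_unclosed(std::set<Fh*>& unclosed) {
    std::vector<int> released;
    while (!unclosed.empty()) {
        std::set<Fh*>::iterator it = unclosed.begin();
        Fh* fh = *it;        // copy while the iterator is still valid
        assert(fh != nullptr);
        unclosed.erase(it);  // 'it' must not be dereferenced after this
        release_fh(fh, released);
    }
    return released;
}
```

Dereferencing `it` after `erase` is undefined behavior for `std::set`, which is consistent with the crash only showing up intermittently.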
Updated by John Spray over 7 years ago
The area of code mentioned above was fixed in master for http://tracker.ceph.com/issues/16764
That wasn't marked for backport because the issue had only come up in a somewhat obscure case, but on reflection we should backport it.
Updated by Patrick Donnelly about 5 years ago
- Status changed from 12 to Rejected
Updated by Patrick Donnelly about 5 years ago
- Category deleted (43)
- Labels (FS) Samba/CIFS added