Bug #20988
closed
client: dual client segfault with racing ceph_shutdown
% Done:
0%
Source:
Development
Tags:
Backport:
jewel,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS):
Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I have a testcase that I'm working on that has two threads, each with their own ceph_mount_info. If those threads end up doing racing ceph_shutdown calls, I see crashes:
Core was generated by `./bin/ceph_test_libcephfs --gtest_filter=LibCephFS.Delegation'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  lockdep_will_unlock (name=0x7fffc008a950 "PerfCountersCollection", id=<optimized out>) at /home/jlayton/git/ceph/src/common/lockdep.cc:369
369         lockdep_dout(20) << "_will_unlock " << name << dendl;
[Current thread is 1 (Thread 0x7fffb2ffd700 (LWP 8319))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.25-7.fc26.x86_64 libblkid-2.30.1-1.fc26.x86_64 libgcc-7.1.1-3.fc26.x86_64 libstdc++-7.1.1-3.fc26.x86_64 libuuid-2.30.1-1.fc26.x86_64 lttng-ust-2.9.0-2.fc26.x86_64 nspr-4.15.0-1.fc26.x86_64 nss-3.31.0-1.1.fc26.x86_64 nss-softokn-3.31.0-1.0.fc26.x86_64 nss-softokn-freebl-3.31.0-1.0.fc26.x86_64 nss-util-3.31.0-1.0.fc26.x86_64 sqlite-libs-3.19.3-1.fc26.x86_64 userspace-rcu-0.9.3-2.fc26.x86_64 zlib-1.2.11-2.fc26.x86_64
(gdb) bt
#0  lockdep_will_unlock (name=0x7fffc008a950 "PerfCountersCollection", id=<optimized out>) at /home/jlayton/git/ceph/src/common/lockdep.cc:369
#1  0x00007fffedb85820 in Mutex::_will_unlock (this=0x7fffc008a858) at /home/jlayton/git/ceph/src/common/Mutex.h:62
#2  Mutex::Unlock (this=this@entry=0x7fffc008a858) at /home/jlayton/git/ceph/src/common/Mutex.cc:120
#3  0x00007fffedb812b0 in Mutex::Locker::~Locker (this=<synthetic pointer>, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/common/Mutex.h:118
#4  PerfCountersCollection::remove (this=0x7fffc008a850, l=<optimized out>) at /home/jlayton/git/ceph/src/common/perf_counters.cc:62
#5  0x00007ffff7b87e6e in ObjectCacher::perf_stop (this=0x7fffc0125fd0) at /home/jlayton/git/ceph/src/osdc/ObjectCacher.cc:688
#6  0x00007ffff7b9edc1 in ObjectCacher::~ObjectCacher (this=0x7fffc0125fd0, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/osdc/ObjectCacher.cc:638
#7  0x00007ffff7b218ea in std::default_delete<ObjectCacher>::operator() (this=<optimized out>, __ptr=0x7fffc0125fd0) at /usr/include/c++/7/bits/unique_ptr.h:78
#8  std::unique_ptr<ObjectCacher, std::default_delete<ObjectCacher> >::~unique_ptr (this=0x7fffc01227a8, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#9  Client::~Client (this=0x7fffc0121d40, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/client/Client.cc:301
#10 0x00007ffff7b21c99 in StandaloneClient::~StandaloneClient (this=0x7fffc0121d40, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/client/Client.cc:13420
#11 0x00007ffff7ad2323 in ceph_mount_info::shutdown (this=0x7fffc008e590) at /home/jlayton/git/ceph/src/libcephfs.cc:164
#12 ceph_shutdown (cmount=0x7fffc008e590) at /home/jlayton/git/ceph/src/libcephfs.cc:357
#13 0x0000555555607f70 in breaker_func (filename=<optimized out>) at /home/jlayton/git/ceph/src/test/libcephfs/deleg.cc:53
#14 0x00007ffff6fb002f in ?? () from /lib64/libstdc++.so.6
#15 0x00007ffff78a836d in start_thread () from /lib64/libpthread.so.0
#16 0x00007ffff6706b8f in clone () from /lib64/libc.so.6
If I instead ensure that the calls are serialized, they don't crash.
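For reference, the serialization workaround can be sketched roughly as follows. This is not the code from the testcase: fake_mount and fake_shutdown are hypothetical stand-ins for ceph_mount_info and ceph_shutdown so that the sketch builds without libcephfs, and the in_flight/overlaps counters exist only to show that no two teardowns run at once. In the real test, each thread would presumably call ceph_shutdown(cmount) while holding the same mutex.

```cpp
#include <atomic>
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for ceph_mount_info; not the real libcephfs type.
struct fake_mount {
  bool shut_down = false;
};

// Bookkeeping for the sketch only: how many teardowns are running right
// now, and how many times one started while another was still running.
static std::atomic<int> in_flight{0};
static std::atomic<int> overlaps{0};

// Hypothetical stand-in for ceph_shutdown(). It does no real teardown;
// it only records whether it overlapped with another call.
void fake_shutdown(fake_mount *m) {
  if (in_flight.fetch_add(1) != 0)
    ++overlaps;
  std::this_thread::yield();  // widen the window in which a race could show
  m->shut_down = true;
  in_flight.fetch_sub(1);
}

// One process-wide lock shared by every thread that tears down a mount.
static std::mutex shutdown_lock;

// Serialized wrapper: only one teardown may be in progress at a time.
void serialized_shutdown(fake_mount *m) {
  std::lock_guard<std::mutex> guard(shutdown_lock);
  fake_shutdown(m);
}

// Starts one thread per mount, shuts each mount down through the
// serialized wrapper, and returns the number of overlapping teardowns.
int run_serialized(int nthreads) {
  overlaps = 0;
  std::vector<fake_mount> mounts(nthreads);
  std::vector<std::thread> threads;
  for (int i = 0; i < nthreads; ++i)
    threads.emplace_back(serialized_shutdown, &mounts[i]);
  for (auto &t : threads)
    t.join();
  for (const auto &m : mounts)
    assert(m.shut_down);
  return overlaps.load();
}
```

With the lock held around each call, run_serialized should always report zero overlapping teardowns. This only illustrates the workaround; it does not address why concurrent ceph_shutdown calls on separate mounts crash in the first place.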