Bug #20988

closed

client: dual client segfault with racing ceph_shutdown

Added by Jeff Layton over 6 years ago. Updated about 6 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version:
% Done: 0%
Source: Development
Tags:
Backport: jewel,luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): Client
Labels (FS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I'm working on a testcase that has two threads, each with its own ceph_mount_info. If those threads make racing ceph_shutdown calls, I see crashes:

Core was generated by `./bin/ceph_test_libcephfs --gtest_filter=LibCephFS.Delegation'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  lockdep_will_unlock (name=0x7fffc008a950 "PerfCountersCollection", id=<optimized out>) at /home/jlayton/git/ceph/src/common/lockdep.cc:369
369      lockdep_dout(20) << "_will_unlock " << name << dendl;
[Current thread is 1 (Thread 0x7fffb2ffd700 (LWP 8319))]
Missing separate debuginfos, use: dnf debuginfo-install glibc-2.25-7.fc26.x86_64 libblkid-2.30.1-1.fc26.x86_64 libgcc-7.1.1-3.fc26.x86_64 libstdc++-7.1.1-3.fc26.x86_64 libuuid-2.30.1-1.fc26.x86_64 lttng-ust-2.9.0-2.fc26.x86_64 nspr-4.15.0-1.fc26.x86_64 nss-3.31.0-1.1.fc26.x86_64 nss-softokn-3.31.0-1.0.fc26.x86_64 nss-softokn-freebl-3.31.0-1.0.fc26.x86_64 nss-util-3.31.0-1.0.fc26.x86_64 sqlite-libs-3.19.3-1.fc26.x86_64 userspace-rcu-0.9.3-2.fc26.x86_64 zlib-1.2.11-2.fc26.x86_64
(gdb) bt
#0  lockdep_will_unlock (name=0x7fffc008a950 "PerfCountersCollection", id=<optimized out>) at /home/jlayton/git/ceph/src/common/lockdep.cc:369
#1  0x00007fffedb85820 in Mutex::_will_unlock (this=0x7fffc008a858) at /home/jlayton/git/ceph/src/common/Mutex.h:62
#2  Mutex::Unlock (this=this@entry=0x7fffc008a858) at /home/jlayton/git/ceph/src/common/Mutex.cc:120
#3  0x00007fffedb812b0 in Mutex::Locker::~Locker (this=<synthetic pointer>, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/common/Mutex.h:118
#4  PerfCountersCollection::remove (this=0x7fffc008a850, l=<optimized out>) at /home/jlayton/git/ceph/src/common/perf_counters.cc:62
#5  0x00007ffff7b87e6e in ObjectCacher::perf_stop (this=0x7fffc0125fd0) at /home/jlayton/git/ceph/src/osdc/ObjectCacher.cc:688
#6  0x00007ffff7b9edc1 in ObjectCacher::~ObjectCacher (this=0x7fffc0125fd0, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/osdc/ObjectCacher.cc:638
#7  0x00007ffff7b218ea in std::default_delete<ObjectCacher>::operator() (this=<optimized out>, __ptr=0x7fffc0125fd0) at /usr/include/c++/7/bits/unique_ptr.h:78
#8  std::unique_ptr<ObjectCacher, std::default_delete<ObjectCacher> >::~unique_ptr (this=0x7fffc01227a8, __in_chrg=<optimized out>)
    at /usr/include/c++/7/bits/unique_ptr.h:268
#9  Client::~Client (this=0x7fffc0121d40, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/client/Client.cc:301
#10 0x00007ffff7b21c99 in StandaloneClient::~StandaloneClient (this=0x7fffc0121d40, __in_chrg=<optimized out>) at /home/jlayton/git/ceph/src/client/Client.cc:13420
#11 0x00007ffff7ad2323 in ceph_mount_info::shutdown (this=0x7fffc008e590) at /home/jlayton/git/ceph/src/libcephfs.cc:164
#12 ceph_shutdown (cmount=0x7fffc008e590) at /home/jlayton/git/ceph/src/libcephfs.cc:357
#13 0x0000555555607f70 in breaker_func (filename=<optimized out>) at /home/jlayton/git/ceph/src/test/libcephfs/deleg.cc:53
#14 0x00007ffff6fb002f in ?? () from /lib64/libstdc++.so.6
#15 0x00007ffff78a836d in start_thread () from /lib64/libpthread.so.0
#16 0x00007ffff6706b8f in clone () from /lib64/libc.so.6

If I instead ensure that the calls are serialized, they don't crash.
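
For illustration, here is a minimal sketch of what "serialized" could look like in a test: each thread tears down its own handle, but only while holding one process-wide mutex. It deliberately does not link against libcephfs. FakeMount, fake_shutdown, g_live_mounts, g_shutdown_lock, serialized_shutdown and run_serialized_shutdowns are all hypothetical names standing in for ceph_mount_info, ceph_shutdown and whatever shared state the real teardown touches. This is an assumption about the shape of the workaround, not the actual testcase or the eventual fix.

```cpp
#include <cassert>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical stand-in for a per-thread ceph_mount_info handle.
struct FakeMount {
  bool shut_down = false;
};

// Hypothetical stand-in for process-wide state touched during teardown.
// It has no locking of its own, so callers must not race on it.
static int g_live_mounts = 0;

// Hypothetical stand-in for ceph_shutdown(): marks the handle as torn
// down and updates the shared, unsynchronized state.
void fake_shutdown(FakeMount& m) {
  assert(!m.shut_down);  // each handle is shut down exactly once
  m.shut_down = true;
  --g_live_mounts;
}

// One mutex shared by every thread that tears down a mount.
static std::mutex g_shutdown_lock;

// Serialized wrapper: only one thread at a time runs the teardown.
void serialized_shutdown(FakeMount& m) {
  std::lock_guard<std::mutex> guard(g_shutdown_lock);
  fake_shutdown(m);
}

// Starts nthreads threads, each owning one mount, and has each thread
// shut its mount down under the lock. Returns the number of mounts
// still counted as live after all threads have joined.
int run_serialized_shutdowns(int nthreads) {
  std::vector<FakeMount> mounts(nthreads);
  g_live_mounts = nthreads;
  std::vector<std::thread> threads;
  for (int i = 0; i < nthreads; ++i) {
    threads.emplace_back([&mounts, i] { serialized_shutdown(mounts[i]); });
  }
  for (auto& t : threads) {
    t.join();
  }
  return g_live_mounts;
}
```

Calling run_serialized_shutdowns(2) mirrors the two-thread case described above. In a real test the lock would wrap the actual ceph_shutdown call, which hides the crash in the test but leaves the underlying race in the client in place.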


Files

0001-client-test-shutdown-race.patch (1.27 KB): Updated testcase. Jeff Layton, 08/21/2017 06:14 PM

Related issues 2 (0 open, 2 closed)

Copied to CephFS - Backport #21525: luminous: client: dual client segfault with racing ceph_shutdown (Resolved, Nathan Cutler)
Copied to CephFS - Backport #21526: jewel: client: dual client segfault with racing ceph_shutdown (Closed, Nathan Cutler)