Bug #37975

assert failure in OSDService::shutdown()

Added by Kefu Chai 7 months ago. Updated 7 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: Correctness/Safety
Target version: -
Start date: 01/20/2019
Due date:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:

Description

#0  raise (sig=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#1  0x00005641f5cf7a89 in reraise_fatal (signum=6) at /var/ssd/ceph/src/global/signal_handler.cc:81
#2  0x00005641f5cf89ad in handle_fatal_signal (signum=6) at /var/ssd/ceph/src/global/signal_handler.cc:298
#3  <signal handler called>
#4  __GI_raise (sig=sig@entry=6) at ../sysdeps/unix/sysv/linux/raise.c:50
#5  0x00007fd742c12535 in __GI_abort () at abort.c:79
#6  0x00007fd742fda943 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007fd742fe0896 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#8  0x00007fd742fdf989 in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#9  0x00007fd742fe02d5 in __gxx_personality_v0 () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#10 0x00007fd742dc1d73 in ?? () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#11 0x00007fd742dc22d1 in _Unwind_RaiseException () from /lib/x86_64-linux-gnu/libgcc_s.so.1
#12 0x00007fd742fe0af7 in __cxa_throw () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#13 0x00005641f52eb83b in ceph::mutex_debug_detail::mutex_debug_impl<true>::try_lock_impl (this=0x564201d9f7b8) at /var/ssd/ceph/src/common/mutex_debug.h:149
#14 0x00005641f52e89db in ceph::mutex_debug_detail::mutex_debug_impl<true>::try_lock (this=0x564201d9f7b8, no_lockdep=false) at /var/ssd/ceph/src/common/mutex_debug.h:174
#15 0x00005641f52e5e59 in ceph::mutex_debug_detail::mutex_debug_impl<true>::lock (this=0x564201d9f7b8, no_lockdep=false) at /var/ssd/ceph/src/common/mutex_debug.h:187
#16 0x00005641f52e361d in std::lock_guard<ceph::mutex_debug_detail::mutex_debug_impl<true> >::lock_guard (this=0x7fd73d4971f0, __m=...) at /usr/include/c++/8/bits/std_mutex.h:162
#17 0x00005641f5acb907 in BlueStore::OnodeSpace::clear (this=0x564201a51a68) at /var/ssd/ceph/src/os/bluestore/BlueStore.cc:1528
#18 0x00005641f5b62cb4 in BlueStore::OnodeSpace::~OnodeSpace (this=0x564201a51a68, __in_chrg=<optimized out>) at /var/ssd/ceph/src/os/bluestore/BlueStore.h:1342
#19 0x00005641f5bd1290 in BlueStore::Collection::~Collection (this=0x564201a518c0, __in_chrg=<optimized out>) at /var/ssd/ceph/src/os/bluestore/BlueStore.h:1366
#20 0x00005641f5bd12ec in BlueStore::Collection::~Collection (this=0x564201a518c0, __in_chrg=<optimized out>) at /var/ssd/ceph/src/os/bluestore/BlueStore.h:1366
#21 0x00005641f537ae79 in RefCountedObject::put (this=0x564201a518c0) at /var/ssd/ceph/src/common/RefCountedObj.h:64
#22 0x00005641f52fa4b7 in intrusive_ptr_release (p=0x564201a518c0) at /var/ssd/ceph/src/common/RefCountedObj.h:174
#23 0x00005641f53a67d1 in boost::intrusive_ptr<ObjectStore::CollectionImpl>::~intrusive_ptr (this=0x564202a0cc50, __in_chrg=<optimized out>)
    at /opt/ceph/include/boost/smart_ptr/intrusive_ptr.hpp:98
#24 0x00005641f54850b8 in OSDriver::~OSDriver (this=0x564202a0cc40, __in_chrg=<optimized out>) at /var/ssd/ceph/src/osd/SnapMapper.h:30
#25 0x00005641f54a9568 in PG::~PG (this=0x564202a0c800, __in_chrg=<optimized out>) at /var/ssd/ceph/src/osd/PG.cc:368
#26 0x00005641f56a3e2d in PrimaryLogPG::~PrimaryLogPG (this=0x564202a0c800, __in_chrg=<optimized out>) at /var/ssd/ceph/src/osd/PrimaryLogPG.h:1448
#27 0x00005641f56a3e54 in PrimaryLogPG::~PrimaryLogPG (this=0x564202a0c800, __in_chrg=<optimized out>) at /var/ssd/ceph/src/osd/PrimaryLogPG.h:1448
#28 0x00005641f54a75e3 in PG::put (this=0x564202a0c800, tag=0x5641f674fd5e "intptr") at /var/ssd/ceph/src/osd/PG.cc:176
#29 0x00005641f538b469 in intrusive_ptr_release (pg=0x564202a0c800) at /var/ssd/ceph/src/osd/PG.h:555
#30 0x00005641f53a9189 in boost::intrusive_ptr<PG>::~intrusive_ptr (this=0x56420557f958, __in_chrg=<optimized out>) at /opt/ceph/include/boost/smart_ptr/intrusive_ptr.hpp:98
#31 0x00005641f55a80be in PG::QueuePeeringEvt<PG::RequestBackfill>::~QueuePeeringEvt (this=0x56420557f950, __in_chrg=<optimized out>) at /var/ssd/ceph/src/osd/PG.h:1914
#32 0x00005641f55a80e6 in PG::QueuePeeringEvt<PG::RequestBackfill>::~QueuePeeringEvt (this=0x56420557f950, __in_chrg=<optimized out>) at /var/ssd/ceph/src/osd/PG.h:1914
#33 0x00005641f5d5842f in SafeTimer::cancel_all_events (this=0x564202a081b8) at /var/ssd/ceph/src/common/Timer.cc:177
#34 0x00005641f5d56e97 in SafeTimer::shutdown (this=0x564202a081b8) at /var/ssd/ceph/src/common/Timer.cc:65
#35 0x00005641f52feee3 in OSDService::shutdown (this=0x564202a07928) at /var/ssd/ceph/src/osd/OSD.cc:470
#36 0x00005641f5321860 in OSD::shutdown (this=0x564202a06000) at /var/ssd/ceph/src/osd/OSD.cc:3875
...
(gdb) f 17
#17 0x00005641f5acb907 in BlueStore::OnodeSpace::clear (this=0x564201a51a68) at /var/ssd/ceph/src/os/bluestore/BlueStore.cc:1528
1528      std::lock_guard l(cache->lock);

In OSD::shutdown(), the store is deleted before OSDService::shutdown() is called. The latter indirectly calls BlueStore::OnodeSpace::clear(), which tries to acquire cache->lock. By then, however, the cache has already been destroyed.
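
The ordering problem can be illustrated with a small sketch. It is not the Ceph code and not necessarily what the eventual fix does: Cache, Store, Space, Timer and shutdown_in_safe_order() are hypothetical stand-ins for the cache, the object store, OnodeSpace and SafeTimer. It only shows the safe order, in which everything that needs the cache lock is released before the owner of the cache is destroyed.

```cpp
#include <cassert>
#include <memory>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-ins for illustration; these are not the Ceph classes.
struct Cache {
  std::mutex lock;
};

// Owns the cache, as the object store does in the report.
struct Store {
  std::unique_ptr<Cache> cache = std::make_unique<Cache>();
};

// Must take the cache lock while cleaning up, like OnodeSpace::clear().
struct Space {
  Cache* cache;
  std::vector<std::string>* log;
  Space(Cache* c, std::vector<std::string>* l) : cache(c), log(l) {}
  ~Space() {
    std::lock_guard<std::mutex> guard(cache->lock);  // requires a live cache
    log->push_back("space cleared");
  }
};

// Pending events hold the last references to Space objects.
struct Timer {
  std::vector<std::shared_ptr<Space>> pending;
  void shutdown() { pending.clear(); }
};

// Shut down in an order that keeps the cache alive until nothing needs it.
std::vector<std::string> shutdown_in_safe_order() {
  std::vector<std::string> log;
  auto store = std::make_unique<Store>();
  assert(store->cache != nullptr);

  Timer timer;
  timer.pending.push_back(std::make_shared<Space>(store->cache.get(), &log));

  timer.shutdown();  // Space destructors run here, cache lock still valid
  log.push_back("timer shut down");

  store.reset();     // only now is the cache destroyed
  log.push_back("store destroyed");
  return log;
}
```

Swapping the two steps (destroying the store first, then shutting down the timer) would reproduce the pattern in the backtrace: a destructor locking a mutex that no longer exists.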

History

#1 Updated by Kefu Chai 7 months ago

  • Subject changed from segfault in OSDService::shutdown() to assert failure in OSDService::shutdown()

The return value was 22 (EINVAL), because the mutex being acquired had already been destroyed.
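
Why this shows up as an abort rather than a segfault can be sketched as follows. This is an assumption based on the backtrace (frames #12 and #13 show try_lock_impl throwing), not a copy of mutex_debug.h: interpret_trylock_result() and error_code_for() are hypothetical helpers that treat any return code other than 0 or EBUSY as an error and throw std::system_error.

```cpp
#include <cerrno>
#include <system_error>

// Hypothetical helper: interprets the return code of a pthread-style trylock.
bool interpret_trylock_result(int r) {
  switch (r) {
  case 0:
    return true;    // lock acquired
  case EBUSY:
    return false;   // lock is held by someone else
  default:
    // Anything else (for example EINVAL) is reported as an exception.
    throw std::system_error(r, std::generic_category());
  }
}

// Returns the error code carried by the exception, or 0 if none was thrown.
int error_code_for(int r) {
  try {
    interpret_trylock_result(r);
    return 0;
  } catch (const std::system_error& e) {
    return e.code().value();
  }
}
```

Destructors are implicitly noexcept in C++11 and later, so an exception like this escaping from a destructor chain ends in std::terminate(), which is consistent with the abort() in frame #5.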

#2 Updated by Kefu Chai 7 months ago

  • Status changed from New to Need Review
  • Pull request ID set to 26043

#3 Updated by Neha Ojha 7 months ago

  • Status changed from Need Review to Testing

#4 Updated by Neha Ojha 7 months ago

  • Status changed from Testing to Resolved
