Bug #39363

closed

deadlock from moncommand completion

Added by Sage Weil about 5 years ago. Updated almost 5 years ago.

Status: Duplicate
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

thread 28
activepymodules::lock ActivePyModules::get_osdmap
clusterstate::lock
objecter::rwlock (blocked)

thread 13
objecter::rwlock
activepymodules::lock (blocked) ... deadlock!

Thread 13 (Thread 0x7fe104d89700 (LWP 1773921)):
#0  __lll_lock_wait () at ../sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1  0x00007fe10f093dbd in __GI___pthread_mutex_lock (mutex=0x6c4eee0) at ../nptl/pthread_mutex_lock.c:80
#2  0x00007fe10fe01d49 in Mutex::lock(bool) () from target:/usr/lib/ceph/libceph-common.so.0
#3  0x0000000000508887 in std::lock_guard<Mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/7/bits/std_mutex.h:162
#4  ActivePyModules::notify_all (this=this@entry=0x6c4edc0, notify_type=..., notify_id=...) at /build/ceph-14.2.0-302-gf81d225/src/mgr/ActivePyModules.cc:443
#5  0x000000000052d699 in MonCommandCompletion::finish (this=0x127e3290, r=<optimized out>) at /build/ceph-14.2.0-302-gf81d225/src/mgr/BaseMgrModule.cc:110
#6  0x000000000050e9e9 in Context::complete (this=0x127e3290, r=<optimized out>) at /build/ceph-14.2.0-302-gf81d225/src/include/Context.h:77
#7  0x00000000005d5974 in Objecter::_finish_command (this=this@entry=0x7ffe19a052c8, c=c@entry=0x14774a80, r=0, rs=...) at /build/ceph-14.2.0-302-gf81d225/src/osdc/Objecter.cc:4928
#8  0x00000000005d63e0 in Objecter::handle_command_reply (this=this@entry=0x7ffe19a052c8, m=m@entry=0x1028f2a0) at /build/ceph-14.2.0-302-gf81d225/src/osdc/Objecter.cc:4765
#9  0x00000000005e43bb in Objecter::ms_dispatch (this=0x7ffe19a052c8, m=0x1028f2a0) at /build/ceph-14.2.0-302-gf81d225/src/osdc/Objecter.cc:980
#10 0x0000000000576756 in Dispatcher::ms_dispatch2 (this=0x7ffe19a052d0, m=...) at /build/ceph-14.2.0-302-gf81d225/src/msg/Dispatcher.h:126
#11 0x00007fe10ffa47f9 in DispatchQueue::entry() () from target:/usr/lib/ceph/libceph-common.so.0
#12 0x00007fe1100545ad in DispatchQueue::DispatchThread::entry() () from target:/usr/lib/ceph/libceph-common.so.0
#13 0x00007fe10f0916ba in start_thread (arg=0x7fe104d89700) at pthread_create.c:333
#14 0x00007fe10e8ba41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 28 (Thread 0x7fe0f438f700 (LWP 1773989)):
#0  0x00007fe10f0960cc in futex_wait (private=<optimized out>, expected=3, futex_word=0x7ffe19a053b8) at ../sysdeps/unix/sysv/linux/futex-internal.h:61
#1  futex_wait_simple (private=<optimized out>, expected=3, futex_word=0x7ffe19a053b8) at ../sysdeps/nptl/futex-internal.h:135
#2  __pthread_rwlock_rdlock_slow (rwlock=rwlock@entry=0x7ffe19a053b0) at pthread_rwlock_rdlock.c:68
#3  0x00007fe10f096331 in __GI___pthread_rwlock_rdlock (rwlock=rwlock@entry=0x7ffe19a053b0) at pthread_rwlock_rdlock.c:177
#4  0x0000000000511d78 in std::__shared_mutex_pthread::lock_shared (this=<optimized out>) at /usr/include/c++/7/shared_mutex:139
#5  std::shared_mutex::lock_shared (this=<optimized out>) at /usr/include/c++/7/shared_mutex:335
#6  boost::shared_lock<std::shared_mutex>::lock (this=this@entry=0x7fe0f438bd40) at /build/ceph-14.2.0-302-gf81d225/obj-x86_64-linux-gnu/boost/include/boost/thread/lock_types.hpp:645
#7  0x000000000050d9b4 in boost::shared_lock<std::shared_mutex>::shared_lock (m_=..., this=0x7fe0f438bd40) at /build/ceph-14.2.0-302-gf81d225/obj-x86_64-linux-gnu/boost/include/boost/thread/lock_types.hpp:520
#8  Objecter::with_osdmap<ActivePyModules::get_osdmap()::<lambda(const OSDMap&)> > (cb=<optimized out>, this=0x7ffe19a052c8) at /build/ceph-14.2.0-302-gf81d225/src/osdc/Objecter.h:2055
#9  ClusterState::with_osdmap<ActivePyModules::get_osdmap()::<lambda(const OSDMap&)> > (this=<optimized out>) at /build/ceph-14.2.0-302-gf81d225/src/mgr/ClusterState.h:127
#10 ActivePyModules::get_osdmap (this=0x6c4edc0) at /build/ceph-14.2.0-302-gf81d225/src/mgr/ActivePyModules.cc:873
#11 0x00007fe10f5ca87c in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#12 0x00007fe10f5c9084 in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#13 0x00007fe10f5c9084 in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#14 0x00007fe10f70011c in PyEval_EvalCodeEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#15 0x00007fe10f6563b0 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#16 0x00007fe10f6292b3 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#17 0x00007fe10f69d46c in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#18 0x00007fe10f6292b3 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#19 0x00007fe10f62a484 in PyObject_CallMethod () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#20 0x00000000005af48b in PyModuleRunner::serve (this=0x6dc2a80) at /build/ceph-14.2.0-302-gf81d225/src/mgr/PyModuleRunner.cc:47
#21 0x00000000005afae5 in PyModuleRunner::PyModuleRunnerThread::entry (this=0x6dc2ac8) at /build/ceph-14.2.0-302-gf81d225/src/mgr/PyModuleRunner.cc:106
#22 0x00007fe10f0916ba in start_thread (arg=0x7fe0f438f700) at pthread_create.c:333
#23 0x00007fe10e8ba41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

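Deadlock between objecter::rwlock and activepymodules::lock: thread 13 holds objecter::rwlock and waits for activepymodules::lock, while thread 28 holds activepymodules::lock and waits for objecter::rwlock.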
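
The lock-order inversion described above can be illustrated with a minimal C++ sketch. This is not the Ceph code and not necessarily the fix that was merged; Locks, read_osdmap_epoch, handle_reply and run_demo are hypothetical stand-ins for the two locks and the two threads in the traces. The sketch assumes one common way to break such a cycle: the dispatch path collects the completion callback while holding the objecter-style lock and runs it only after that lock is released, so no thread waits for the modules lock while holding the objecter lock.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <shared_mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-ins for the two locks in the traces (not Ceph's types).
struct Locks {
  std::shared_mutex objecter_rwlock;  // plays the role of objecter::rwlock
  std::mutex modules_lock;            // plays the role of activepymodules::lock
};

// Same order as thread 28: modules lock first, then the objecter lock (shared).
int read_osdmap_epoch(Locks& l, const int& epoch) {
  std::lock_guard<std::mutex> m(l.modules_lock);
  std::shared_lock<std::shared_mutex> r(l.objecter_rwlock);
  return epoch;
}

// Thread 13's path, reordered: update state under the objecter lock, but
// defer the completion callback until the lock has been released.
void handle_reply(Locks& l, int& epoch, std::function<void()> on_finish) {
  std::vector<std::function<void()>> deferred;
  {
    std::unique_lock<std::shared_mutex> w(l.objecter_rwlock);
    ++epoch;
    deferred.push_back(std::move(on_finish));
  }  // objecter lock released here
  for (auto& cb : deferred) {
    cb();  // the callback may now take modules_lock without risking a cycle
  }
}

// Runs a reader and a dispatcher concurrently; returns the number of
// completions that were delivered.
int run_demo(int iterations) {
  assert(iterations >= 0);
  Locks l;
  int epoch = 0;
  int notified = 0;
  std::thread reader([&] {
    for (int i = 0; i < iterations; ++i) {
      (void)read_osdmap_epoch(l, epoch);
    }
  });
  std::thread dispatcher([&] {
    for (int i = 0; i < iterations; ++i) {
      handle_reply(l, epoch, [&] {
        // Like notify_all in the trace: needs the modules lock.
        std::lock_guard<std::mutex> m(l.modules_lock);
        ++notified;
      });
    }
  });
  reader.join();
  dispatcher.join();
  return notified;
}
```

If the callback were instead invoked inside the scope that holds objecter_rwlock, the dispatcher would acquire the locks in the order objecter then modules, the reverse of the reader, which is the cycle shown in the two backtraces.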


Related issues 1 (0 open, 1 closed)

Is duplicate of mgr - Bug #39335: deadlock on command completion (Resolved, 04/16/2019)

#1

Updated by Sage Weil almost 5 years ago

  • Status changed from In Progress to Pending Backport
  • Backport set to nautilus
#3

Updated by Lenz Grimmer almost 5 years ago

  • Target version set to v15.0.0
  • Pull request ID set to 27619
#4

Updated by Sebastian Wagner almost 5 years ago

  • Is duplicate of Bug #39335: deadlock on command completion added
#5

Updated by Sebastian Wagner almost 5 years ago

  • Status changed from Pending Backport to Duplicate
#6

Updated by Nathan Cutler almost 5 years ago

  • Backport deleted (nautilus)