Bug #40156
deadlock on MonCommandCompletion (closed)
Status: Duplicate
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Description
Thread 16 (Thread 0x7f70a0049700 (LWP 982531)):
#0 0x00007f70abb5c827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x274f900) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 do_futex_wait (sem=sem@entry=0x274f900, abstime=0x0) at sem_waitcommon.c:111
#2 0x00007f70abb5c8d4 in __new_sem_wait_slow (sem=0x274f900, abstime=0x0) at sem_waitcommon.c:181
#3 0x00007f70abb5c97a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4 0x00007f70ac0affe8 in PyThread_acquire_lock () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#5 0x00007f70ac085586 in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#6 0x00007f70ac1c305c in PyEval_EvalCodeEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#7 0x00007f70ac119370 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#8 0x00007f70ac0ec273 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#9 0x00007f70ac1603ac in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#10 0x00007f70ac0ec273 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#11 0x00007f70ac0ec6df in PyObject_CallFunctionObjArgs () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#12 0x00007f70ac088f5d in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#13 0x00007f70ac08c044 in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#14 0x00007f70ac1c305c in PyEval_EvalCodeEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#15 0x00007f70ac119370 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#16 0x00007f70ac0ec273 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#17 0x00007f70ac1603ac in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#18 0x00007f70ac0ec273 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#19 0x00007f70ac1c2487 in PyEval_CallObjectWithKeywords () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#20 0x0000000000530559 in MonCommandCompletion::finish (this=0x82aaa50, r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/BaseMgrModule.cc:100
#21 0x000000000051878c in Context::complete (r=<optimized out>, this=0x82aaa50) at /build/ceph-14.2.1-198-g869a6a3/src/include/Context.h:77
#22 <lambda(int)>::<lambda(int)>::operator() (__closure=<optimized out>, __closure=<optimized out>, wait_r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/BaseMgrModule.cc:157
#23 boost::detail::function::void_function_obj_invoker1<ceph_send_command(BaseMgrModule*, PyObject*)::<lambda(int)>::<lambda(int)>, void, int>::invoke(boost::detail::function::function_buffer &, int) (function_obj_ptr=..., a0=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/obj-x86_64-linux-gnu/boost/include/boost/function/function_template.hpp:159
#24 0x0000000000514639 in boost::function1<void, int>::operator() (a0=<optimized out>, this=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/obj-x86_64-linux-gnu/boost/include/boost/function/function_template.hpp:768
#25 FunctionContext::finish (this=<optimized out>, r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/include/Context.h:487
#26 0x0000000000511419 in Context::complete (this=0x1ae61380, r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/include/Context.h:77
#27 0x00000000005c72ce in Objecter::get_latest_version (this=<optimized out>, oldest=<optimized out>, newest=1104054, fin=0x1ae61380) at /build/ceph-14.2.1-198-g869a6a3/src/osdc/Objecter.cc:1953
#28 0x00000000005f9a1d in C_Objecter_GetVersion::finish (this=<optimized out>, r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/osdc/Objecter.cc:1929
#29 0x0000000000511419 in Context::complete (this=0x1694d510, r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/include/Context.h:77
#30 0x00007f70ac89375e in Finisher::finisher_thread_entry() () from target:/usr/lib/ceph/libceph-common.so.0
#31 0x00007f70abb546ba in start_thread (arg=0x7f70a0049700) at pthread_create.c:333
#32 0x00007f70ab37d41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
Possibly other threads of note:
Thread 21 (Thread 0x7f7097327700 (LWP 982544)):
#0 0x00007f70abb5c827 in futex_abstimed_wait_cancelable (private=0, abstime=0x0, expected=0, futex_word=0x274f900) at ../sysdeps/unix/sysv/linux/futex-internal.h:205
#1 do_futex_wait (sem=sem@entry=0x274f900, abstime=0x0) at sem_waitcommon.c:111
#2 0x00007f70abb5c8d4 in __new_sem_wait_slow (sem=0x274f900, abstime=0x0) at sem_waitcommon.c:181
#3 0x00007f70abb5c97a in __new_sem_wait (sem=<optimized out>) at sem_wait.c:29
#4 0x00007f70ac0affe8 in PyThread_acquire_lock () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#5 0x00007f70ac084926 in PyEval_RestoreThread () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#6 0x000000000057fdd5 in Gil::Gil (this=0x7f7097324700, ts=..., new_thread=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/Gil.cc:37
#7 0x0000000000503698 in ActivePyModule::notify_clog (this=0x4b8c780, log_entry=...) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/ActivePyModule.cc:79
#8 0x0000000000514639 in boost::function1<void, int>::operator() (a0=<optimized out>, this=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/obj-x86_64-linux-gnu/boost/include/boost/function/function_template.hpp:768
#9 FunctionContext::finish (this=<optimized out>, r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/include/Context.h:487
#10 0x0000000000511419 in Context::complete (this=0x1d0a0cd0, r=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/include/Context.h:77
#11 0x00007f70ac89375e in Finisher::finisher_thread_entry() () from target:/usr/lib/ceph/libceph-common.so.0
#12 0x00007f70abb546ba in start_thread (arg=0x7f7097327700) at pthread_create.c:333
#13 0x00007f70ab37d41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109

Thread 27 (Thread 0x7f7093b20700 (LWP 982551)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x000000000050d798 in Cond::Wait (mutex=..., this=0x7f7093b1c890) at /build/ceph-14.2.1-198-g869a6a3/src/common/Cond.h:49
#2 C_SaferCond::wait (this=0x7f7093b1c828) at /build/ceph-14.2.1-198-g869a6a3/src/common/Cond.h:196
#3 Command::wait (this=0x7f7093b1c820) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/MgrContext.h:39
#4 ActivePyModules::set_store (this=this@entry=0x4c38280, module_name=..., key=..., val=...) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/ActivePyModules.cc:626
#5 0x000000000051963a in ceph_store_set (self=<optimized out>, args=<optimized out>) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/BaseMgrModule.cc:484
#6 0x00007f70ac08d971 in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#7 0x00007f70ac08c044 in PyEval_EvalFrameEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#8 0x00007f70ac1c305c in PyEval_EvalCodeEx () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#9 0x00007f70ac119370 in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#10 0x00007f70ac0ec273 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#11 0x00007f70ac1603ac in ?? () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#12 0x00007f70ac0ec273 in PyObject_Call () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#13 0x00007f70ac0ed444 in PyObject_CallMethod () from target:/usr/lib/x86_64-linux-gnu/libpython2.7.so.1.0
#14 0x00000000005b1fab in PyModuleRunner::serve (this=0x4b8cf00) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/PyModuleRunner.cc:47
#15 0x00000000005b2605 in PyModuleRunner::PyModuleRunnerThread::entry (this=0x4b8cf48) at /build/ceph-14.2.1-198-g869a6a3/src/mgr/PyModuleRunner.cc:106
#16 0x00007f70abb546ba in start_thread (arg=0x7f7093b20700) at pthread_create.c:333
#17 0x00007f70ab37d41d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:109
This was on the lab cluster.
Some mgr commands were fine (e.g., 'ceph pg ls'), but the module commands ('ceph crash ls') would hang.
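One possible reading of the traces, not confirmed anywhere in this report: Thread 16 and Thread 21 are finisher threads blocked on the same semaphore (futex_word=0x274f900), which looks like the Python GIL, while Thread 27 sits in set_store waiting for its own mon command to complete. If Thread 27 still holds the GIL during that wait (an assumption), the completion it waits for can never be delivered. The sketch below reproduces only that pattern in plain Python. The lock, event, queue, and the function name run_scenario are hypothetical stand-ins, not Ceph code, and timeouts are used so the example terminates instead of hanging.

```python
import queue
import threading


def run_scenario(release_lock_while_waiting, timeout=0.5):
    """Return True if the simulated command completed, False if the wait timed out."""
    gil = threading.Lock()      # hypothetical stand-in for the interpreter lock
    done = threading.Event()    # hypothetical stand-in for the C_SaferCond wait
    work = queue.Queue()        # hypothetical stand-in for a Finisher queue

    def finisher():
        # Run queued callbacks in order; None is the shutdown marker.
        while True:
            callback = work.get()
            if callback is None:
                return
            callback()

    def completion_needing_lock():
        # Models a completion that must take the shared lock before running.
        with gil:
            pass

    worker = threading.Thread(target=finisher, daemon=True)
    worker.start()

    gil.acquire()                       # the module thread is "running Python"
    work.put(completion_needing_lock)   # a completion queued ahead of ours
    work.put(done.set)                  # the completion for our own command

    if release_lock_while_waiting:
        gil.release()                   # drop the lock for the blocking wait
        completed = done.wait(timeout)
        gil.acquire()                   # take it back before continuing
    else:
        completed = done.wait(timeout)  # wait with the lock held: stalls

    gil.release()
    work.put(None)
    worker.join(timeout)
    return completed


if __name__ == "__main__":
    print("lock held during wait:    ", run_scenario(False))
    print("lock released during wait:", run_scenario(True))
```

In this model the wait only finishes when the shared lock is dropped first. Whether the actual fix referenced later in this ticket takes that form is not stated here.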
Updated by haitao chen almost 5 years ago
Is this the same as http://tracker.ceph.com/issues/38537?
Updated by Kefu Chai over 4 years ago
- Status changed from 12 to Fix Under Review
- Pull request ID set to 30468
Updated by Kefu Chai over 4 years ago
This should have been fixed by https://github.com/ceph/ceph/commit/5108860c385cc1d905588d2e92d80295e3222ca4
Updated by Kefu Chai over 4 years ago
- Status changed from Fix Under Review to Pending Backport
- Assignee deleted (Kefu Chai)
- Backport set to luminous,nautilus,mimic
- Pull request ID changed from 30468 to 27280
Updated by Nathan Cutler over 4 years ago
- Status changed from Pending Backport to Duplicate
Duplicate of #39040
Updated by Nathan Cutler over 4 years ago
- Is duplicate of Bug #39040: mgr: deadlock added
Updated by Nathan Cutler over 4 years ago
- Backport deleted (luminous,nautilus,mimic)
Already backported via #39040