Bug #44241

closed

mgr: deadlock w/ register/unregister_client

Added by Sage Weil about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Thread 28 (Thread 0x7f99e584f700 (LWP 67)):
#0  0x00007f9a09ee78dd in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f9a09ee0af9 in pthread_mutex_lock () from /lib64/libpthread.so.0
#2  0x0000560e02b05507 in __gthread_mutex_lock (__mutex=<optimized out>) at /usr/include/c++/8/x86_64-redhat-linux/bits/gthr-default.h:748
#3  std::mutex::lock (this=<optimized out>) at /usr/include/c++/8/bits/std_mutex.h:103
#4  0x0000560e02b03264 in std::lock_guard<std::mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/8/bits/std_mutex.h:161
#5  ActivePyModules::register_client (this=this@entry=0x560e0a20e000, name=..., addrs="172.21.15.173:0/1983118714") at /usr/src/debug/ceph-15.1.0-987.g53791ea.el8.x86_64/src/mgr/ActivePyModules.cc:1086
#6  0x0000560e02b0c7c2 in ceph_register_client (self=<optimized out>, args=<optimized out>) at /usr/include/c++/8/string_view:105
#7  0x00007f9a0b1ae4c2 in _PyCFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#8  0x00007f9a0b1bc94d in call_function () from /lib64/libpython3.6m.so.1.0
#9  0x00007f9a0b1eca6a in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#10 0x00007f9a0b1c6da7 in PyEval_EvalCodeEx () from /lib64/libpython3.6m.so.1.0
#11 0x00007f9a0b1c7cd3 in function_call () from /lib64/libpython3.6m.so.1.0
#12 0x00007f9a0b140993 in PyObject_Call () from /lib64/libpython3.6m.so.1.0
#13 0x00007f9a0b156c1e in property_descr_get () from /lib64/libpython3.6m.so.1.0
#14 0x00007f9a0b14368d in _PyObject_GenericGetAttrWithDict () from /lib64/libpython3.6m.so.1.0
#15 0x00007f9a0b1ecafe in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#16 0x00007f9a0b13ebe8 in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#17 0x00007f9a0b1785e0 in fast_function () from /lib64/libpython3.6m.so.1.0
#18 0x00007f9a0b1bca32 in call_function () from /lib64/libpython3.6m.so.1.0
#19 0x00007f9a0b1eca6a in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#20 0x00007f9a0b17842a in fast_function () from /lib64/libpython3.6m.so.1.0
#21 0x00007f9a0b1bca32 in call_function () from /lib64/libpython3.6m.so.1.0
#22 0x00007f9a0b1eca6a in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#23 0x00007f9a0b13fc0a in _PyFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#24 0x00007f9a0b1405ee in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#25 0x00007f9a0b14ba80 in _PyObject_Call_Prepend () from /lib64/libpython3.6m.so.1.0
#26 0x00007f9a0b14035b in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#27 0x00007f9a0b1b8800 in PyObject_CallMethod () from /lib64/libpython3.6m.so.1.0
#28 0x0000560e02bae536 in PyModuleRunner::serve (this=0x560e042f5010) at /usr/src/debug/ceph-15.1.0-987.g53791ea.el8.x86_64/src/mgr/PyModuleRunner.cc:47
#29 0x0000560e02baee15 in PyModuleRunner::PyModuleRunnerThread::entry (this=0x560e042f5058) at /usr/src/debug/ceph-15.1.0-987.g53791ea.el8.x86_64/src/mgr/PyModuleRunner.cc:108
#30 0x00007f9a09ede2de in start_thread () from /lib64/libpthread.so.0
#31 0x00007f9a08a71133 in clone () from /lib64/libc.so.6

Thread 28 is blocked on ActivePyModules::lock (in register_client).
Thread 40 (Thread 0x7f99d90fd700 (LWP 87)):
#0  0x00007f9a09ee47ca in pthread_cond_timedwait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f9a0b130ed6 in PyEval_RestoreThread () from /lib64/libpython3.6m.so.1.0
#2  0x0000560e02afdce4 in ActivePyModules::get_store (this=this@entry=0x560e0a20e000, module_name="telemetry", key="report_id", val=val@entry=0x7f99d90f9960) at /usr/src/debug/ceph-15.1.0-987.g53791ea.el8.x86_64/src/mgr/ActivePy$
#3  0x0000560e02b1233a in ceph_store_get (self=<optimized out>, args=<optimized out>) at /usr/include/c++/8/x86_64-redhat-linux/bits/gthr-default.h:778
#4  0x00007f9a0b1ae4c2 in _PyCFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#5  0x00007f9a0b1bc94d in call_function () from /lib64/libpython3.6m.so.1.0
#6  0x00007f9a0b1eca6a in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#7  0x00007f9a0b13ebe8 in _PyEval_EvalCodeWithName () from /lib64/libpython3.6m.so.1.0
#8  0x00007f9a0b1785e0 in fast_function () from /lib64/libpython3.6m.so.1.0
#9  0x00007f9a0b1bca32 in call_function () from /lib64/libpython3.6m.so.1.0
#10 0x00007f9a0b1eca6a in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#11 0x00007f9a0b17842a in fast_function () from /lib64/libpython3.6m.so.1.0
#12 0x00007f9a0b1bca32 in call_function () from /lib64/libpython3.6m.so.1.0
#13 0x00007f9a0b1eca6a in _PyEval_EvalFrameDefault () from /lib64/libpython3.6m.so.1.0
#14 0x00007f9a0b13fc0a in _PyFunction_FastCallDict () from /lib64/libpython3.6m.so.1.0
#15 0x00007f9a0b1405ee in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#16 0x00007f9a0b14ba80 in _PyObject_Call_Prepend () from /lib64/libpython3.6m.so.1.0
#17 0x00007f9a0b14035b in _PyObject_FastCallDict () from /lib64/libpython3.6m.so.1.0
#18 0x00007f9a0b1b8800 in PyObject_CallMethod () from /lib64/libpython3.6m.so.1.0
#19 0x0000560e02bae536 in PyModuleRunner::serve (this=0x560e042f5910) at /usr/src/debug/ceph-15.1.0-987.g53791ea.el8.x86_64/src/mgr/PyModuleRunner.cc:47
#20 0x0000560e02baee15 in PyModuleRunner::PyModuleRunnerThread::entry (this=0x560e042f5958) at /usr/src/debug/ceph-15.1.0-987.g53791ea.el8.x86_64/src/mgr/PyModuleRunner.cc:108
#21 0x00007f9a09ede2de in start_thread () from /lib64/libpthread.so.0
#22 0x00007f9a08a71133 in clone () from /lib64/libc.so.6

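Thread 40 has locked ActivePyModules::lock (in get_store) and is blocked in
PyEval_RestoreThread(tstate), waiting to reacquire the GIL. Thread 28 was called from Python and appears to still hold the GIL while it waits for the lock, so neither thread can proceed.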
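
One common way to avoid this kind of lock-order inversion is to make every path drop the GIL before it waits on the mutex. The following is a minimal Python sketch of that idea, not the change made in the pull request. The names `gil`, `module_lock`, `get_store_like`, `register_client_safe`, `serve`, and `run_demo` are hypothetical, the two `threading.Lock` objects are only stand-ins for the real GIL and ActivePyModules::lock, and the address string is a placeholder.

```python
import threading

# Stand-ins (assumptions): plain locks modelling the Python GIL and
# ActivePyModules::lock. The real mgr code uses the CPython C API.
gil = threading.Lock()
module_lock = threading.Lock()

store = {"report_id": "example-id"}  # hypothetical module store
clients = {}                         # hypothetical client registry


def get_store_like(key):
    """Caller holds `gil`. Drop it, take the lock, then retake `gil`."""
    gil.release()
    with module_lock:
        # Waiting for `gil` while holding `module_lock` is safe only if
        # no thread ever waits for `module_lock` while holding `gil`.
        gil.acquire()
        return store.get(key)


def register_client_safe(name, addrs):
    """Caller holds `gil`. Drop it before waiting on `module_lock`."""
    gil.release()
    try:
        with module_lock:
            clients[name] = addrs
    finally:
        # Retake `gil` only after `module_lock` has been released.
        gil.acquire()


def serve(work, iterations):
    """Model a module's serve() loop, which runs with `gil` held."""
    with gil:
        for i in range(iterations):
            work(i)


def run_demo(iterations=500, timeout=10.0):
    """Run both call paths concurrently; return True if both finish."""
    threads = [
        threading.Thread(
            target=serve,
            args=(lambda i: get_store_like("report_id"), iterations),
            daemon=True,
        ),
        threading.Thread(
            target=serve,
            args=(lambda i: register_client_safe(f"client-{i}", "192.0.2.1:0/1"),
                  iterations),
            daemon=True,
        ),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join(timeout)
    return not any(t.is_alive() for t in threads)


if __name__ == "__main__":
    print(run_demo())
```

If `register_client_safe` instead kept `gil` held while entering `with module_lock`, the two threads could end up in the state shown in the backtraces, each holding the lock the other is waiting for.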
Actions #1

Updated by Sage Weil about 4 years ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 33464
Actions #2

Updated by Sebastian Wagner about 4 years ago

  • Project changed from RADOS to mgr
Actions #3

Updated by Sage Weil about 4 years ago

  • Status changed from Fix Under Review to Resolved