Project

General

Profile

Actions

Bug #37997

closed

Bug #38094: mgr: crash list

mgr: abort, dashboard, "recursive lock of PyModuleRegistry::lock"

Added by Ernesto Puerta over 5 years ago. Updated about 5 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
-
Category:
ceph-mgr
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Build from master (a969c9d4061713a8614b64e27fde560e1cfb2d08), containerized RHEL7 default vstart cluster. It has happened several times during restart or disabling/enabling ceph-mgr modules (dashboard). Looks like a race condition issue, a deadlock between several threads accessing config options.

Potentially troublesome code seems to point to this recently merged code dealing with module options

I'll be watchful to see if I manage to get it reproducible.

Trace dump (mgr.x.log):

 ceph version Development (no_version) nautilus (dev)
 1: (Mutex::_will_lock()+0x4c) [0x7fa7d5971ef2]
 2: (Mutex::lock(bool)+0x51) [0x7fa7d5971cfb]
 3: (std::lock_guard<Mutex>::lock_guard(Mutex&)+0x2f) [0x55b9a26f552f]
 4: (PyModuleRegistry::module_exists(std::string const&) const+0x33) [0x55b9a2706d75]
 5: (ActivePyModules::get_typed_config(std::string const&, std::string const&) const+0x50) [0x55b9a26fc3b2]
 6: (()+0x68bb59) [0x55b9a271cb59]
 7: (()+0x68be3c) [0x55b9a271ce3c]
 8: (PyEval_EvalFrameEx()+0x6df0) [0x7fa7d4582cf0]
 9: (PyEval_EvalFrameEx()+0x67bd) [0x7fa7d45826bd]
 10: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 11: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 12: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 13: (()+0x70978) [0x7fa7d450e978]
 14: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 15: (PyEval_EvalFrameEx()+0x17fd) [0x7fa7d457d6fd]
 16: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 17: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 18: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 19: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 20: (PyEval_EvalFrameEx()+0x67bd) [0x7fa7d45826bd]
 21: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 22: (()+0x70978) [0x7fa7d450e978]
 23: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 24: (()+0x5aa55) [0x7fa7d44f8a55]
 25: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 26: (()+0xa2e27) [0x7fa7d4540e27]
 27: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 28: (PyEval_EvalFrameEx()+0x2336) [0x7fa7d457e236]
 29: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 30: (()+0x70978) [0x7fa7d450e978]
 31: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 32: (()+0x5aa55) [0x7fa7d44f8a55]
 33: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 34: (()+0xa2a87) [0x7fa7d4540a87]
 35: (()+0xa179f) [0x7fa7d453f79f]
 36: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 37: (PyEval_CallObjectWithKeywords()+0x47) [0x7fa7d457b8f7]
 38: (ActivePyModule::load(ActivePyModules*)+0x127) [0x55b9a26f2c13]
 39: (ActivePyModules::start_one(std::shared_ptr<PyModule>)+0x155) [0x55b9a26faaff]
 40: (PyModuleRegistry::active_start(DaemonStateIndex&, ClusterState&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, MonClient&, std::shared_ptr<LogChannel>, std::shared_ptr<LogChannel>, Objecter&, Client&, Finisher&, DaemonServer&)+0x4ee) [0x55b9a282deb0]
 41: (Mgr::init()+0xcc4) [0x55b9a27dd8c6]
 42: (()+0x74a9d6) [0x55b9a27db9d6]
 43: (()+0x751b54) [0x55b9a27e2b54]
 44: (boost::function1<void, int>::operator()(int) const+0x6c) [0x55b9a270789c]
 45: (FunctionContext::finish(int)+0x24) [0x55b9a2704624]
 46: (Context::complete(int)+0x27) [0x55b9a27044eb]
 47: (Finisher::finisher_thread_entry()+0x38b) [0x7fa7d5918cf9]
 48: (Finisher::FinisherThread::entry()+0x1c) [0x55b9a27e39a4]
 49: (Thread::entry_wrapper()+0x78) [0x7fa7d59859dc]
 50: (Thread::_entry_func(void*)+0x18) [0x7fa7d598595a]
 51: (()+0x7dd5) [0x7fa7d216ddd5]
 52: (clone()+0x6d) [0x7fa7d0e1dead]

    -1> 2019-01-21 20:01:02.051 7fa7b697c700 -1 /ceph/src/common/lockdep.cc: In function 'int lockdep_will_lock(const char*, int, bool, bool)' thread 7fa7b697c700 time 2019-01-21 20:01:02.041972
/ceph/src/common/lockdep.cc: 305: abort()

 ceph version Development (no_version) nautilus (dev)
 1: (ceph::__ceph_abort(char const*, int, char const*, std::string const&)+0xfe) [0x7fa7d59e54c8]
 2: (lockdep_will_lock(char const*, int, bool, bool)+0x423) [0x7fa7d5af9d61]
 3: (Mutex::_will_lock()+0x4c) [0x7fa7d5971ef2]
 4: (Mutex::lock(bool)+0x51) [0x7fa7d5971cfb]
 5: (std::lock_guard<Mutex>::lock_guard(Mutex&)+0x2f) [0x55b9a26f552f]
 6: (PyModuleRegistry::module_exists(std::string const&) const+0x33) [0x55b9a2706d75]
 7: (ActivePyModules::get_typed_config(std::string const&, std::string const&) const+0x50) [0x55b9a26fc3b2]
 8: (()+0x68bb59) [0x55b9a271cb59]
 9: (()+0x68be3c) [0x55b9a271ce3c]
 10: (PyEval_EvalFrameEx()+0x6df0) [0x7fa7d4582cf0]
 11: (PyEval_EvalFrameEx()+0x67bd) [0x7fa7d45826bd]
 12: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 13: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 14: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 15: (()+0x70978) [0x7fa7d450e978]
 16: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 17: (PyEval_EvalFrameEx()+0x17fd) [0x7fa7d457d6fd]
 18: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 19: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 20: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 21: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 22: (PyEval_EvalFrameEx()+0x67bd) [0x7fa7d45826bd]
 23: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 24: (()+0x70978) [0x7fa7d450e978]
 25: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 26: (()+0x5aa55) [0x7fa7d44f8a55]
 27: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 28: (()+0xa2e27) [0x7fa7d4540e27]
 29: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 30: (PyEval_EvalFrameEx()+0x2336) [0x7fa7d457e236]
 31: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 32: (()+0x70978) [0x7fa7d450e978]
 33: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 34: (()+0x5aa55) [0x7fa7d44f8a55]
 35: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 36: (()+0xa2a87) [0x7fa7d4540a87]
 37: (()+0xa179f) [0x7fa7d453f79f]
 38: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 39: (PyEval_CallObjectWithKeywords()+0x47) [0x7fa7d457b8f7]
 40: (ActivePyModule::load(ActivePyModules*)+0x127) [0x55b9a26f2c13]
 41: (ActivePyModules::start_one(std::shared_ptr<PyModule>)+0x155) [0x55b9a26faaff]
 42: (PyModuleRegistry::active_start(DaemonStateIndex&, ClusterState&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, MonClient&, std::shared_ptr<LogChannel>, std::shared_ptr<LogChannel>, Objecter&, Client&, Finisher&, DaemonServer&)+0x4ee) [0x55b9a282deb0]
 43: (Mgr::init()+0xcc4) [0x55b9a27dd8c6]
 44: (()+0x74a9d6) [0x55b9a27db9d6]
 45: (()+0x751b54) [0x55b9a27e2b54]
 46: (boost::function1<void, int>::operator()(int) const+0x6c) [0x55b9a270789c]
 47: (FunctionContext::finish(int)+0x24) [0x55b9a2704624]
 48: (Context::complete(int)+0x27) [0x55b9a27044eb]
 49: (Finisher::finisher_thread_entry()+0x38b) [0x7fa7d5918cf9]
 50: (Finisher::FinisherThread::entry()+0x1c) [0x55b9a27e39a4]
 51: (Thread::entry_wrapper()+0x78) [0x7fa7d59859dc]
 52: (Thread::_entry_func(void*)+0x18) [0x7fa7d598595a]
 53: (()+0x7dd5) [0x7fa7d216ddd5]
 54: (clone()+0x6d) [0x7fa7d0e1dead]

     0> 2019-01-21 20:01:02.063 7fa7b697c700 -1 *** Caught signal (Aborted) **
 in thread 7fa7b697c700 thread_name:mgr-fin

 ceph version Development (no_version) nautilus (dev)
 1: (()+0x93c7a0) [0x55b9a29cd7a0]
 2: (()+0xf5d0) [0x7fa7d21755d0]
 3: (gsignal()+0x37) [0x7fa7d0d56207]
 4: (abort()+0x148) [0x7fa7d0d578f8]
 5: (ceph::__ceph_abort(char const*, int, char const*, std::string const&)+0x377) [0x7fa7d59e5741]
 6: (lockdep_will_lock(char const*, int, bool, bool)+0x423) [0x7fa7d5af9d61]
 7: (Mutex::_will_lock()+0x4c) [0x7fa7d5971ef2]
 8: (Mutex::lock(bool)+0x51) [0x7fa7d5971cfb]
 9: (std::lock_guard<Mutex>::lock_guard(Mutex&)+0x2f) [0x55b9a26f552f]
 10: (PyModuleRegistry::module_exists(std::string const&) const+0x33) [0x55b9a2706d75]
 11: (ActivePyModules::get_typed_config(std::string const&, std::string const&) const+0x50) [0x55b9a26fc3b2]
 12: (()+0x68bb59) [0x55b9a271cb59]
 13: (()+0x68be3c) [0x55b9a271ce3c]
 14: (PyEval_EvalFrameEx()+0x6df0) [0x7fa7d4582cf0]
 15: (PyEval_EvalFrameEx()+0x67bd) [0x7fa7d45826bd]
 16: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 17: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 18: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 19: (()+0x70978) [0x7fa7d450e978]
 20: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 21: (PyEval_EvalFrameEx()+0x17fd) [0x7fa7d457d6fd]
 22: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 23: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 24: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 25: (PyEval_EvalFrameEx()+0x663c) [0x7fa7d458253c]
 26: (PyEval_EvalFrameEx()+0x67bd) [0x7fa7d45826bd]
 27: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 28: (()+0x70978) [0x7fa7d450e978]
 29: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 30: (()+0x5aa55) [0x7fa7d44f8a55]
 31: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 32: (()+0xa2e27) [0x7fa7d4540e27]
 33: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 34: (PyEval_EvalFrameEx()+0x2336) [0x7fa7d457e236]
 35: (PyEval_EvalCodeEx()+0x7ed) [0x7fa7d458503d]
 36: (()+0x70978) [0x7fa7d450e978]
 37: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 38: (()+0x5aa55) [0x7fa7d44f8a55]
 39: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 40: (()+0xa2a87) [0x7fa7d4540a87]
 41: (()+0xa179f) [0x7fa7d453f79f]
 42: (PyObject_Call()+0x43) [0x7fa7d44e9a63]
 43: (PyEval_CallObjectWithKeywords()+0x47) [0x7fa7d457b8f7]
 44: (ActivePyModule::load(ActivePyModules*)+0x127) [0x55b9a26f2c13]
 45: (ActivePyModules::start_one(std::shared_ptr<PyModule>)+0x155) [0x55b9a26faaff]
 46: (PyModuleRegistry::active_start(DaemonStateIndex&, ClusterState&, std::map<std::string, std::string, std::less<std::string>, std::allocator<std::pair<std::string const, std::string> > > const&, MonClient&, std::shared_ptr<LogChannel>, std::shared_ptr<LogChannel>, Objecter&, Client&, Finisher&, DaemonServer&)+0x4ee) [0x55b9a282deb0]
 47: (Mgr::init()+0xcc4) [0x55b9a27dd8c6]
 48: (()+0x74a9d6) [0x55b9a27db9d6]
 49: (()+0x751b54) [0x55b9a27e2b54]
 50: (boost::function1<void, int>::operator()(int) const+0x6c) [0x55b9a270789c]
 51: (FunctionContext::finish(int)+0x24) [0x55b9a2704624]
 52: (Context::complete(int)+0x27) [0x55b9a27044eb]
 53: (Finisher::finisher_thread_entry()+0x38b) [0x7fa7d5918cf9]
 54: (Finisher::FinisherThread::entry()+0x1c) [0x55b9a27e39a4]
 55: (Thread::entry_wrapper()+0x78) [0x7fa7d59859dc]
 56: (Thread::_entry_func(void*)+0x18) [0x7fa7d598595a]
 57: (()+0x7dd5) [0x7fa7d216ddd5]
 58: (clone()+0x6d) [0x7fa7d0e1dead]

Actions

Also available in: Atom PDF