Bug #41736

closed

"ActivePyModule.cc: 54: FAILED ceph_assert(pClassInstance != nullptr)" due to race when loading modules

Added by Gavin Baker over 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: octopus,nautilus
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

After a yum update and a service restart on a Ceph cluster, the manager (ceph-mgr) services crash and fail to restart. The main error appears to be: "mgr operator() Failed to run module in active mode ('rbd_support')".

Sep  9 19:13:28 ceph-mgr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Sep  9 19:13:28 ceph-mgr: -235> 2019-09-09 19:13:28.427 7fbbed24a700 -1 mgr load Failed to construct class in 'rbd_support'
Sep  9 19:13:28 ceph-mgr: -218> 2019-09-09 19:13:28.427 7fbbed24a700 -1 mgr load Traceback (most recent call last):
Sep  9 19:13:28 ceph-mgr: File "/usr/share/ceph/mgr/rbd_support/module.py", line 1326, in __init__
Sep  9 19:13:28 ceph-mgr: self.task = TaskHandler(self)
Sep  9 19:13:28 ceph-mgr: File "/usr/share/ceph/mgr/rbd_support/module.py", line 610, in __init__
Sep  9 19:13:28 ceph-mgr: self.init_task_queue()
Sep  9 19:13:28 ceph-mgr: File "/usr/share/ceph/mgr/rbd_support/module.py", line 674, in init_task_queue
Sep  9 19:13:28 ceph-mgr: self.load_task_queue(ioctx, pool_name)
Sep  9 19:13:28 ceph-mgr: File "/usr/share/ceph/mgr/rbd_support/module.py", line 708, in load_task_queue
Sep  9 19:13:28 ceph-mgr: ioctx.operate_read_op(read_op, RBD_TASK_OID)
Sep  9 19:13:28 ceph-mgr: File "rados.pyx", line 516, in rados.requires.wrapper.validate_func (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.3/rpm/el7/BUILD/ceph-14.2.3/build/src/pybind/rados/pyrex/rados.c:4721)
Sep  9 19:13:28 ceph-mgr: File "rados.pyx", line 3474, in rados.Ioctx.operate_read_op (/home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.3/rpm/el7/BUILD/ceph-14.2.3/build/src/pybind/rados/pyrex/rados.c:36554)
Sep  9 19:13:28 ceph-mgr: PermissionError: [errno 1] Failed to operate read op for oid rbd_task
Sep  9 19:13:28 ceph-mgr: -217> 2019-09-09 19:13:28.583 7fbbed24a700 -1 mgr operator() Failed to run module in active mode ('rbd_support')
Sep  9 19:13:28 ceph-mgr: -128> 2019-09-09 19:13:28.590 7fbbed24a700 -1 /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.3/rpm/el7/BUILD/ceph-14.2.3/src/mgr/ActivePyModule.cc: In function 'void ActivePyModule::notify(const string&, const string&)' thread 7fbbed24a700 time 2019-09-09 19:13:28.590091
Sep  9 19:13:28 ceph-mgr: /home/jenkins-build/build/workspace/ceph-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos7/DIST/centos7/MACHINE_SIZE/huge/release/14.2.3/rpm/el7/BUILD/ceph-14.2.3/src/mgr/ActivePyModule.cc: 54: FAILED ceph_assert(pClassInstance != nullptr)
Sep  9 19:13:28 ceph-mgr: ceph version 14.2.3 (0f776cf838a1ae3130b2b73dc26be9c95c6ccc39) nautilus (stable)
Sep  9 19:13:28 ceph-mgr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x14a) [0x7fbc0e38eac2]
Sep  9 19:13:28 ceph-mgr: 2: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7fbc0e38ec90]
Sep  9 19:13:28 ceph-mgr: 3: (ActivePyModule::notify(std::string const&, std::string const&)+0x4f5) [0x56043aea69f5]
Sep  9 19:13:28 ceph-mgr: 4: (FunctionContext::finish(int)+0x2c) [0x56043aeb8eac]
Sep  9 19:13:28 ceph-mgr: 5: (Context::complete(int)+0x9) [0x56043aeb5659]
Sep  9 19:13:28 ceph-mgr: 6: (Finisher::finisher_thread_entry()+0x156) [0x7fbc0e3d5cc6]
Sep  9 19:13:28 ceph-mgr: 7: (()+0x7dd5) [0x7fbc0bc8ddd5]
Sep  9 19:13:28 ceph-mgr: 8: (clone()+0x6d) [0x7fbc0a93702d]
Sep  9 19:13:28 ceph-mgr: -106> 2019-09-09 19:13:28.591 7fbbed24a700 -1 *** Caught signal (Aborted) **
Sep  9 19:13:28 ceph-mgr: in thread 7fbbed24a700 thread_name:mgr-fin
Sep  9 19:13:28 ceph-mgr: ceph version 14.2.3 (0f776cf838a1ae3130b2b73dc26be9c95c6ccc39) nautilus (stable)
Sep  9 19:13:28 ceph-mgr: 1: (()+0xf5d0) [0x7fbc0bc955d0]
Sep  9 19:13:28 ceph-mgr: 2: (gsignal()+0x37) [0x7fbc0a86f2c7]
Sep  9 19:13:28 ceph-mgr: 3: (abort()+0x148) [0x7fbc0a8709b8]
Sep  9 19:13:28 ceph-mgr: 4: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x7fbc0e38eb11]
Sep  9 19:13:28 ceph-mgr: 5: (ceph::__ceph_assertf_fail(char const*, char const*, int, char const*, char const*, ...)+0) [0x7fbc0e38ec90]
Sep  9 19:13:28 ceph-mgr: 6: (ActivePyModule::notify(std::string const&, std::string const&)+0x4f5) [0x56043aea69f5]
Sep  9 19:13:28 ceph-mgr: 7: (FunctionContext::finish(int)+0x2c) [0x56043aeb8eac]
Sep  9 19:13:28 ceph-mgr: 8: (Context::complete(int)+0x9) [0x56043aeb5659]
Sep  9 19:13:28 ceph-mgr: 9: (Finisher::finisher_thread_entry()+0x156) [0x7fbc0e3d5cc6]
Sep  9 19:13:28 ceph-mgr: 10: (()+0x7dd5) [0x7fbc0bc8ddd5]
Sep  9 19:13:28 ceph-mgr: 11: (clone()+0x6d) [0x7fbc0a93702d]
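
One plausible reading of the log above is that the 'rbd_support' class fails to construct, the module nevertheless stays reachable by notifications, and a later notify on the finisher thread hits the assertion. The following is a minimal Python sketch of that sequence, not the actual ceph-mgr code: `ActiveModule`, `FailingModule`, `run_sequence`, and the `guard` flag are hypothetical names introduced for illustration, and the "unregister on failure" branch is only an assumed mitigation, not the fix that was merged.

```python
import queue


class FailingModule:
    """Hypothetical stand-in for a module whose constructor raises."""

    def __init__(self):
        raise PermissionError(1, "Failed to operate read op for oid rbd_task")


class ActiveModule:
    """Hypothetical wrapper playing a role similar to ActivePyModule."""

    def __init__(self, name, cls):
        self.name = name
        self.cls = cls
        self.instance = None  # stays None if construction fails

    def load(self):
        try:
            self.instance = self.cls()
            return True
        except Exception as exc:
            print(f"Failed to construct class in '{self.name}': {exc}")
            return False

    def notify(self, notify_type, notify_id):
        # Same shape as the failed check in the log: the instance must exist.
        if self.instance is None:
            raise AssertionError("instance is None")
        self.instance.notify(notify_type, notify_id)


def run_sequence(guard):
    module = ActiveModule("rbd_support", FailingModule)
    registry = {module.name: module}  # visible to notifiers before load() finishes
    finisher = queue.SimpleQueue()    # stand-in for the finisher work queue

    # A notification is queued while the module is still registered.
    finisher.put(lambda: registry["rbd_support"].notify("osd_map", ""))

    if not module.load() and guard:
        registry.pop(module.name)  # assumed mitigation: unregister on failure

    results = []
    while not finisher.empty():
        job = finisher.get()
        try:
            job()
            results.append("ok")
        except AssertionError:
            results.append("assert")
        except KeyError:
            results.append("skipped")
    return results
```

Calling `run_sequence(False)` reproduces the assertion path in this toy model, while `run_sequence(True)` shows the queued notification being skipped once the failed module is no longer registered.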

Related issues 2 (0 open, 2 closed)

Copied to mgr - Backport #46117: octopus: "ActivePyModule.cc: 54: FAILED ceph_assert(pClassInstance != nullptr)" due to race when loading modules (Resolved, Laura Paduano)
Copied to mgr - Backport #46118: nautilus: "ActivePyModule.cc: 54: FAILED ceph_assert(pClassInstance != nullptr)" due to race when loading modules (Resolved, Nathan Cutler)