Bug #45591

mgr: FAILED ceph_assert(daemon != nullptr)

Added by Patrick Donnelly over 2 years ago. Updated 5 months ago.

Status: Pending Backport
Priority: Urgent
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags: backport_processed
Backport: pacific,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2020-05-16T11:54:45.842 INFO:tasks.ceph.mgr.x.smithi083.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-1640-g543315c9344/rpm/el8/BUILD/ceph-16.0.0-1640-g543315c9344/src/mgr/DaemonServer.cc: In function 'bool DaemonServer::handle_report(ceph::ref_t<MMgrReport>&)' thread 7fe985580700 time 2020-05-16T11:54:45.841099+0000
2020-05-16T11:54:45.842 INFO:tasks.ceph.mgr.x.smithi083.stderr:/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/16.0.0-1640-g543315c9344/rpm/el8/BUILD/ceph-16.0.0-1640-g543315c9344/src/mgr/DaemonServer.cc: 610: FAILED ceph_assert(daemon != nullptr)
2020-05-16T11:54:45.844 INFO:tasks.ceph.mgr.x.smithi083.stderr: ceph version 16.0.0-1640-g543315c9344 (543315c934420269aa12ef2f9dec2c9eadb4fa6f) pacific (dev)
2020-05-16T11:54:45.844 INFO:tasks.ceph.mgr.x.smithi083.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x158) [0x7fe9ab892d90]
2020-05-16T11:54:45.844 INFO:tasks.ceph.mgr.x.smithi083.stderr: 2: (()+0x275faa) [0x7fe9ab892faa]
2020-05-16T11:54:45.845 INFO:tasks.ceph.mgr.x.smithi083.stderr: 3: (DaemonServer::handle_report(boost::intrusive_ptr<MMgrReport> const&)+0x13fd) [0x557354eaa34d]
2020-05-16T11:54:45.845 INFO:tasks.ceph.mgr.x.smithi083.stderr: 4: (DaemonServer::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x177) [0x557354ec0e17]
2020-05-16T11:54:45.845 INFO:tasks.ceph.mgr.x.smithi083.stderr: 5: (DispatchQueue::entry()+0x126a) [0x7fe9abab1efa]
2020-05-16T11:54:45.846 INFO:tasks.ceph.mgr.x.smithi083.stderr: 6: (DispatchQueue::DispatchThread::entry()+0x11) [0x7fe9abb549d1]
2020-05-16T11:54:45.846 INFO:tasks.ceph.mgr.x.smithi083.stderr: 7: (()+0x82de) [0x7fe9a9d1f2de]
2020-05-16T11:54:45.846 INFO:tasks.ceph.mgr.x.smithi083.stderr: 8: (clone()+0x43) [0x7fe9a88b2133]
2020-05-16T11:54:45.847 INFO:tasks.ceph.mgr.x.smithi083.stderr:*** Caught signal (Aborted) **
2020-05-16T11:54:45.847 INFO:tasks.ceph.mgr.x.smithi083.stderr: in thread 7fe985580700 thread_name:ms_dispatch

From: /ceph/teuthology-archive/pdonnell-2020-05-16_06:07:05-fs-wip-pdonnell-testing-20200516.030215-distro-basic-smithi/5060503/teuthology.log


Related issues

Copied to mgr - Backport #57473: quincy: mgr: FAILED ceph_assert(daemon != nullptr) New
Copied to mgr - Backport #57474: pacific: mgr: FAILED ceph_assert(daemon != nullptr) New

History

#1 Updated by Patrick Donnelly almost 2 years ago

/ceph/teuthology-archive/pdonnell-2021-05-01_09:07:09-fs-wip-pdonnell-testing-20210501.040415-distro-basic-smithi/6087856/teuthology.log

#2 Updated by Patrick Donnelly over 1 year ago

  • Priority changed from High to Urgent

/ceph/teuthology-archive/pdonnell-2021-05-18_06:01:30-fs-wip-pdonnell-testing-20210518.025642-distro-basic-smithi/6119891/teuthology.log

#3 Updated by Radoslaw Zarzynski 7 months ago

  • Status changed from New to In Progress
  • Assignee set to Radoslaw Zarzynski

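I think the cause is a race condition between the exists() and get() calls on DaemonStateIndex: the entry can be removed after exists() returns true but before get() runs, so get() returns nullptr and the assert fires.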

The fix:

diff --git a/src/mgr/DaemonServer.cc b/src/mgr/DaemonServer.cc
index 03a5867a338..4956aaeb159 100644
--- a/src/mgr/DaemonServer.cc
+++ b/src/mgr/DaemonServer.cc
@@ -647,9 +647,8 @@ bool DaemonServer::handle_report(const ref_t<MMgrReport>& m)

     DaemonStatePtr daemon;
     // Look up the DaemonState
-    if (daemon_state.exists(key)) {
+    if (daemon = daemon_state.get(key); daemon != nullptr) {
       dout(20) << "updating existing DaemonState for " << key << dendl;
-      daemon = daemon_state.get(key);
     } else {
       locker.unlock();
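
To illustrate the check-then-act pattern behind the diff above, here is a minimal, self-contained sketch. It is not the Ceph code: `StateIndex`, `lookup_racy`, `lookup_single` and the `between` hook are hypothetical stand-ins, and the sketch assumes that `exists()` and `get()` each take the index's lock separately, so another thread can remove the entry between the two calls. The hook simulates that concurrent removal deterministically.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <mutex>
#include <string>

// Hypothetical stand-ins for illustration only; not the real Ceph classes.
struct DaemonState { std::string name; };
using DaemonStatePtr = std::shared_ptr<DaemonState>;

class StateIndex {
public:
  void insert(const std::string& key) {
    std::lock_guard<std::mutex> l(lock_);
    auto state = std::make_shared<DaemonState>();
    state->name = key;
    all_[key] = state;
  }
  void rm(const std::string& key) {
    std::lock_guard<std::mutex> l(lock_);
    all_.erase(key);
  }
  bool exists(const std::string& key) const {
    std::lock_guard<std::mutex> l(lock_);
    return all_.count(key) > 0;
  }
  // Returns nullptr when the key is absent.
  DaemonStatePtr get(const std::string& key) const {
    std::lock_guard<std::mutex> l(lock_);
    auto it = all_.find(key);
    if (it == all_.end()) {
      return nullptr;
    }
    return it->second;
  }
private:
  mutable std::mutex lock_;
  std::map<std::string, DaemonStatePtr> all_;
};

struct RacyResult {
  bool existed;           // what exists() reported
  DaemonStatePtr daemon;  // what the later get() returned
};

// Racy pattern: two separate lock acquisitions. `between` stands in for
// another thread modifying the index after exists() and before get().
template <typename F>
RacyResult lookup_racy(StateIndex& idx, const std::string& key, F between) {
  RacyResult r{idx.exists(key), nullptr};
  if (r.existed) {
    between();
    r.daemon = idx.get(key);  // may be nullptr despite existed == true
  }
  return r;
}

// Single-lookup pattern: one call decides both "is it there?" and
// "what is it?", so the two answers cannot disagree.
DaemonStatePtr lookup_single(const StateIndex& idx, const std::string& key) {
  return idx.get(key);
}
```

With the racy helper, a removal injected through `between` yields `existed == true` together with a null pointer, which is the state the failed assert detects. With the single lookup, the caller branches on the returned pointer, and a `shared_ptr` obtained before a removal stays valid afterwards.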

#4 Updated by Radoslaw Zarzynski 7 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 47002

#5 Updated by Radoslaw Zarzynski 5 months ago

  • Backport set to pacific,quincy

#6 Updated by Radoslaw Zarzynski 5 months ago

  • Status changed from Fix Under Review to Pending Backport

#7 Updated by Backport Bot 5 months ago

  • Copied to Backport #57473: quincy: mgr: FAILED ceph_assert(daemon != nullptr) added

#8 Updated by Backport Bot 5 months ago

  • Copied to Backport #57474: pacific: mgr: FAILED ceph_assert(daemon != nullptr) added

#9 Updated by Backport Bot 5 months ago

  • Tags set to backport_processed
