Bug #54700

open

crash: DaemonServer::got_service_map()::<lambda(const ServiceMap&)>: assert(pending_service_map.epoch > service_map.epoch)

Added by Telemetry Bot about 2 years ago. Updated almost 2 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Telemetry
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):

    a0f613f0f35695b63b31a8ed0fc2e54edbbc18a34467a7f2afa9fa9e4575e51d
    ce2fb2fd023ab69c72c793405ea59eec5224cfc42910d743b735fdc27a1748d0


Description

New crash events were reported via Telemetry with newer versions (['16.2.6', '16.2.7']) than encountered in Tracker (16.2.0).

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=199402a5332a2f66405181b47f3c72ea87c36f95026fe56c4965d27158a0f81f

Assert condition: pending_service_map.epoch > service_map.epoch
Assert function: DaemonServer::got_service_map()::<lambda(const ServiceMap&)>
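
The condition above is a strict inequality, so equal epochs also trip the assert. A toy model of just that comparison follows; the class and function names are hypothetical and do not mirror the DaemonServer code, which is not reproduced here.

```python
from dataclasses import dataclass


@dataclass
class ServiceMapModel:
    """Hypothetical stand-in for a service map; only the epoch matters here."""
    epoch: int


def pending_is_ahead(pending: ServiceMapModel, received: ServiceMapModel) -> bool:
    """Mirror of the asserted condition: pending.epoch > received.epoch."""
    return pending.epoch > received.epoch


# The pending map is strictly newer: the condition holds.
ok = pending_is_ahead(ServiceMapModel(epoch=5), ServiceMapModel(epoch=4))

# Equal epochs: the condition is false, which is the case the assert rejects.
equal = pending_is_ahead(ServiceMapModel(epoch=4), ServiceMapModel(epoch=4))
```

This models the check only; it says nothing about how the two epochs end up equal or reversed in a running ceph-mgr.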

Sanitized backtrace:

    DaemonServer::got_service_map()
    Mgr::ms_dispatch2(boost::intrusive_ptr<Message> const&)
    MgrStandby::ms_dispatch2(boost::intrusive_ptr<Message> const&)
    DispatchQueue::entry()
    DispatchQueue::DispatchThread::entry()

Crash dump sample:
{
    "assert_condition": "pending_service_map.epoch > service_map.epoch",
    "assert_file": "mgr/DaemonServer.cc",
    "assert_func": "DaemonServer::got_service_map()::<lambda(const ServiceMap&)>",
    "assert_line": 2934,
    "assert_msg": "mgr/DaemonServer.cc: In function 'DaemonServer::got_service_map()::<lambda(const ServiceMap&)>' thread 7f118f0b6700 time 2022-03-01T14:14:51.700558+0100\nmgr/DaemonServer.cc: 2934: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch)",
    "assert_thread_name": "ms_dispatch",
    "backtrace": [
        "/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f1194b2d140]",
        "gsignal()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x16e) [0x7f1194f90090]",
        "/usr/lib/ceph/libceph-common.so.2(+0x2511d1) [0x7f1194f901d1]",
        "(DaemonServer::got_service_map()+0xb92) [0x55823e4d4652]",
        "(Mgr::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x19b) [0x55823e51999b]",
        "(MgrStandby::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0xac) [0x55823e52683c]",
        "(Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0x468) [0x7f11951be478]",
        "(DispatchQueue::entry()+0x5ef) [0x7f11951bbb7f]",
        "(DispatchQueue::DispatchThread::entry()+0xd) [0x7f119527a9bd]",
        "/lib/x86_64-linux-gnu/libpthread.so.0(+0x8ea7) [0x7f1194b21ea7]",
        "clone()" 
    ],
    "ceph_version": "16.2.7",
    "crash_id": "2022-03-01T13:14:51.743713Z_5b27d8b8-b81c-4042-9dc5-e9cd7a5a24ed",
    "entity_name": "mgr.ef482b661122aaf7ed735bb9351e769cc42b0aa5",
    "os_id": "11",
    "os_name": "Debian GNU/Linux 11 (bullseye)",
    "os_version": "11 (bullseye)",
    "os_version_id": "11",
    "process_name": "ceph-mgr",
    "stack_sig": "a0f613f0f35695b63b31a8ed0fc2e54edbbc18a34467a7f2afa9fa9e4575e51d",
    "timestamp": "2022-03-01T13:14:51.743713Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.13.19-4-pve",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PVE 5.13.19-9 (Mon, 07 Feb 2022 11:01:14 +0100)" 
}
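
When reading a dump like the one above, it can help to strip the offsets and addresses from the raw frames and keep only the symbol names. The sketch below is one way to do that; `frame_symbol` and both regular expressions are assumptions made for illustration. It is not how Telemetry produces the sanitized backtrace or the crash signatures.

```python
import re
from typing import Optional

# Frame of the form "(symbol+0xOFFSET) [0xADDRESS]".
_SYMBOL_FRAME = re.compile(
    r"^\((?P<sym>.+)\+0x[0-9a-fA-F]+\)\s+\[0x[0-9a-fA-F]+\]$"
)
# Bare frame such as "abort()" with no offset or address.
_BARE_FRAME = re.compile(r"^[A-Za-z_][\w:]*\(\)$")


def frame_symbol(frame: str) -> Optional[str]:
    """Return the symbol named by a raw frame, or None if it names none."""
    frame = frame.strip()
    match = _SYMBOL_FRAME.match(frame)
    if match:
        return match.group("sym")
    if _BARE_FRAME.match(frame):
        return frame
    # Module-plus-offset frames (for example the libpthread ones) carry no symbol.
    return None


# A few frames copied from the crash dump sample.
frames = [
    "/lib/x86_64-linux-gnu/libpthread.so.0(+0x14140) [0x7f1194b2d140]",
    "abort()",
    "(DaemonServer::got_service_map()+0xb92) [0x55823e4d4652]",
    "(Mgr::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x19b) [0x55823e51999b]",
]
symbols = [s for s in map(frame_symbol, frames) if s is not None]
```

Frames that resolve to `None` are dropped, so `symbols` holds only named functions in their original order.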


Files

api_test_failure.txt.gz (175 KB), Laura Flores, 04/25/2022 07:48 PM

Related issues 4 (0 open, 4 closed)

Related to mgr - Bug #51929: crash: DaemonServer::got_service_map()::<lambda(const ServiceMap&)>: assert(pending_service_map.epoch > service_map.epoch) (Duplicate)
Related to mgr - Bug #48022: mgr/DaemonServer.cc: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch) (Resolved)
Related to mgr - Backport #49908: pacific: mgr/DaemonServer.cc: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch) (Resolved, Neha Ojha)
Related to mgr - Backport #53198: octopus: mgr/DaemonServer.cc: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch) (Resolved)

Actions #1

Updated by Telemetry Bot about 2 years ago

  • Related to Bug #51929: crash: DaemonServer::got_service_map()::<lambda(const ServiceMap&)>: assert(pending_service_map.epoch > service_map.epoch) added
Actions #2

Updated by Telemetry Bot about 2 years ago

  • Related to Bug #48022: mgr/DaemonServer.cc: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch) added
Actions #3

Updated by Telemetry Bot about 2 years ago

  • Related to Backport #49908: pacific: mgr/DaemonServer.cc: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch) added
Actions #4

Updated by Telemetry Bot about 2 years ago

  • Related to Backport #53198: octopus: mgr/DaemonServer.cc: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch) added
Actions #5

Updated by Telemetry Bot about 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v15.2.13, v16.2.6, v16.2.7 added
Actions #6

Updated by Laura Flores almost 2 years ago

Caught in a Jenkins API test: https://jenkins.ceph.com/job/ceph-api/35612/console

2022-04-25T19:16:08.725+0000 7f98d0734700 -1 ../src/mgr/DaemonServer.cc: In function 'DaemonServer::got_service_map()::<lambda(const ServiceMap&)>' thread 7f98d0734700 time 2022-04-25T19:16:08.724479+0000
../src/mgr/DaemonServer.cc: 2992: FAILED ceph_assert(pending_service_map.epoch > service_map.epoch)

 ceph version Development (no_version) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x127) [0x7f98d6ba006e]
 2: (ceph::register_assert_context(ceph::common::CephContext*)+0) [0x7f98d6ba0299]
 3: ./bin/./ceph-mgr(+0x2d5524) [0x55abd3088524]
 4: (DaemonServer::got_service_map()+0x80) [0x55abd309c112]
 5: (Mgr::handle_service_map(boost::intrusive_ptr<MServiceMap>)+0x29f) [0x55abd30d54db]
 6: (Mgr::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x265) [0x55abd30d57b1]
 7: (MgrStandby::ms_dispatch2(boost::intrusive_ptr<Message> const&)+0x401) [0x55abd30df3df]
 8: (Messenger::ms_deliver_dispatch(boost::intrusive_ptr<Message> const&)+0xba) [0x7f98d6d175ca]
 9: (DispatchQueue::entry()+0xba6) [0x7f98d6d14c66]
 10: (DispatchQueue::DispatchThread::entry()+0x11) [0x7f98d6df39d1]
 11: (Thread::entry_wrapper()+0x43) [0x7f98d6b71a75]
 12: (Thread::_entry_func(void*)+0xd) [0x7f98d6b71a91]
 13: /lib/x86_64-linux-gnu/libpthread.so.0(+0x9609) [0x7f98d6531609]
 14: clone()

See the attached text file if the link above has expired.

Actions #7

Updated by Neha Ojha almost 2 years ago

Laura Flores wrote:

Caught in a Jenkins API test: https://jenkins.ceph.com/job/ceph-api/35612/console
[...]

See the attached text file if the link above has expired.

This is probably a duplicate of https://tracker.ceph.com/issues/51835

Actions #8

Updated by Laura Flores almost 2 years ago

Thanks Neha, I was trying to find this one!
