Bug #54829

open

crash: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const: assert(num_down_in_osds <= num_in_osds)

Added by Telemetry Bot about 2 years ago. Updated 12 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Telemetry
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):

06a12e029fd996fe040cb16b3238e36072f133b895d893d021382accb03674b6
6300ceac18516e975f96085938faf660be07f34ce2fe4eae289b8d684377ffd7
d54953bf630e685148c8bb57241b0739c17a9471813469e8cdf727aa6f75905e
0b7dd1e3d34aa970f475aeb780a21c036daade4bf521cceaee8259acf8f15270
2776c5d89450971681022397da3dbd397afd21386bde33a443d88ea0b93eec80
3b46c39bed1a76369857e89ec3348964b53c19b2585b6b67a4ef8f1931329686
da5da17121fb6317e52db10a9215d4e460a95d4fd54aabcde5a4edbadc76a193


Description

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=0bd94e8d39bae181e6968ee78ce4c954edb223e93eba71d0b1458eac90d106e6

Assert condition: num_down_in_osds <= num_in_osds
Assert function: void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const
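
For readers unfamiliar with the counters, here is a toy model of what the asserted relation expresses. It is only a sketch and not Ceph's implementation: `OsdState` and `count_in_and_down_in` are hypothetical names, and the model assumes both counters are derived from one consistent snapshot of OSD states. Under that assumption the "down and in" OSDs are a subset of the "in" OSDs, so the relation holds by construction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OsdState:
    """Hypothetical, simplified view of one OSD (not a Ceph type)."""
    up: bool    # daemon is reachable
    in_: bool   # OSD takes part in data placement


def count_in_and_down_in(osds):
    """Count 'in' OSDs and 'down and in' OSDs from the same snapshot."""
    num_in = sum(1 for o in osds if o.in_)
    num_down_in = sum(1 for o in osds if o.in_ and not o.up)
    return num_in, num_down_in


snapshot = [
    OsdState(up=True, in_=True),
    OsdState(up=False, in_=True),    # down and in
    OsdState(up=False, in_=False),   # down and out, counted in neither total
]
num_in_osds, num_down_in_osds = count_in_and_down_in(snapshot)

# The relation checked by the failing ceph_assert, on the toy snapshot.
assert num_down_in_osds <= num_in_osds
```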

Sanitized backtrace:

    OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const
    OSDMonitor::encode_pending(std::shared_ptr<MonitorDBStore::Transaction>)
    PaxosService::propose_pending()
    Context::complete(int)
    CommonSafeTimer<std::mutex>::timer_thread()
    CommonSafeTimerThread<std::mutex>::entry()

Crash dump sample:
{
    "archived": "2022-03-16 21:34:28.660957",
    "assert_condition": "num_down_in_osds <= num_in_osds",
    "assert_file": "osd/OSDMap.cc",
    "assert_func": "void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const",
    "assert_line": 5704,
    "assert_msg": "osd/OSDMap.cc: In function 'void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const' thread 7f5676582700 time 2022-03-16T21:03:16.231128+0000\nosd/OSDMap.cc: 5704: FAILED ceph_assert(num_down_in_osds <= num_in_osds)",
    "assert_thread_name": "safe_timer",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12ce0) [0x7f567f3face0]",
        "gsignal()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x1a9) [0x7f56816beba3]",
        "/usr/lib64/ceph/libceph-common.so.2(+0x276d6c) [0x7f56816bed6c]",
        "(OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const+0x3e5e) [0x7f5681b0729e]",
        "(OSDMonitor::encode_pending(std::shared_ptr<MonitorDBStore::Transaction>)+0x3c38) [0x562c11a8c218]",
        "(PaxosService::propose_pending()+0x21a) [0x562c119fd9ea]",
        "/usr/bin/ceph-mon(+0x3fdde9) [0x562c119fdde9]",
        "(Context::complete(int)+0xd) [0x562c118ebdbd]",
        "(CommonSafeTimer<std::mutex>::timer_thread()+0x10f) [0x7f56817b395f]",
        "(CommonSafeTimerThread<std::mutex>::entry()+0x11) [0x7f56817b4cf1]",
        "/lib64/libpthread.so.0(+0x81cf) [0x7f567f3f01cf]",
        "clone()" 
    ],
    "ceph_version": "16.2.7",
    "crash_id": "2022-03-16T21:03:16.233120Z_160e0a7b-4bc1-40b0-bb91-237efdb3f894",
    "entity_name": "mon.676ff60aa64518d09da9bc045a405d9067b9815e",
    "os_id": "centos",
    "os_name": "CentOS Stream",
    "os_version": "8",
    "os_version_id": "8",
    "process_name": "ceph-mon",
    "stack_sig": "06a12e029fd996fe040cb16b3238e36072f133b895d893d021382accb03674b6",
    "timestamp": "2022-03-16T21:03:16.233120Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-0.bpo.3-amd64",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP Debian 5.15.15-2~bpo11+1 (2022-02-03)" 
}
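
When comparing crash dumps by hand, it can help to reduce the raw `backtrace` array to its symbol names. The sketch below is an assumption about how such a reduction might look, not the telemetry service's actual sanitization logic: `symbol_frames` and the `_NOISE` prefix list are hypothetical. It keeps only frames of the form `(symbol+0xOFFSET) [0xADDRESS]` and drops the assert helper frame.

```python
import re

# Frames shaped like "(symbol+0xOFFSET) [0xADDRESS]".
_SYMBOL_FRAME = re.compile(
    r"^\((?P<symbol>.+)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$"
)

# Assumed list of assert-machinery symbols to skip (hypothetical).
_NOISE = ("ceph::__ceph_assert_fail(",)


def symbol_frames(backtrace):
    """Return function signatures from a raw backtrace list.

    Bare names such as "gsignal()" and module+offset frames such as
    "/usr/bin/ceph-mon(+0x3fdde9) [...]" do not match and are skipped.
    """
    symbols = []
    for frame in backtrace:
        match = _SYMBOL_FRAME.match(frame.strip())
        if match is None:
            continue
        symbol = match.group("symbol")
        if symbol.startswith(_NOISE):
            continue
        symbols.append(symbol)
    return symbols
```

Applied to the `backtrace` array of the sample above, this yields the same six frames listed under "Sanitized backtrace", starting with `OSDMap::check_health(...) const` and ending with `CommonSafeTimerThread<std::mutex>::entry()`.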

Actions #1

Updated by Telemetry Bot about 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v16.2.7 added
Actions #2

Updated by Telemetry Bot almost 2 years ago

  • Crash signature (v1) updated (diff)
  • Affected Versions v16.2.9, v17.2.0 added
Actions #3

Updated by Telemetry Bot almost 2 years ago

  • Crash signature (v1) updated (diff)
  • Affected Versions v17.2.1 added
Actions #4

Updated by Laura Flores about 1 year ago

  • Crash signature (v1) updated (diff)

/a/yuriw-2023-01-27_16:33:50-rados-wip-yuri2-testing-2023-01-26-1532-distro-default-smithi/7142297

{
    "crash_id": "2023-01-28T12:21:19.773443Z_2faf40cf-20e8-45e6-976c-ff2117575c1c",
    "timestamp": "2023-01-28T12:21:19.773443Z",
    "process_name": "ceph-mon",
    "entity_name": "mon.a",
    "ceph_version": "18.0.0-2064-g161d7183",
    "utsname_hostname": "smithi123",
    "utsname_sysname": "Linux",
    "utsname_release": "4.18.0-448.el8.x86_64",
    "utsname_version": "#1 SMP Wed Jan 18 15:02:46 UTC 2023",
    "utsname_machine": "x86_64",
    "os_name": "CentOS Stream",
    "os_id": "centos",
    "os_version_id": "8",
    "os_version": "8",
    "assert_condition": "num_down_in_osds <= num_in_osds",
    "assert_func": "void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const",
    "assert_file": "/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-2064-g161d7183/rpm/el8/BUILD/ceph-18.0.0-2064-g161d7183/src/osd/OSDMap.cc",
    "assert_line": 6147,
    "assert_thread_name": "safe_timer",
    "assert_msg": "/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-2064-g161d7183/rpm/el8/BUILD/ceph-18.0.0-2064-g161d7183/src/osd/OSDMap.cc: In function 'void OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const' thread 14542700 time 2023-01-28T12:21:19.705383+0000\n/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/18.0.0-2064-g161d7183/rpm/el8/BUILD/ceph-18.0.0-2064-g161d7183/src/osd/OSDMap.cc: 6147: FAILED ceph_assert(num_down_in_osds <= num_in_osds)\n",
    "backtrace": [
        "/lib64/libpthread.so.0(+0x12cf0) [0x7dd1cf0]",
        "gsignal()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x18f) [0x50eee15]",
        "/usr/lib64/ceph/libceph-common.so.2(+0x2a7f81) [0x50eef81]",
        "(OSDMap::check_health(ceph::common::CephContext*, health_check_map_t*) const+0x3e88) [0x55b40b8]",
        "(OSDMonitor::encode_pending(std::shared_ptr<MonitorDBStore::Transaction>)+0x4406) [0x5d6f56]",
        "(PaxosService::propose_pending()+0x120) [0x563e70]",
        "(Context::complete(int)+0xd) [0x419f3d]",
        "(CommonSafeTimer<std::mutex>::timer_thread()+0x127) [0x51df727]",
        "(CommonSafeTimerThread<std::mutex>::entry()+0x11) [0x51e0851]",
        "/lib64/libpthread.so.0(+0x81ca) [0x7dc71ca]",
        "clone()" 
    ]
}
Actions #5

Updated by Telemetry Bot 12 months ago

  • Crash signature (v1) updated (diff)
  • Affected Versions v17.2.5 added