Bug #56896

crash: int OSD::shutdown(): assert(end_time - start_time_func < cct->_conf->osd_fast_shutdown_timeout)

Added by Telemetry Bot over 1 year ago. Updated 10 months ago.

Status:
New
Priority:
Normal
Category:
-
Target version:
-
% Done:
0%

Source:
Telemetry
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
2af5542c6650dea698ae7186f067d3beafbf3bd30a36d7d19c304f29ad9b5802


Description

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=50bf2266e28cc1764b47775be5a14240d2473e019eebac05627b7efb4c77ec8d

Assert condition: end_time - start_time_func < cct->_conf->osd_fast_shutdown_timeout
Assert function: int OSD::shutdown()
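
For readers unfamiliar with this kind of check, the assert condition compares the elapsed shutdown time against a configured limit. Below is a minimal standalone sketch of that comparison, not the actual Ceph code: the function name shutdown_within_timeout and its parameter types are hypothetical, and only the variable names start_time_func and end_time come from the assert condition above.

```cpp
#include <chrono>

// Hypothetical stand-in for the comparison in the assert condition.
// This is NOT the code from osd/OSD.cc; it only mirrors the expression
//   end_time - start_time_func < osd_fast_shutdown_timeout
// Returns true when the shutdown finished strictly within the timeout.
bool shutdown_within_timeout(
    std::chrono::steady_clock::time_point start_time_func,
    std::chrono::steady_clock::time_point end_time,
    std::chrono::duration<double> fast_shutdown_timeout) {
  // Elapsed time is converted to a common duration type for the comparison.
  return (end_time - start_time_func) < fast_shutdown_timeout;
}
```

In this sketch, a shutdown that takes 10 seconds against a 15-second limit returns true, while one that takes 20 seconds returns false. In the reported crash, the equivalent condition evaluating to false is what triggers ceph_assert and aborts the process.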

Sanitized backtrace:

    pthread_kill()
    raise()
    OSD::shutdown()
    SignalHandler::entry()

Crash dump sample:
{
    "assert_condition": "end_time - start_time_func < cct->_conf->osd_fast_shutdown_timeout",
    "assert_file": "osd/OSD.cc",
    "assert_func": "int OSD::shutdown()",
    "assert_line": 4323,
    "assert_msg": "osd/OSD.cc: In function 'int OSD::shutdown()' thread 7fb8e1d6b640 time 2022-07-19T04:46:25.632566-0400\nosd/OSD.cc: 4323: FAILED ceph_assert(end_time - start_time_func < cct->_conf->osd_fast_shutdown_timeout)",
    "assert_thread_name": "signal_handler",
    "backtrace": [
        "/lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7fb8e6474520]",
        "pthread_kill()",
        "raise()",
        "abort()",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x182) [0x5624886557f3]",
        "/usr/bin/ceph-osd(+0x58c955) [0x562488655955]",
        "(OSD::shutdown()+0x1678) [0x562488799228]",
        "(SignalHandler::entry()+0x62d) [0x562488ddeb1d]",
        "/lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7fb8e64c6b43]",
        "/lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7fb8e6558a00]" 
    ],
    "ceph_version": "17.2.0",
    "crash_id": "2022-07-19T08:46:25.642368Z_707f1918-6002-4cf2-91c4-3eabf43e2ad6",
    "entity_name": "osd.92814b044f442681826ea0cbcccfd81e98868fb3",
    "os_id": "22.04",
    "os_name": "Ubuntu 22.04 LTS",
    "os_version": "22.04 LTS (Jammy Jellyfish)",
    "os_version_id": "22.04",
    "process_name": "ceph-osd",
    "stack_sig": "2af5542c6650dea698ae7186f067d3beafbf3bd30a36d7d19c304f29ad9b5802",
    "timestamp": "2022-07-19T08:46:25.642368Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.15.0-37-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#39-Ubuntu SMP Wed Jun 1 19:16:45 UTC 2022" 
}


Related issues

Duplicates RADOS - Bug #61140: crash: int OSD::shutdown(): assert(end_time - start_time_func < cct->_conf->osd_fast_shutdown_timeout) (Pending Backport)

History

#1 Updated by Telemetry Bot over 1 year ago

  • Crash signature (v1) updated
  • Crash signature (v2) updated
  • Affected Versions v17.2.0 added

#2 Updated by Nikola Ciprich 12 months ago

Hi, just reporting that I hit this problem on a few OSDs running 17.2.5.

#3 Updated by Igor Fedotov 10 months ago

  • Assignee set to Radoslaw Zarzynski

Looking at the OSD code, I don't see much sense in this assertion or the related timeout parameter.
Shouldn't we just remove this piece of code?
Why do we crash the OSD if it's unable to shut down within 15 seconds? What's special about this timeout?
@Radek - what do you think?

#4 Updated by Telemetry Bot 10 months ago

  • Affected Versions v17.2.5 added

#5 Updated by Matan Breizman 8 months ago

  • Duplicates Bug #61140: crash: int OSD::shutdown(): assert(end_time - start_time_func < cct->_conf->osd_fast_shutdown_timeout) added
