Bug #51887

open

crash: int Infiniband::MemoryManager::Cluster::fill(uint32_t): assert(m)

Added by Telemetry Bot over 2 years ago. Updated about 2 years ago.

Status:
Triaged
Priority:
Low
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Telemetry
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):

33fdcea66d9494023cab7cacb0df26c3001b94d44d5c95bab5239fb11d66b2c3
4f70e5a6707159820b2b8da3fcca4d263f45a3f9eb1bdef54bb1545558ee1cb3
766be26bfe831e0682d2048a8dc960defeda12d09e3023e77c277e519784b73b
a594420536d5797db898deac34687a997b2ef5e8914abdb809f37169e5214cb7
aeeb799c9ee05a227eb75f0cb7663cda11a7b7f979e8b6bd736fd7c967d3fd0c
e9a587500cd0d3a0ca3003144680586ba4656cdec011bfd2a91e3f3334bfa213
f318bb7dd5ed05f869f475d1e04f0f440e8297d0dbc2b4f00d44d37b5b941c71
f3fb431dbb2a3e2f5deb2623385e195de66d7c87b1ffa8174b69244c6e937285
f514f2bcfff5e613764a936508a85d6b0f61441266b5bc22f710f2a95eabe04d
f665a1ed57f0db8ce90b2de73edc1fcabb16d36f4af7dbb424d1f9ebacfcd2d0


Description

http://telemetry.front.sepia.ceph.com:4000/d/jByk5HaMz/crash-spec-x-ray?orgId=1&var-sig_v2=f3fb431dbb2a3e2f5deb2623385e195de66d7c87b1ffa8174b69244c6e937285

Assert condition: m
Assert function: int Infiniband::MemoryManager::Cluster::fill(uint32_t)

Sanitized backtrace:

    Infiniband::MemoryManager::Cluster::fill(unsigned int)
    Infiniband::init()
    RDMAWorker::listen(entity_addr_t&, unsigned int, SocketOptions const&, ServerSocket*)
    EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)
    clone()

Crash dump sample:
{
    "archived": "2021-07-19 00:58:58.362933",
    "assert_condition": "m",
    "assert_file": "msg/async/rdma/Infiniband.cc",
    "assert_func": "int Infiniband::MemoryManager::Cluster::fill(uint32_t)",
    "assert_line": 783,
    "assert_msg": "msg/async/rdma/Infiniband.cc: In function 'int Infiniband::MemoryManager::Cluster::fill(uint32_t)' thread 7f2a1344e700 time 2021-07-18T16:39:19.997331-0700\nmsg/async/rdma/Infiniband.cc: 783: FAILED ceph_assert(m)",
    "assert_thread_name": "msgr-worker-0",
    "backtrace": [
        "(()+0x14140) [0x7f2a174bc140]",
        "(gsignal()+0x141) [0x7f2a16f87ce1]",
        "(abort()+0x123) [0x7f2a16f71537]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x17b) [0x5635e31510b7]",
        "(()+0x9d01f8) [0x5635e31511f8]",
        "(Infiniband::MemoryManager::Cluster::fill(unsigned int)+0x20b) [0x5635e3be951b]",
        "(Infiniband::init()+0x268) [0x5635e3bed918]",
        "(RDMAWorker::listen(entity_addr_t&, unsigned int, SocketOptions const&, ServerSocket*)+0x2c) [0x5635e39eedcc]",
        "(()+0x12521de) [0x5635e39d31de]",
        "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x718) [0x5635e39e33a8]",
        "(()+0x1267c4a) [0x5635e39e8c4a]",
        "(()+0xceed0) [0x7f2a1733fed0]",
        "(()+0x8ea7) [0x7f2a174b0ea7]",
        "(clone()+0x3f) [0x7f2a17049def]" 
    ],
    "ceph_version": "15.2.13",
    "crash_id": "2021-07-18T23:39:20.007043Z_ecfc3d02-34ca-40d9-9650-a1993ec4b695",
    "entity_name": "osd.9c4c80998239a4192a95e12dab71b0c04c901cf1",
    "os_id": "11",
    "os_name": "Debian GNU/Linux 11 (bullseye)",
    "os_version": "11 (bullseye)",
    "os_version_id": "11",
    "process_name": "ceph-osd",
    "stack_sig": "33fdcea66d9494023cab7cacb0df26c3001b94d44d5c95bab5239fb11d66b2c3",
    "timestamp": "2021-07-18T23:39:20.007043Z",
    "utsname_machine": "x86_64",
    "utsname_release": "5.11.22-2-pve",
    "utsname_sysname": "Linux",
    "utsname_version": "#1 SMP PVE 5.11.22-3 (Sun, 11 Jul 2021 13:45:15 +0200)" 
}
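
The relationship between the raw "backtrace" array and the sanitized backtrace listed earlier can be illustrated with a short sketch. This is not the telemetry service's actual sanitizer; the function name `sanitize`, the `NOISE_PREFIXES` list, and the filtering rules are assumptions chosen only to reproduce the frames shown in this report: strip offsets and addresses, drop unresolved `()` frames, and drop the signal/abort/assert machinery.

```python
import re

# Matches one raw frame of the form "(symbol+0xOFFSET) [0xADDRESS]".
FRAME_RE = re.compile(r"^\((?P<sym>.*)\+0x[0-9a-f]+\) \[0x[0-9a-f]+\]$")

# Hypothetical list of frames that belong to the abort/assert path
# rather than to the code that failed (an assumption for this sketch).
NOISE_PREFIXES = ("gsignal", "abort", "ceph::__ceph_assert_fail")


def sanitize(frames):
    """Return symbol names only, skipping unresolved and noise frames."""
    symbols = []
    for frame in frames:
        match = FRAME_RE.match(frame.strip())
        if match is None:
            continue  # not in the expected "(sym+0x..) [0x..]" shape
        sym = match.group("sym")
        if sym == "()":
            continue  # unresolved frame: no symbol name available
        if sym.startswith(NOISE_PREFIXES):
            continue  # signal/abort/assert machinery
        symbols.append(sym)
    return symbols
```

Applied to the "backtrace" array in the crash dump sample, this sketch yields the same five frames as the sanitized backtrace, from `Infiniband::MemoryManager::Cluster::fill(unsigned int)` down to `clone()`.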

Actions #1

Updated by Telemetry Bot over 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)
  • Affected Versions v14.2.9, v15.2.13, v15.2.3, v15.2.4, v15.2.5, v15.2.6, v15.2.7, v15.2.9 added
Actions #2

Updated by Neha Ojha over 2 years ago

  • Status changed from New to Triaged
  • Priority changed from Normal to Low
  • Crash signature (v1) updated (diff)

This crash has occurred multiple times in 3 clusters. Lowering the priority because Infiniband is low priority.

Actions #3

Updated by Telemetry Bot about 2 years ago

  • Crash signature (v1) updated (diff)
  • Crash signature (v2) updated (diff)