Bug #62316

open

ceph-osd crash

Added by guangxing hu 9 months ago. Updated 9 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Actions #1

Updated by guangxing hu 9 months ago

ceph-osd crash in ceph_assert(read_len < std::numeric_limits<int>::max())

Actions #2

Updated by guangxing hu 9 months ago

{
    "crash_id": "2023-07-19_23:43:38.778427Z_a556b0c2-b357-4902-9296-a9de60dcbd85",
    "timestamp": "2023-07-19 23:43:38.778427Z",
    "process_name": "ceph-osd",
    "entity_name": "osd.60",
    "ceph_version": "14.2.8-51",
    "utsname_hostname": "HOST1",
    "utsname_sysname": "Linux",
    "utsname_release": "4.19.25-206.el7_6.bclinux.x86_64",
    "utsname_version": "#1 SMP Thu Sep 9 16:37:38 CST 2021",
    "utsname_machine": "x86_64",
    "os_name": "BigCloud Enterprise Linux For LDK",
    "os_id": "bclinux",
    "os_version_id": "7",
    "os_version": "7 (Core)",
    "assert_condition": "read_len < std::numeric_limits<int>::max()",
    "assert_func": "Ct<ProtocolV1>* ProtocolV1::handle_message_data(char*, int)",
    "assert_file": "/var/lib/jenkins/workspace/rpmbuild/BUILD/ceph-14.2.8/src/msg/async/ProtocolV1.cc",
    "assert_line": 865,
    "assert_thread_name": "msgr-worker-1",
    "assert_msg": "/var/lib/jenkins/workspace/rpmbuild/BUILD/ceph-14.2.8/src/msg/async/ProtocolV1.cc: In function 'Ct<ProtocolV1>* ProtocolV1::handle_message_data(char*, int)' thread 7f4d61865700 time 2023-07-20 07:43:38.772797\n/var/lib/jenkins/workspace/rpmbuild/BUILD/ceph-14.2.8/src/msg/async/ProtocolV1.cc: 865: FAILED ceph_assert(read_len < std::numeric_limits<int>::max())\n",
    "backtrace": [
        "(()+0xf5d0) [0x7f4d657925d0]",
        "(gsignal()+0x37) [0x7f4d64587207]",
        "(abort()+0x148) [0x7f4d645888f8]",
        "(ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x199) [0x5636288fcabe]",
        "(()+0x4cac37) [0x5636288fcc37]",
        "(ProtocolV1::handle_message_data(char*, int)+0x28b) [0x5636292e0e8b]",
        "(()+0xea891d) [0x5636292da91d]",
        "(AsyncConnection::process()+0x186) [0x5636292d7756]",
        "(EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0xa15) [0x563629125365]",
        "(()+0xcf7ac7) [0x563629129ac7]",
        "(()+0x11cb83f) [0x5636295fd83f]",
        "(()+0x7dd5) [0x7f4d6578add5]",
        "(clone()+0x6d) [0x7f4d6464eead]"
    ]
}

Actions #3

Updated by guangxing hu 9 months ago

Before the crash, I found a tcmalloc large alloc message in the log:

Jul 20 01:53:44 HOST1 ceph-osd: tcmalloc: large alloc 2414796800 bytes == 0x55ee56114000 @ 0x7f8f9683b4ef 0x7f8f9685c010 0x55ecd892919b 0x55ecd89293c2 0x55ecd8929d05 0x55ecd8acdeb0 0x55ecd8ace028 0x55ecd8ace807 0x55ecd8ac891d 0x55ecd8ac5756 0x55ecd8913365 0x55ecd8917ac7 0x55ecd8deb83f 0x7f8f9449fdd5 0x7f8f93363ead

Actions #4

Updated by guangxing hu 9 months ago

About 4 seconds after the tcmalloc large alloc message, the OSD crashed in ceph_assert(read_len < std::numeric_limits<int>::max())
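
For reference, the size in the tcmalloc line (2414796800 bytes) is larger than std::numeric_limits<int>::max(), which is 2147483647 where int is 32 bits, as on the x86_64 host in the crash report. The sketch below only checks that arithmetic against the asserted condition. passes_read_len_check is a hypothetical helper, not Ceph code, and whether the large allocation is the same buffer whose length reaches the assert is an assumption that this report does not confirm.

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper (not part of Ceph): mirrors the asserted condition
// read_len < std::numeric_limits<int>::max() using a 64-bit length so that
// values above the int range can be represented.
constexpr bool passes_read_len_check(std::uint64_t read_len) {
    return read_len <
           static_cast<std::uint64_t>(std::numeric_limits<int>::max());
}

// Size taken from the tcmalloc "large alloc" log line in this report.
constexpr std::uint64_t kLargeAllocBytes = 2414796800ULL;

// Assumes a 32-bit int, as on x86_64 Linux.
static_assert(std::numeric_limits<int>::max() == 2147483647,
              "this sketch assumes a 32-bit int");

// A length equal to the logged allocation size would fail the condition.
static_assert(!passes_read_len_check(kLargeAllocBytes),
              "2414796800 is not below the int maximum");
```

If a message data length of that size did reach ProtocolV1::handle_message_data, the condition would be false and the assert would fire, which would be consistent with the crash above.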
