Bug #59074

closed

OSD restarted with this error: Caught signal (Segmentation fault) in thread thread_name:bstore_kv_final

Added by Mohamed Khalil BADRI about 1 year ago. Updated about 1 year ago.

Status:
Duplicate
Priority:
Normal
Assignee:
-
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hello,

We recently had an OSD restart because of the following error:
  *** Caught signal (Segmentation fault) **
    in thread 7f9c0cacf700 thread_name:bstore_kv_final
    ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)
    1: /lib64/libpthread.so.0(+0x12ce0) [0x7f9c200f0ce0]
    2: (ceph::buffer::v15_2_0::ptr::release()+0x13) [0x56238773a0a3]
    3: (BlueStore::Onode::put()+0x1b9) [0x5623873c9319]
    4: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x31) [0x56238747e4f1]
    5: (BlueStore::TransContext::~TransContext()+0x12f) [0x56238747e81f]
    6: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x23e) [0x5623874296be]
    7: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x257) [0x562387435da7]
    8: (BlueStore::_kv_finalize_thread()+0x54e) [0x56238744fdce]
    9: (BlueStore::KVFinalizeThread::entry()+0x11) [0x562387483d91]
    10: /lib64/libpthread.so.0(+0x81ca) [0x7f9c200e61ca]
    11: clone()
    debug 2023-03-14T04:27:13.079+0000 7f9c0cacf700 -1 *** Caught signal (Segmentation fault) **
    in thread 7f9c0cacf700 thread_name:bstore_kv_final

[root@rook-ceph-tools-848c9f67fd-tbjwz /]# ceph crash info 2023-03-14T04:27:13.080031Z_e5a2c325-b84e-45e3-b4a4-6ccd017fea3d
{
"backtrace": [
"/lib64/libpthread.so.0(+0x12ce0) [0x7f9c200f0ce0]",
"(ceph::buffer::v15_2_0::ptr::release()+0x13) [0x56238773a0a3]",
"(BlueStore::Onode::put()+0x1b9) [0x5623873c9319]",
"(std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x31) [0x56238747e4f1]",
"(BlueStore::TransContext::~TransContext()+0x12f) [0x56238747e81f]",
"(BlueStore::_txc_finish(BlueStore::TransContext*)+0x23e) [0x5623874296be]",
"(BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x257) [0x562387435da7]",
"(BlueStore::_kv_finalize_thread()+0x54e) [0x56238744fdce]",
"(BlueStore::KVFinalizeThread::entry()+0x11) [0x562387483d91]",
"/lib64/libpthread.so.0(+0x81ca) [0x7f9c200e61ca]",
"clone()"
],
"ceph_version": "16.2.9",
"crash_id": "2023-03-14T04:27:13.080031Z_e5a2c325-b84e-45e3-b4a4-6ccd017fea3d",
"entity_name": "osd.12",
"os_id": "centos",
"os_name": "CentOS Stream",
"os_version": "8",
"os_version_id": "8",
"process_name": "ceph-osd",
"stack_sig": "08c6b1d32bf3f1e2ff552655037958336e58ab5a1dece7b4583d239dfa1a1e72",
"timestamp": "2023-03-14T04:27:13.080031Z",
"utsname_hostname": "rook-ceph-osd-12-77f4db4c69-6pqgk",
"utsname_machine": "x86_64",
"utsname_release": "5.4.204-113.362.amzn2.x86_64",
"utsname_sysname": "Linux",
"utsname_version": "#1 SMP Wed Jul 13 21:34:30 UTC 2022"
}
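
For anyone triaging similar reports, here is a minimal Python sketch of how the `ceph crash info` output above could be checked against this trace's shape. The helper names (`summarize_crash`, `matches_onode_put_pattern`) are hypothetical, and only JSON keys that appear in the report are used. The check is a rough heuristic for "`BlueStore::Onode::put()` called from inside `~TransContext()`"; it is not a definitive way to identify the underlying bug.

```python
import json

# Abbreviated copy of the crash report above: only a few keys and frames are kept.
SAMPLE = """
{
  "backtrace": [
    "/lib64/libpthread.so.0(+0x12ce0) [0x7f9c200f0ce0]",
    "(ceph::buffer::v15_2_0::ptr::release()+0x13) [0x56238773a0a3]",
    "(BlueStore::Onode::put()+0x1b9) [0x5623873c9319]",
    "(BlueStore::TransContext::~TransContext()+0x12f) [0x56238747e81f]",
    "(BlueStore::_kv_finalize_thread()+0x54e) [0x56238744fdce]"
  ],
  "ceph_version": "16.2.9",
  "entity_name": "osd.12",
  "stack_sig": "08c6b1d32bf3f1e2ff552655037958336e58ab5a1dece7b4583d239dfa1a1e72"
}
"""


def summarize_crash(raw):
    """Pick out the fields that matter for triage (hypothetical helper)."""
    info = json.loads(raw)
    return {
        "entity": info.get("entity_name"),
        "version": info.get("ceph_version"),
        "stack_sig": info.get("stack_sig"),
        "frames": info.get("backtrace", []),
    }


def _first_index(frames, needle):
    """Index of the first frame containing `needle`, or None if absent."""
    for i, frame in enumerate(frames):
        if needle in frame:
            return i
    return None


def matches_onode_put_pattern(frames):
    """Heuristic: Onode::put() sits deeper in the stack than ~TransContext().

    Frames are listed innermost first, so a smaller index means a deeper call.
    """
    put_idx = _first_index(frames, "BlueStore::Onode::put()")
    dtor_idx = _first_index(frames, "BlueStore::TransContext::~TransContext()")
    return put_idx is not None and dtor_idx is not None and put_idx < dtor_idx


summary = summarize_crash(SAMPLE)
print(summary["entity"], summary["version"],
      matches_onode_put_pattern(summary["frames"]))
```

In practice the raw string would come from saving the command output to a file rather than from an inline sample.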

This is not the first time an OSD has restarted with this error.
Any idea what causes it?

Thanks in advance.


Related issues 1 (0 open, 1 closed)

Is duplicate of bluestore - Bug #56382: ONode ref counting is broken (Resolved, Igor Fedotov)

Actions #1

Updated by Igor Fedotov about 1 year ago

  • Tracker changed from Support to Bug
  • Status changed from New to Duplicate
  • Regression set to No
  • Severity set to 3 - minor

This is a duplicate of https://tracker.ceph.com/issues/56382, which is already fixed in the main branch. The backport for Pacific is pending review (see https://github.com/ceph/ceph/pull/50072).

Actions #2

Updated by Igor Fedotov about 1 year ago

  • Is duplicate of Bug #56382: ONode ref counting is broken added
Actions #3

Updated by Ilya Dryomov about 1 year ago

  • Target version changed from v16.2.12 to v16.2.13
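
A small sketch for checking whether a running OSD predates a given release. It assumes, which this ticket does not state outright, that the Pacific backport ships in the target version v16.2.13; the function names are hypothetical. Versions are compared as integer tuples because a plain string comparison would rank "16.2.9" above "16.2.13".

```python
def parse_version(text):
    """Turn '16.2.9' or 'v16.2.13' into a tuple of ints, e.g. (16, 2, 9)."""
    return tuple(int(part) for part in text.lstrip("v").split("."))


def is_older_than(running, target):
    """True if `running` is a strictly earlier release than `target`."""
    return parse_version(running) < parse_version(target)


# "16.2.9" is the ceph_version from the crash report;
# "v16.2.13" is the target version on this ticket (assumed to carry the fix).
print(is_older_than("16.2.9", "v16.2.13"))
```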