Bug #61416

open

memstore segmentation fault crash

Added by Lucian Petrut 11 months ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The Windows CI job currently uses memstore as the OSD object store. The goal is to put significant load on the client in a relatively short timeframe in order to uncover potential issues.
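
For context, memstore is normally selected through the OSD object store setting in ceph.conf. The fragment below is only a minimal sketch of such a setup: the size value is an assumption chosen for illustration, not the configuration the CI job actually uses.

```ini
[osd]
# Keep all object data in RAM instead of on disk (intended for testing only).
osd objectstore = memstore
# Assumed value for illustration: cap the in-memory store at 4 GiB.
memstore device bytes = 4294967296
```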

In roughly 15% of the job runs, one OSD crashes with a segmentation fault:

2023-05-24T21:06:02.557+0000 7f0dccc13640 -1 *** Caught signal (Segmentation fault) **
 in thread 7f0dccc13640 thread_name:tp_osd_tp

 ceph version c0ab92 (cc0ab92e460cf464849b1a30190e7886d60732ea) reef (dev)
 1: /lib/x86_64-linux-gnu/libc.so.6(+0x42520) [0x7f0de684c520]
 2: (ceph::buffer::v15_2_0::ptr::ptr(ceph::buffer::v15_2_0::ptr const&)+0x17) [0x5583d3401dc7]
 3: (ceph::buffer::v15_2_0::ptr_node::cloner::operator()(ceph::buffer::v15_2_0::ptr_node const&)+0x2d) [0x5583d34040ad]
 4: (MemStore::OmapIteratorImpl::value()+0x83) [0x5583d2fa6d93]
 5: (OSDriver::get_next(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list>*)+0x53a) [0x5583d2d72cea]
 6: (SnapMapper::get_next_objects_to_trim(snapid_t, unsigned int, std::vector<hobject_t, std::allocator<hobject_t> >*)+0x68c) [0x5583d2d7ab4c]
 7: (PrimaryLogPG::AwaitAsyncWork::react(PrimaryLogPG::DoSnapWork const&)+0x201) [0x5583d2c57881]
 8: (boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, (boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base const&, void const*)+0x65) [0x5583d2cdc315]
 9: (boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, PrimaryLogPG::NotTrimming, std::allocator<boost::statechart::none>, boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base const&)+0x5a) [0x5583d2cacafa]
 10: (PrimaryLogPG::snap_trimmer(unsigned int)+0xe3) [0x5583d2c0d643]
 11: (ceph::osd::scheduler::PGSnapTrim::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x1b) [0x5583d2deeefb]
 12: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x618) [0x5583d2ae46b8]
 13: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x40b) [0x5583d31cb2bb]
 14: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x5583d31ce494]
 15: (Thread::entry_wrapper()+0x54) [0x5583d31b8104]
 16: /lib/x86_64-linux-gnu/libc.so.6(+0x94b43) [0x7f0de689eb43]
 17: /lib/x86_64-linux-gnu/libc.so.6(+0x126a00) [0x7f0de6930a00]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

This seems to be a memstore bug. Note that the OSDs are not running on Windows; only the client bits have been ported to Windows.
