Bug #53002
crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext
Status: Closed
Description
We've just seen this crash in the wild running 15.2.14. Maybe a dup of #50788?
   -14> 2021-10-21T09:42:31.079+0200 7f88e1b2c700  5 prioritycache tune_memory target: 3221225472 mapped: 3201368064 unmapped: 466845696 heap: 3668213760 old mem: 1932735267 new mem: 1932735267
   -13> 2021-10-21T09:42:31.924+0200 7f88dde53700 10 monclient: tick
   -12> 2021-10-21T09:42:31.924+0200 7f88dde53700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2021-10-21T09:42:01.925680+0200)
   -11> 2021-10-21T09:42:32.062+0200 7f88dd323700  5 bluestore(/var/lib/ceph/osd/ceph-138) _kv_sync_thread utilization: idle 9.861052482s of 10.001157459s, submitted: 477
   -10> 2021-10-21T09:42:32.080+0200 7f88e1b2c700  5 prioritycache tune_memory target: 3221225472 mapped: 3201417216 unmapped: 466796544 heap: 3668213760 old mem: 1932735267 new mem: 1932735267
    -9> 2021-10-21T09:42:32.080+0200 7f88e1b2c700  5 bluestore.MempoolThread(0x55a9f3e04a08) _resize_shards cache_size: 1932735267 kv_alloc: 889192448 kv_used: 586783984 meta_alloc: 813694976 meta_used: 511074366 data_alloc: 218103808 data_used: 0
    -8> 2021-10-21T09:42:32.115+0200 7f88cd509700  0 <cls> /builddir/build/BUILD/ceph-15.2.14/src/cls/lock/cls_lock.cc:290: Could not read list of current lockers off disk: (2) No such file or directory
    -7> 2021-10-21T09:42:32.925+0200 7f88dde53700 10 monclient: tick
    -6> 2021-10-21T09:42:32.925+0200 7f88dde53700 10 monclient: _check_auth_rotating have uptodate secrets (they expire after 2021-10-21T09:42:02.925792+0200)
    -5> 2021-10-21T09:42:33.082+0200 7f88e1b2c700  5 prioritycache tune_memory target: 3221225472 mapped: 3201490944 unmapped: 466722816 heap: 3668213760 old mem: 1932735267 new mem: 1932735267
    -4> 2021-10-21T09:42:33.111+0200 7f88c9501700  0 <cls> /builddir/build/BUILD/ceph-15.2.14/src/cls/lock/cls_lock.cc:290: Could not read list of current lockers off disk: (2) No such file or directory
    -3> 2021-10-21T09:42:33.206+0200 7f88c8d00700  5 osd.138 360301 heartbeat osd_stat(store_statfs(0xa9ee097000/0x193950000/0xdf90000000, data 0x3408e34bb0/0x340e617000, compress 0x0/0x0/0x0, omap 0x2dc5721e, meta 0x165cf8de2), peers [1,2,3,12,16,21,23,24,27,29,34,35,41,42,45,49,52,55,63,68,70,71,72,77,79,82,83,85,105,108,113,119,124,131,133,137,139,149,150,152,156,161,167,170,175,180,206,211,212,213,217,236,240,245,247,250,252,259,265,269,272,273,274,275,277,280,287] op hist [])
    -2> 2021-10-21T09:42:33.367+0200 7f88cd509700  0 <cls> /builddir/build/BUILD/ceph-15.2.14/src/cls/lock/cls_lock.cc:290: Could not read list of current lockers off disk: (2) No such file or directory
    -1> 2021-10-21T09:42:33.440+0200 7f88cc507700  0 <cls> /builddir/build/BUILD/ceph-15.2.14/src/cls/lock/cls_lock.cc:290: Could not read list of current lockers off disk: (2) No such file or directory
     0> 2021-10-21T09:42:33.457+0200 7f88e232d700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f88e232d700 thread_name:bstore_kv_final

 ceph version 15.2.14-7 (cd3bb7e87a2f62c1b862ff3fd8b1eec13391a5be) octopus (stable)
 1: (()+0xf630) [0x7f88f0f8f630]
 2: (BlueStore::Onode::put()+0x2eb) [0x55a9e87de1fb]
 3: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x2d) [0x55a9e888297d]
 4: (BlueStore::TransContext::~TransContext()+0x107) [0x55a9e8882aa7]
 5: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x231) [0x55a9e8854041]
 6: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x1fc) [0x55a9e8854b7c]
 7: (BlueStore::_kv_finalize_thread()+0x552) [0x55a9e8857a52]
 8: (BlueStore::KVFinalizeThread::entry()+0xd) [0x55a9e8887edd]
 9: (()+0x7ea5) [0x7f88f0f87ea5]
 10: (clone()+0x6d) [0x7f88efe4a9fd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
The fsck is clean:
# ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-138/
fsck success
We have the coredump and could check anything...
(gdb) bt
#0  0x00007f88f0f8f4fb in raise () from /lib64/libpthread.so.0
#1  0x000055a9e89501b2 in reraise_fatal (signum=11) at /usr/src/debug/ceph-15.2.14/src/global/signal_handler.cc:326
#2  handle_fatal_signal(int) () at /usr/src/debug/ceph-15.2.14/src/global/signal_handler.cc:326
#3  <signal handler called>
#4  0x000055a9e87de1fb in lock (this=<optimized out>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/mutex:110
#5  BlueStore::Onode::put (this=0x55aa7ea2b440) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.cc:3588
#6  0x000055a9e888297d in intrusive_ptr_release (o=<optimized out>) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.h:3370
#7  ~intrusive_ptr (this=0x55aa49c74c20, __in_chrg=<optimized out>) at /usr/src/debug/ceph-15.2.14/build/boost/include/boost/smart_ptr/intrusive_ptr.hpp:98
#8  destroy<boost::intrusive_ptr<BlueStore::Onode> > (this=0x55aa8e89d578, __p=0x55aa49c74c20) at /opt/rh/devtoolset-8/root/usr/include/c++/8/ext/new_allocator.h:140
#9  destroy<boost::intrusive_ptr<BlueStore::Onode> > (__a=..., __p=0x55aa49c74c20) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/alloc_traits.h:487
#10 _M_destroy_node (this=0x55aa8e89d578, __p=0x55aa49c74c00) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h:661
#11 _M_drop_node (this=0x55aa8e89d578, __p=0x55aa49c74c00) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h:669
#12 std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase (this=this@entry=0x55aa8e89d578, __x=0x55aa49c74c00) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h:1874
#13 0x000055a9e8882aa7 in ~_Rb_tree (this=0x55aa8e89d578, __in_chrg=<optimized out>) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.h:1595
#14 ~set (this=0x55aa8e89d578, __in_chrg=<optimized out>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_set.h:281
#15 BlueStore::TransContext::~TransContext (this=0x55aa8e89d500, __in_chrg=<optimized out>) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.h:1594
#16 0x000055a9e8854041 in ~TransContext (this=0x55aa8e89d500, __in_chrg=<optimized out>) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.cc:11993
#17 BlueStore::_txc_finish(BlueStore::TransContext*) () at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.cc:11993
#18 0x000055a9e8854b7c in BlueStore::_txc_state_proc(BlueStore::TransContext*) () at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.cc:11709
#19 0x000055a9e8857a52 in BlueStore::_kv_finalize_thread() () at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.cc:12556
#20 0x000055a9e8887edd in BlueStore::KVFinalizeThread::entry (this=<optimized out>) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.h:1912
#21 0x00007f88f0f87ea5 in start_thread () from /lib64/libpthread.so.0
#22 0x00007f88efe4a9fd in clone () from /lib64/libc.so.6
(gdb)
(gdb) up
#1  0x000055a9e89501b2 in reraise_fatal (signum=11) at /usr/src/debug/ceph-15.2.14/src/global/signal_handler.cc:326
326             reraise_fatal(signum);
(gdb) up
#2  handle_fatal_signal(int) () at /usr/src/debug/ceph-15.2.14/src/global/signal_handler.cc:326
326             reraise_fatal(signum);
(gdb) up
#3  <signal handler called>
(gdb) up
#4  0x000055a9e87de1fb in lock (this=<optimized out>) at /opt/rh/devtoolset-8/root/usr/include/c++/8/mutex:110
110     /opt/rh/devtoolset-8/root/usr/include/c++/8/mutex: No such file or directory.
(gdb) up
#5  BlueStore::Onode::put (this=0x55aa7ea2b440) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.cc:3588
3588        ocs->lock.lock();
(gdb) list
3583        ocs->lock.lock();
3584        // It is possible that during waiting split_cache moved us to different OnodeCacheShard.
3585        while (ocs != c->get_onode_cache()) {
3586          ocs->lock.unlock();
3587          ocs = c->get_onode_cache();
3588          ocs->lock.lock();
3589        }
3590        bool need_unpin = pinned;
3591        pinned = pinned && nref > 2; // intentionally use > not >= as we have
3592                                     // +1 due to pinned state
(gdb) up
#6  0x000055a9e888297d in intrusive_ptr_release (o=<optimized out>) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.h:3370
3370      o->put();
(gdb) list
3365
3366    static inline void intrusive_ptr_add_ref(BlueStore::Onode *o) {
3367      o->get();
3368    }
3369    static inline void intrusive_ptr_release(BlueStore::Onode *o) {
3370      o->put();
3371    }
3372
3373    static inline void intrusive_ptr_add_ref(BlueStore::OpSequencer *o) {
3374      o->get();
(gdb) up
#7  ~intrusive_ptr (this=0x55aa49c74c20, __in_chrg=<optimized out>) at /usr/src/debug/ceph-15.2.14/build/boost/include/boost/smart_ptr/intrusive_ptr.hpp:98
98          if( px != 0 ) intrusive_ptr_release( px );
(gdb) list
93          if( px != 0 ) intrusive_ptr_add_ref( px );
94      }
95
96      ~intrusive_ptr()
97      {
98          if( px != 0 ) intrusive_ptr_release( px );
99      }
100
101     #if !defined(BOOST_NO_MEMBER_TEMPLATES) || defined(BOOST_MSVC6_MEMBER_TEMPLATES)
102
(gdb) up
#8  destroy<boost::intrusive_ptr<BlueStore::Onode> > (this=0x55aa8e89d578, __p=0x55aa49c74c20) at /opt/rh/devtoolset-8/root/usr/include/c++/8/ext/new_allocator.h:140
140     /opt/rh/devtoolset-8/root/usr/include/c++/8/ext/new_allocator.h: No such file or directory.
(gdb) up
#9  destroy<boost::intrusive_ptr<BlueStore::Onode> > (__a=..., __p=0x55aa49c74c20) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/alloc_traits.h:487
487     /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/alloc_traits.h: No such file or directory.
(gdb) up
#10 _M_destroy_node (this=0x55aa8e89d578, __p=0x55aa49c74c00) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h:661
661     /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h: No such file or directory.
(gdb) up
#11 _M_drop_node (this=0x55aa8e89d578, __p=0x55aa49c74c00) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h:669
669     in /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h
(gdb) up
#12 std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase (this=this@entry=0x55aa8e89d578, __x=0x55aa49c74c00) at /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h:1874
1874    in /opt/rh/devtoolset-8/root/usr/include/c++/8/bits/stl_tree.h
(gdb) up
#13 0x000055a9e8882aa7 in ~_Rb_tree (this=0x55aa8e89d578, __in_chrg=<optimized out>) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.h:1595
1595        delete deferred_txn;
(gdb) list
1590        if (on_commits) {
1591          oncommits.swap(*on_commits);
1592        }
1593      }
1594      ~TransContext() {
1595        delete deferred_txn;
1596      }
1597
1598      void write_onode(OnodeRef &o) {
1599        onodes.insert(o);
(gdb)
Updated by Igor Fedotov over 2 years ago
- Related to Bug #50788: crash in BlueStore::Onode::put() added
Updated by Igor Fedotov over 2 years ago
Dan van der Ster wrote:
We've just seen this crash in the wild running 15.2.14. Maybe a dup of #50788?
I'm pretty sure it is...
Aren't there any indications of a recent PG split?
Updated by Dan van der Ster over 2 years ago
Igor Fedotov wrote:
Dan van der Ster wrote:
We've just seen this crash in the wild running 15.2.14. Maybe a dup of #50788?
I'm pretty sure it is...
Aren't there any indications of a recent PG split?
Not recently AFAIK... we have nopgchange set on all the pools.
Updated by Dan van der Ster over 2 years ago
More context: the cluster was upgraded from 14.2.20 to 15.2.14 two weeks ago. We've never seen this before today; it happened only once on only this OSD so far.
Updated by Dan van der Ster over 2 years ago
In frame 7 I can print the Onode. Some of the values look quite strange (but I don't know if that's normal):
(gdb) f
#7  ~intrusive_ptr (this=0x55aa49c74c20, __in_chrg=<optimized out>) at /usr/src/debug/ceph-15.2.14/build/boost/include/boost/smart_ptr/intrusive_ptr.hpp:98
98          if( px != 0 ) intrusive_ptr_release( px );
(gdb) list
93          if( px != 0 ) intrusive_ptr_add_ref( px );
94      }
95
96      ~intrusive_ptr()
97      {
98          if( px != 0 ) intrusive_ptr_release( px );
99      }
100
101     #if !defined(BOOST_NO_MEMBER_TEMPLATES) || defined(BOOST_MSVC6_MEMBER_TEMPLATES)
102
(gdb) p px
$11 = (BlueStore::Onode *) 0x55aa7ea2b440
(gdb) p *px
$12 = {nref = {<std::__atomic_base<int>> = {static _S_alignment = 4, _M_i = 1024138560}, static is_always_lock_free = true}, c = 0x200, oid = {hobj = {static POOL_META = -1, static POOL_TEMP_START = -2, oid = {name = <error reading variable: Cannot access memory at address 0x55aaffffffe7>}, snap = {val = 8295752894954156584}, hash = 543712117, max = 102, nibblewise_key_cache = 544370464, hash_reverse_bits = 1701996900, pool = 521610949731, nspace = "cta-cristina", key = ""}, generation = 18446744073709551615, shard_id = {id = -1 '\377', static NO_SHARD = {id = -1 '\377', static NO_SHARD = <same as static member of an already seen type>}}, max = false, static NO_GEN = 18446744073709551615}, key = "", lru_item = {<boost::intrusive::generic_hook<(boost::intrusive::algo_types)0, boost::intrusive::list_node_traits<void*>, boost::intrusive::member_tag, (boost::intrusive::link_mode_type)1, (boost::intrusive::base_hook_type)0>> = {<boost::intrusive::list_node<void*>> = {next_ = 0x0, prev_ = 0x0}, <boost::intrusive::hook_tags_definer<boost::intrusive::generic_hook<(boost::intrusive::algo_types)0, boost::intrusive::list_node_traits<void*>, boost::intrusive::member_tag, (boost::intrusive::link_mode_type)1, (boost::intrusive::base_hook_type)0>, 0>> = {<No data fields>}, <No data fields>}, <No data fields>}, onode = {nid = 0, size = 0, attrs = std::map with 0 elements, extent_map_shards = std::vector of length 0, capacity 0, expected_object_size = 0, expected_write_size = 0,
alloc_hint_flags = 0, flags = 0 '\000'}, exists = false, cached = false, pinned = {_M_base = {static _S_alignment = 1, _M_i = false}, static is_always_lock_free = true}, extent_map = {onode = 0x55aa7ea2b440, extent_map = {<boost::intrusive::set_impl<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3>, void, void, unsigned long, true, void>> = {<boost::intrusive::bstree_impl<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3>, void, void, unsigned long, true, (boost::intrusive::algo_types)5, void>> = {<boost::intrusive::bstbase<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3>, void, void, true, unsigned long, (boost::intrusive::algo_types)5, void>> = {<boost::intrusive::bstbase_hack<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3>, void, void, true, unsigned long, (boost::intrusive::algo_types)5, void>> = {<boost::intrusive::detail::size_holder<true, unsigned long, void>> = { static constant_time_size = <optimized out>, size_ = 0}, <boost::intrusive::bstbase2<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3>, void, void, (boost::intrusive::algo_types)5, void>> = {<boost::intrusive::detail::ebo_functor_holder<boost::intrusive::tree_value_compare<BlueStore::Extent*, std::less<BlueStore::Extent>, boost::move_detail::identity<BlueStore::Extent>, bool, true>, void, false>> = {<boost::intrusive::tree_value_compare<BlueStore::Extent*, std::less<BlueStore::Extent>, boost::move_detail::identity<BlueStore::Extent>, 
bool, true>> = {<boost::intrusive::detail::ebo_functor_holder<std::less<BlueStore::Extent>, void, false>> = {<std::less<BlueStore::Extent>> = {<std::binary_function<BlueStore::Extent, BlueStore::Extent, bool>> = {<No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <No data fields>}, <boost::intrusive::bstbase3<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3>, (boost::intrusive::algo_types)5, void>> = {static safemode_or_autounlink = <optimized out>, static stateful_value_traits = <optimized out>, static has_container_from_iterator = <optimized out>, holder = {<boost::intrusive::bhtraits<BlueStore::Extent, boost::intrusive::rbtree_node_traits<void*, true>, (boost::intrusive::link_mode_type)1, boost::intrusive::dft_tag, 3>> = {<boost::intrusive::bhtraits_base<BlueStore::Extent, boost::intrusive::compact_rbtree_node<void*>*, boost::intrusive::dft_tag, 3>> = {<No data fields>}, static link_mode = boost::intrusive::safe_link}, root = {<boost::intrusive::compact_rbtree_node<void*>> = {parent_ = 0x0, left_ = 0x55aa7ea2b540, right_ = 0x55aa7ea2b540}, <No data fields>}}}, <No data fields>}, <No data fields>}, <No data fields>}, static constant_time_size = true, static stateful_value_traits = <optimized out>, static safemode_or_autounlink = true}, static constant_time_size = true}, <No data fields>}, spanning_blob_map = std::map with 0 elements, shards = std::vector of length 0, capacity 0, inline_bl = {_buffers = {_root = {next = 0x55aa7ea2b5c0}, _tail = 0x55aa7ea2b5c0}, _carriage = 0x55a9f17a8d90 <ceph::buffer::v15_2_0::list::always_empty_bptr>, _len = 0, _num = 0, static always_empty_bptr = {_raw = 0x0, _off = 0, _len = 0}}, needs_reshard_begin = 0, needs_reshard_end = 0}, flushing_count = {<std::__atomic_base<int>> = {static _S_alignment = 4, _M_i = 0}, static is_always_lock_free = true}, waiting_count = {<std::__atomic_base<int>> = {static _S_alignment = 4, _M_i = 0}, static is_always_lock_free = true}, flush_lock = {<std::__mutex_base> = {_M_mutex = {__data = {__lock = 0, __count = 0, __owner = 0, __nusers = 0, __kind = 0, __spins = 0, __elision = 0, __list = {__prev = 0x0, __next = 0x0}}, __size = '\000' <repeats 39 times>, __align = 0}}, <No data fields>}, flush_cond = {_M_cond = {__data = {__lock = 1, __futex = 0, __total_seq = 18446744073709551615, __wakeup_seq = 0, __woken_seq = 0, __mutex = 0x0, __nwaiters = 0, __broadcast_seq = 0}, __size = "\001\000\000\000\000\000\000\000\377\377\377\377\377\377\377\377", '\000' <repeats 31 times>, __align = 1}}}
(gdb)
E.g. down in frame 5, `c` has address 0x200?!
(gdb) f
#5  BlueStore::Onode::put (this=0x55aa7ea2b440) at /usr/src/debug/ceph-15.2.14/src/os/bluestore/BlueStore.cc:3588
3588        ocs->lock.lock();
(gdb) list
3583        ocs->lock.lock();
3584        // It is possible that during waiting split_cache moved us to different OnodeCacheShard.
3585        while (ocs != c->get_onode_cache()) {
3586          ocs->lock.unlock();
3587          ocs = c->get_onode_cache();
3588          ocs->lock.lock();
3589        }
3590        bool need_unpin = pinned;
3591        pinned = pinned && nref > 2; // intentionally use > not >= as we have
3592                                     // +1 due to pinned state
(gdb) p c
$16 = (BlueStore::Collection *) 0x200
(gdb) p *c
Cannot access memory at address 0x200
Updated by Igor Fedotov over 2 years ago
- Status changed from New to In Progress
- Pull request ID set to 43770
Updated by Igor Fedotov over 2 years ago
- Status changed from In Progress to Pending Backport
Updated by Igor Fedotov over 2 years ago
- Status changed from Pending Backport to Fix Under Review
Updated by Igor Fedotov over 2 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Backport Bot over 2 years ago
- Copied to Backport #53608: pacific: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext added
Updated by Backport Bot over 2 years ago
- Copied to Backport #53609: octopus: crash BlueStore::Onode::put from BlueStore::TransContext::~TransContext added
Updated by Igor Fedotov over 2 years ago
- Status changed from Pending Backport to Resolved
Updated by Igor Fedotov almost 2 years ago
- Is duplicate of Bug #56174: rook-ceph-osd crash randomly added
Updated by Igor Fedotov almost 2 years ago
- Is duplicate of Bug #54727: crash: __pthread_mutex_lock() added
Updated by Igor Fedotov almost 2 years ago
- Is duplicate of Bug #56200: crash: ceph::buffer::ptr::release() added
Updated by Igor Fedotov almost 2 years ago
- Is duplicate of Bug #54650: crash: BlueStore::Onode::put() added
Updated by Igor Fedotov almost 2 years ago
- Related to Bug #47740: OSD crash when increase pg_num added
Updated by Anonymous almost 2 years ago
According to https://lists.ceph.io/hyperkitty/list/ceph-users@ceph.io/message/PPWIFPEI3EVBU3GQYYO6ABGF23WR5SGZ/ this is not resolved yet. Could this be reopened, please?
Updated by Igor Fedotov almost 2 years ago
- Status changed from Resolved to New
Looks like this hasn't been completely fixed yet.
We've got a bunch of new tickets from the Telemetry bot which indicate the same or similar symptoms (Onode::put is primarily involved) on Ceph releases that already include PR #43770 (and its backports).
Some of the cases from the field I observed personally:
1) 15.2.16
Aug 05 23:34:51 ceph-osd2861: *** Caught signal (Segmentation fault) **
Aug 05 23:34:51 ceph-osd2861: in thread 7f08cf3a0700 thread_name:tp_osd_tp
Aug 05 23:34:51 ceph-osd2861: ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)
Aug 05 23:34:51 ceph-osd2861: 1: (()+0x12730) [0x7f08ec91e730]
Aug 05 23:34:51 ceph-osd2861: 2: (ceph::buffer::v15_2_0::ptr::release()+0x26) [0x5650f3904d26]
Aug 05 23:34:51 ceph-osd2861: 3: (BlueStore::Onode::put()+0x1a9) [0x5650f35b6a79]
Aug 05 23:34:51 ceph-osd2861: 4: (std::_Hashtable<ghobject_t, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x64) [0x5650f3662ca4]
Aug 05 23:34:51 ceph-osd2861: 5: (BlueStore::OnodeSpace::_remove(ghobject_t const&)+0x290) [0x5650f35b68a0]
Aug 05 23:34:51 ceph-osd2861: 6: (LruOnodeCacheShard::_trim_to(unsigned long)+0xdb) [0x5650f36631db]
Aug 05 23:34:51 ceph-osd2861: 7: (BlueStore::OnodeSpace::add(ghobject_t const&, boost::intrusive_ptr<BlueStore::Onode>&)+0x48d) [0x5650f35b74cd]
Aug 05 23:34:51 ceph-osd2861: 8: (BlueStore::Collection::get_onode(ghobject_t const&, bool, bool)+0x453) [0x5650f35fdac3]
Aug 05 23:34:51 ceph-osd2861: 9: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x1dc3) [0x5650f3633353]
Aug 05 23:34:51 ceph-osd2861: 10: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x408) [0x5650f3634778]
Aug 05 23:34:51 ceph-osd2861: 11: (non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x54) [0x5650f32e7c14]
Aug 05 23:34:51 ceph-osd2861: 12: (ReplicatedBackend::do_repop(boost::intrusive_ptr<OpRequest>)+0xdf4) [0x5650f347b804]
Aug 05 23:34:51 ceph-osd2861: 13: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x267) [0x5650f348ad57]
Aug 05 23:34:51 ceph-osd2861: 14: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x57) [0x5650f331d917]
Aug 05 23:34:51 ceph-osd2861: 15: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x62f) [0x5650f32c14df]
Aug 05 23:34:51 ceph-osd2861: 16: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x325) [0x5650f3159d35]
Aug 05 23:34:51 ceph-osd2861: 17: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x64) [0x5650f339dea4]
Aug 05 23:34:51 ceph-osd2861: 18: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x12fa) [0x5650f317678a]
Aug 05 23:34:51 ceph-osd2861: 19: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5b4) [0x5650f37801f4]
Aug 05 23:34:51 ceph-osd2861: 20: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5650f3782c70]
Aug 05 23:34:51 ceph-osd2861: 21: (()+0x7fa3) [0x7f08ec913fa3]
Aug 05 23:34:51 ceph-osd2861: 22: (clone()+0x3f) [0x7f08ec4beeff]
or
Aug 05 00:33:29 ceph-osd2863: *** Caught signal (Segmentation fault) **
Aug 05 00:33:29 ceph-osd2863: in thread 7f4613a22700 thread_name:bstore_kv_final
Aug 05 00:33:29 ceph-osd2863: ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)
Aug 05 00:33:29 ceph-osd2863: 1: (()+0x12730) [0x7f461ff7e730]
Aug 05 00:33:29 ceph-osd2863: 2: (BlueStore::Onode::put()+0x193) [0x564c15db8a63]
Aug 05 00:33:29 ceph-osd2863: 3: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x2d) [0x564c15e6460d]
Aug 05 00:33:29 ceph-osd2863: 4: (BlueStore::TransContext::~TransContext()+0x117) [0x564c15e64747]
Aug 05 00:33:29 ceph-osd2863: 5: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x24b) [0x564c15e0bb8b]
Aug 05 00:33:29 ceph-osd2863: 6: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x234) [0x564c15e23744]
Aug 05 00:33:29 ceph-osd2863: 7: (BlueStore::_kv_finalize_thread()+0x552) [0x564c15e2e3e2]
Aug 05 00:33:29 ceph-osd2863: 8: (BlueStore::KVFinalizeThread::entry()+0xd) [0x564c15e69b8d]
Aug 05 00:33:29 ceph-osd2863: 9: (()+0x7fa3) [0x7f461ff73fa3]
Aug 05 00:33:29 ceph-osd2863: 10: (clone()+0x3f) [0x7f461fb1eeff]
2) different cluster at 15.2.16
backtrace:
0: (()+0x12730) [0x7fe8875d1730]
1: (gsignal()+0x10b) [0x7fe8870b07bb]
2: (abort()+0x121) [0x7fe88709b535]
3: (()+0x2240f) [0x7fe88709b40f]
4: (()+0x30102) [0x7fe8870a9102]
5: (()+0xeb47ca) [0x55e2237177ca]
6: (BlueStore::Onode::put()+0x2b1) [0x55e22372ab81]
7: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x2d) [0x55e2237d660d]
8: (BlueStore::TransContext::~TransContext()+0x124) [0x55e2237d6754]
9: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x24b) [0x55e22377db8b]
10: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x234) [0x55e223795744]
11: (BlueStore::_kv_finalize_thread()+0x552) [0x55e2237a03e2]
12: (BlueStore::KVFinalizeThread::entry()+0xd) [0x55e2237dbb8d]
13: (()+0x7fa3) [0x7fe8875c6fa3]
14: (clone()+0x3f) [0x7fe887171eff]
3) 16.2.9
*** Caught signal (Segmentation fault) **
2022-08-02 00:33:00 Ceph04 osd.21 in thread 7f2853f74700 thread_name:tp_osd_tp
2022-08-02 00:33:00 Ceph04 osd.21 ceph version 16.2.9 (4c3647a322c0ff5a1dd2344e039859dcbd28c830) pacific (stable)
2022-08-02 00:33:00 Ceph04 osd.21 1: /lib64/libpthread.so.0(+0x168c0) [0x7f287a1e98c0]
2022-08-02 00:33:00 Ceph04 osd.21 2: (ceph::buffer::v15_2_0::ptr::release()+0xf) [0x55670639336f]
2022-08-02 00:33:00 Ceph04 osd.21 3: (BlueStore::Onode::put()+0x1bc) [0x55670601feac]
2022-08-02 00:33:00 Ceph04 osd.21 4: (std::__detail::_Hashtable_alloc<mempool::pool_allocator<(mempool::pool_index_t)4, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x35) [0x5567060d2365]
2022-08-02 00:33:00 Ceph04 osd.21 5: (std::_Hashtable<ghobject_t, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x53) [0x5567060d27a3]
2022-08-02 00:33:00 Ceph04 osd.21 6: (BlueStore::OnodeSpace::_remove(ghobject_t const&)+0x12c) [0x55670601fb5c]
2022-08-02 00:33:00 Ceph04 osd.21 7: (LruOnodeCacheShard::_trim_to(unsigned long)+0xce) [0x5567060d350e]
2022-08-02 00:33:00 Ceph04 osd.21 8: (BlueStore::OnodeSpace::add(ghobject_t const&, boost::intrusive_ptr<BlueStore::Onode>&)+0x152) [0x5567060206a2]
2022-08-02 00:33:00 Ceph04 osd.21 9: (BlueStore::Collection::get_onode(ghobject_t const&, bool, bool)+0x299) [0x55670607fc39]
2022-08-02 00:33:00 Ceph04 osd.21 10: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x1d32) [0x55670608b722]
2022-08-02 00:33:00 Ceph04 osd.21 11: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2fa) [0x5567060a555a]
2022-08-02 00:33:00 Ceph04 osd.21 12: (non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x54) [0x556705ce5cf4]
2022-08-02 00:33:00 Ceph04 osd.21 13: (ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&)+0xa4d) [0x556705eff87d]
2022-08-02 00:33:00 Ceph04 osd.21 14: (ECBackend::try_reads_to_commit()+0x2509) [0x556705f10759]
2022-08-02 00:33:00 Ceph04 osd.21 15: (ECBackend::check_ops()+0x1c) [0x556705f1202c]
2022-08-02 00:33:00 Ceph04 osd.21 16: (ECBackend::handle_sub_write_reply(pg_shard_t, ECSubWriteReply const&, ZTracer::Trace const&)+0xde) [0x556705f1217e]
2022-08-02 00:33:00 Ceph04 osd.21 17: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x1cf) [0x556705f17cef]
2022-08-02 00:33:00 Ceph04 osd.21 18: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x87) [0x556705d34117]
2022-08-02 00:33:00 Ceph04 osd.21 19: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x684) [0x556705cd5264]
2022-08-02 00:33:00 Ceph04 osd.21 20: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x159) [0x556705b5ee39]
2022-08-02 00:33:00 Ceph04 osd.21 21: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x556705dbaef7]
2022-08-02 00:33:00 Ceph04 osd.21 22: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xcf5) [0x556705b7c625]
2022-08-02 00:33:00 Ceph04 osd.21 23: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x5567061e02ec]
2022-08-02 00:33:00 Ceph04 osd.21 24: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5567061e37b0]
2022-08-02 00:33:00 Ceph04 osd.21 25: /lib64/libpthread.so.0(+0xa6ea) [0x7f287a1dd6ea]
2022-08-02 00:33:00 Ceph04 osd.21 26: clone()
Updated by Igor Fedotov almost 2 years ago
4) Quincy case from Telemetry: https://tracker.ceph.com/issues/56382
Updated by Igor Fedotov over 1 year ago
- Status changed from New to In Progress
Another PR: https://github.com/ceph/ceph/pull/47702
Updated by Anonymous over 1 year ago
We have almost daily crashes on our Octopus cluster, which are also reported via telemetry and look like this bug. Could you confirm that these are the same? If you need more information, just ask. I'm really waiting on a patch for this:
{
    "backtrace": [
        "(()+0x12980) [0x7f269ac06980]",
        "(ceph::buffer::v15_2_0::ptr::release()+0x26) [0x55fc3e524206]",
        "(BlueStore::Onode::put()+0x1c1) [0x55fc3e192a71]",
        "(std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x2d) [0x55fc3e248a0d]",
        "(std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x1b) [0x55fc3e2489fb]",
        "(BlueStore::TransContext::~TransContext()+0x124) [0x55fc3e248b54]",
        "(BlueStore::_txc_finish(BlueStore::TransContext*)+0x4b8) [0x55fc3e1d01b8]",
        "(BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x24c) [0x55fc3e1d1b7c]",
        "(BlueStore::_kv_finalize_thread()+0x48c) [0x55fc3e21b58c]",
        "(BlueStore::KVFinalizeThread::entry()+0xd) [0x55fc3e24d09d]",
        "(()+0x76db) [0x7f269abfb6db]",
        "(clone()+0x3f) [0x7f269999b61f]"
    ],
    "ceph_version": "15.2.17",
    "crash_id": "2022-10-21T16:26:38.286992Z_ba5ffc75-58c3-45fc-9cda-950256b5efca",
    "entity_name": "osd.127",
    "os_id": "ubuntu",
    "os_name": "Ubuntu",
    "os_version": "18.04.6 LTS (Bionic Beaver)",
    "os_version_id": "18.04",
    "process_name": "ceph-osd",
    "stack_sig": "b2e4aac01a4b8acbb3878c39b0f5b1269edcccb6a90435e54b6958716a9e703e",
    "timestamp": "2022-10-21T16:26:38.286992Z",
    "utsname_hostname": "ceph-osd08",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.0-107-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#121~18.04.1-Ubuntu SMP Thu Mar 24 17:21:33 UTC 2022"
}
Updated by Yaarit Hatuka over 1 year ago
Hi Sven,
Thanks for reporting via telemetry! The issue you reported is tracked in https://tracker.ceph.com/issues/56200, which is marked as a duplicate of this tracker (https://tracker.ceph.com/issues/53002), so indeed they are the same.
Looks like the Octopus backport is already merged, but there is another PR (https://github.com/ceph/ceph/pull/47702) which is still under review and not yet merged to main.
Regards,
Yaarit
Updated by 王子敬 wang over 1 year ago
(gdb) bt
#0 0x00007fc82cdb64aa in tc_newarray () from /lib64/libtcmalloc.so.4
#1 0x000055f6876050ba in ceph::buffer::v15_2_0::ptr_node::create<ceph::buffer::v15_2_0::ptr_node const&> ()
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/include/buffer.h:411
#2 ceph::buffer::v15_2_0::list::append (this=this@entry=0x55f6b308ceb8, bl=...) at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/common/buffer.cc:1424
#3 0x000055f687150491 in ceph::encode (bl=..., s=...) at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/include/encoding.h:282
#4 ceph::os::Transaction::encode (this=this@entry=0x7fc8039c7440, bl=...) at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/os/Transaction.h:1267
#5 0x000055f687137698 in ceph::os::encode (features=0, bl=..., c=...) at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/os/Transaction.h:1293
#6 ReplicatedBackend::generate_subop (this=0x55f6956f8180, soid=..., at_version=..., tid=10598176, reqid=..., pg_trim_to=..., min_last_complete_ondisk=..., new_temp_oid=...,
discard_temp_oid=..., log_entries=..., hset_hist=std::optional<pg_hit_set_history_t> [no contained value], op_t=..., peer=..., pinfo=...)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/ReplicatedBackend.cc:968
#7 0x000055f687138188 in ReplicatedBackend::issue_op (this=0x55f6956f8180, soid=..., at_version=..., tid=<optimized out>, reqid=..., pg_trim_to=..., min_last_complete_ondisk=...,
new_temp_oid=..., discard_temp_oid=..., log_entries=..., hset_hist=..., op=<optimized out>, op_t=...)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/ReplicatedBackend.cc:1028
#8 0x000055f68713ad14 in ReplicatedBackend::submit_transaction (this=0x55f6956f8180, soid=..., delta_stats=..., at_version=..., _t=..., trim_to=..., min_last_complete_ondisk=...,
_log_entries=std::vector of length 1, capacity 1 = {...}, hset_history=std::optional<pg_hit_set_history_t> [no contained value], on_all_commit=0x55f6bfd47360, tid=10598176,
reqid=..., orig_op=...) at /usr/include/c++/8/ext/aligned_buffer.h:76
#9 0x000055f686f07ce0 in PrimaryLogPG::issue_repop (this=0x55f6961c4000, repop=0x55f696e73980, ctx=0x55f6dc76d200)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/PeeringState.h:2292
#10 0x000055f686f64c5a in PrimaryLogPG::execute_ctx (this=0x55f6961c4000, ctx=<optimized out>)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/PrimaryLogPG.cc:4166
#11 0x000055f686f69004 in PrimaryLogPG::do_op (this=0x55f6961c4000, op=...) at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/PrimaryLogPG.cc:2381
#12 0x000055f686f76585 in PrimaryLogPG::do_request (this=0x55f6961c4000, op=..., handle=...) at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/PrimaryLogPG.cc:1779
#13 0x000055f686df35d9 in OSD::dequeue_op (this=this@entry=0x55f692652000, pg=..., op=..., handle=...)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/OSD.cc:9754
#14 0x000055f68705b378 in ceph::osd::scheduler::PGOpItem::run (this=<optimized out>, osd=0x55f692652000, sdata=<optimized out>, pg=..., handle=...)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/PG.h:627
#15 0x000055f686e0ff4b in ceph::osd::scheduler::OpSchedulerItem::run (handle=..., pg=..., sdata=<optimized out>, osd=<optimized out>, this=0x7fc8039c83b0)
at /usr/include/c++/8/bits/unique_ptr.h:345
#16 OSD::ShardedOpWQ::_process (this=<optimized out>, thread_index=<optimized out>, hb=<optimized out>)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/osd/OSD.cc:10788
#17 0x000055f687465644 in ShardedThreadPool::shardedthreadpool_worker (this=0x55f692652a28, thread_index=11)
at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/common/WorkQueue.cc:311
#18 0x000055f6874682a4 in ShardedThreadPool::WorkThreadSharded::entry (this=<optimized out>) at /usr/src/debug/ceph-15.2.13-branch_2212260918.el8.x86_64/src/common/WorkQueue.h:715
#19 0x00007fc82c26014a in start_thread () from /lib64/libpthread.so.0
#20 0x00007fc82b3c9dc3 in clone () from /lib64/libc.so.6
ceph_version 15.2.13
We've just seen this crash running 15.2.13.
Updated by Igor Fedotov over 1 year ago
- Has duplicate Bug #58439: octopus osd crash added
Updated by Igor Fedotov over 1 year ago
- Pull request ID changed from 43770 to 47702
Updated by Igor Fedotov over 1 year ago
- Status changed from In Progress to Fix Under Review
Updated by Igor Fedotov about 1 year ago
- Has duplicate Bug #56382: ONode ref counting is broken added
Updated by Igor Fedotov about 1 year ago
- Status changed from Fix Under Review to Duplicate
Updated by Igor Fedotov about 1 year ago
- Has duplicate deleted (Bug #56382: ONode ref counting is broken)
Updated by Igor Fedotov about 1 year ago
- Has duplicate Bug #56382: ONode ref counting is broken added