Bug #57895

closed

OSD crash in Onode::put()

Added by dongdong tao over 1 year ago. Updated over 1 year ago.

Status: Duplicate
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This issue happens when an Onode is trimmed immediately after it is unpinned. This is possible when the LRU list is extremely short.
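The ordering described above can be illustrated with a small single-threaded toy model. This is only a sketch under stated assumptions, not BlueStore code: `Node`, `TinyLru`, `trace_put`, and the "put epilogue" step are hypothetical names invented for illustration. It shows that when the unpinned list has no room, the object is destroyed inside the unpin step, before the put-like caller has finished. A caller that still touched the object at that point would be using freed memory.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cached object; records its own destruction.
struct Node {
  Node(std::string n, std::vector<std::string>* l)
      : name(std::move(n)), log(l) {}
  ~Node() { log->push_back("destroy " + name); }
  std::string name;
  std::vector<std::string>* log;
};

// Hypothetical LRU that owns unpinned nodes and trims as soon as it is over
// capacity (assumption: trimming runs right after a node is unpinned).
struct TinyLru {
  std::size_t capacity;
  std::vector<std::string>* log;
  std::deque<std::unique_ptr<Node>> unpinned;  // oldest first

  void unpin(std::unique_ptr<Node> n) {
    log->push_back("unpin " + n->name);
    unpinned.push_back(std::move(n));
    while (unpinned.size() > capacity) {
      unpinned.pop_front();  // destroys the oldest node
    }
  }
};

// Models a put-like call: unpin the node, then do some remaining work.
// Returns the order in which events happened.
std::vector<std::string> trace_put(std::size_t capacity) {
  std::vector<std::string> log;
  {
    TinyLru lru{capacity, &log, {}};
    auto node = std::make_unique<Node>("a", &log);
    lru.unpin(std::move(node));
    // In a racy implementation this step would still touch the node.
    log.push_back("put epilogue");
  }  // lru goes out of scope here and destroys any nodes it still owns
  return log;
}
```

Calling `trace_put(0)` models the extremely short LRU list: the node is destroyed before the epilogue runs. With `trace_put(1)` the node outlives the epilogue, which is the ordering the put-like caller implicitly relies on.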

Below are the crash stacks, seen on both the unpin and trim threads:

1: (()+0x12890) [0x7f74d588a890]
2: (ceph::buffer::v15_2_0::ptr::release()+0x8) [0x555c649a9e18]
3: (BlueStore::Onode::put()+0x1c1) [0x555c6462c621]
4: (std::__detail::_Hashtable_alloc<mempool::pool_allocator<(mempool::pool_index_t)4, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x35) [0x555c646dc3c5]
5: (std::_Hashtable<ghobject_t, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x53) [0x555c646dc803]
6: (BlueStore::OnodeSpace::_remove(ghobject_t const&)+0x12c) [0x555c6462c2cc]
7: (LruOnodeCacheShard::_trim_to(unsigned long)+0xce) [0x555c646dd33e]
8: (BlueStore::OnodeSpace::add(ghobject_t const&, boost::intrusive_ptr<BlueStore::Onode>&)+0x152) [0x555c6462ce22]
9: (BlueStore::Collection::get_onode(ghobject_t const&, bool, bool)+0x384) [0x555c6468d5a4]
10: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x1c29) [0x555c64696999]
11: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2ae) [0x555c646afb4e]
12: (non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x54) [0x555c6433af54]
13: (ReplicatedBackend::do_repop(boost::intrusive_ptr<OpRequest>)+0xb08) [0x555c644e5f18]
14: (ReplicatedBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x187) [0x555c644f6397]
15: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x87) [0x555c64384517]
16: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x684) [0x555c6432acd4]
17: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x159) [0x555c641b7229]
18: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555c6440a227]
19: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x623) [0x555c641d35f3]
20: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555c64807f0c]
21: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555c6480b160]
22: (()+0x76db) [0x7f74d587f6db]
23: (clone()+0x3f) [0x7f74d55a888f]

and

ceph version 15.2.16 (d46a73d6d0a67a79558054a3a5a72cb561724974) octopus (stable)
1: (()+0x12890) [0x7ff0ee5fd890]
2: (ceph::buffer::v15_2_0::ptr::release()+0x8) [0x55f9c9954e18]
3: (BlueStore::Onode::put()+0x1c1) [0x55f9c95d7621]
4: (std::_Rb_tree<boost::intrusive_ptr<BlueStore::Onode>, boost::intrusive_ptr<BlueStore::Onode>, std::_Identity<boost::intrusive_ptr<BlueStore::Onode> >, std::less<boost::intrusive_ptr<BlueStore::Onode> >, std::allocator<boost::intrusive_ptr<BlueStore::Onode> > >::_M_erase(std::_Rb_tree_node<boost::intrusive_ptr<BlueStore::Onode> >*)+0x2d) [0x55f9c9687d0d]
5: (BlueStore::TransContext::~TransContext()+0x114) [0x55f9c9687e44]
6: (BlueStore::_txc_finish(BlueStore::TransContext*)+0x448) [0x55f9c9617788]
7: (BlueStore::_txc_state_proc(BlueStore::TransContext*)+0x24c) [0x55f9c961907c]
8: (BlueStore::_kv_finalize_thread()+0x48c) [0x55f9c965c31c]
9: (BlueStore::KVFinalizeThread::entry()+0xd) [0x55f9c968c01d]
10: (()+0x76db) [0x7ff0ee5f26db]
11: (clone()+0x3f) [0x7ff0ee31b88f]

I believe this issue is still present on the master branch.


Related issues 1 (0 open, 1 closed)

Is duplicate of bluestore - Bug #56382: ONode ref counting is broken (Resolved, Igor Fedotov)
