Bug #58747


/build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: 3945: ceph_abort_msg("uh oh, missing shared_blob")

Added by Anonymous about 1 year ago. Updated about 1 year ago.

Status: Need More Info
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

I have a crashing OSD with a crash message I have never seen before. This OSD unit also did not recover from the crash, so maybe this is a bug.

The only reference I found is https://tracker.ceph.com/issues/37090, but that was supposedly fixed a long time ago, so I'm opening a new report instead.

This is an all-flash (SSD) deployment with 2 OSDs per device. RocksDB/WAL metadata lives on separate NVMe devices, though not for this crashing OSD (I'm currently redeploying all NVMe devices for unrelated reasons).

The other OSD on this device still runs just fine, as far as I can see, so I don't think the problem is hardware-related.

The device in question is a Micron 5400.

Here is the first crash dump, obtained via ceph crash info. The process was then restarted automatically by systemd a few more times and crashed again each time:

ceph crash info 2023-02-15T18:45:05.298372Z_0ef42246-8294-4737-8235-a796f904b518
{
    "backtrace": [
        "(()+0x12980) [0x7f1efb225980]",
        "(std::_Rb_tree<unsigned long, std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t>, std::_Select1st<std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t> >, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)5, std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t> >*)+0x22) [0x564fa034fd12]",
        "(BlueStore::SharedBlob::~SharedBlob()+0x25) [0x564fa02a46f5]",
        "(BlueStore::SharedBlob::put()+0xa3) [0x564fa02b12e3]",
        "(BlueStore::Blob::put()+0x88) [0x564fa0366f58]",
        "(BlueStore::Extent::~Extent()+0x3c) [0x564fa0366fac]",
        "(BlueStore::Onode::put()+0x33b) [0x564fa02b3beb]",
        "(std::__detail::_Hashtable_alloc<mempool::pool_allocator<(mempool::pool_index_t)4, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x35) [0x564fa03690c5]",
        "(std::_Hashtable<ghobject_t, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x53) [0x564fa0369503]",
        "(BlueStore::OnodeSpace::_remove(ghobject_t const&)+0x12c) [0x564fa02b371c]",
        "(LruOnodeCacheShard::_trim_to(unsigned long)+0xce) [0x564fa036a00e]",
        "(BlueStore::OnodeSpace::add(ghobject_t const&, boost::intrusive_ptr<BlueStore::Onode>&)+0x152) [0x564fa02b42d2]",
        "(BlueStore::Collection::get_onode(ghobject_t const&, bool, bool)+0x384) [0x564fa0306254]",
        "(BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x1d94) [0x564fa033a0d4]",
        "(BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x564fa033b50a]",
        "(non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x54) [0x564f9ffb1c84]",
        "(ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&)+0x9cd) [0x564fa018c95d]",
        "(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x23d) [0x564fa01a4e1d]",
        "(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x564f9fffce77]",
        "(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x564f9ffa0b2d]",
        "(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x564f9fe2494b]",
        "(ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x564fa007fde7]",
        "(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x564f9fe421a9]",
        "(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x564fa04959dc]",
        "(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x564fa0498e40]",
        "(()+0x76db) [0x7f1efb21a6db]",
        "(clone()+0x3f) [0x7f1ef9fba61f]" 
    ],
    "ceph_version": "15.2.17",
    "crash_id": "2023-02-15T18:45:05.298372Z_0ef42246-8294-4737-8235-a796f904b518",
    "entity_name": "osd.74",
    "os_id": "ubuntu",
    "os_name": "Ubuntu",
    "os_version": "18.04.6 LTS (Bionic Beaver)",
    "os_version_id": "18.04",
    "process_name": "ceph-osd",
    "stack_sig": "10eb87ac0d54813d6b569a2eb0c293346476b3358d5ee188a0726bcfa9107e22",
    "timestamp": "2023-02-15T18:45:05.298372Z",
    "utsname_hostname": "ceph-osd05",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.0-107-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#121~18.04.1-Ubuntu SMP Thu Mar 24 17:21:33 UTC 2022" 
}

I know this is an EOL release, but I figured it might still be worthwhile to report a possible bug, since it could affect newer versions as well.

If you need additional debug info, please don't hesitate to ask; I'm happy to help!

Kind regards,
Sven
