Bug #58747


/build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: 3945: ceph_abort_msg("uh oh, missing shared_blob")

Added by Anonymous about 1 year ago. Updated about 1 year ago.

Status:
Need More Info
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

I have a crashing OSD with a crash message I have never seen before. The OSD unit also did not recover from the crash on its own, so this may be a bug.

The only reference I found is https://tracker.ceph.com/issues/37090, but that has supposedly been fixed for a long time, so I am opening a new report instead.

This is an all-flash (SSD) deployment with 2 OSDs per device. The RocksDB/WAL metadata is on separate NVMe devices, though not for this crashing OSD (I'm currently redeploying all NVMe devices for unrelated reasons).

The other OSD on this device still runs fine, as far as I can see, so I don't think this is hardware related.

The device in question is a Micron 5400.

Here is the first crash dump, obtained via ceph crash info. The process was then restarted automatically by systemd several more times and crashed again each time:

ceph crash info 2023-02-15T18:45:05.298372Z_0ef42246-8294-4737-8235-a796f904b518
{
    "backtrace": [
        "(()+0x12980) [0x7f1efb225980]",
        "(std::_Rb_tree<unsigned long, std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t>, std::_Select1st<std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t> >, std::less<unsigned long>, mempool::pool_allocator<(mempool::pool_index_t)5, std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t> > >::_M_erase(std::_Rb_tree_node<std::pair<unsigned long const, bluestore_extent_ref_map_t::record_t> >*)+0x22) [0x564fa034fd12]",
        "(BlueStore::SharedBlob::~SharedBlob()+0x25) [0x564fa02a46f5]",
        "(BlueStore::SharedBlob::put()+0xa3) [0x564fa02b12e3]",
        "(BlueStore::Blob::put()+0x88) [0x564fa0366f58]",
        "(BlueStore::Extent::~Extent()+0x3c) [0x564fa0366fac]",
        "(BlueStore::Onode::put()+0x33b) [0x564fa02b3beb]",
        "(std::__detail::_Hashtable_alloc<mempool::pool_allocator<(mempool::pool_index_t)4, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true> > >::_M_deallocate_node(std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x35) [0x564fa03690c5]",
        "(std::_Hashtable<ghobject_t, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, mempool::pool_allocator<(mempool::pool_index_t)4, std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> > >, std::__detail::_Select1st, std::equal_to<ghobject_t>, std::hash<ghobject_t>, std::__detail::_Mod_range_hashing, std::__detail::_Default_ranged_hash, std::__detail::_Prime_rehash_policy, std::__detail::_Hashtable_traits<true, false, true> >::_M_erase(unsigned long, std::__detail::_Hash_node_base*, std::__detail::_Hash_node<std::pair<ghobject_t const, boost::intrusive_ptr<BlueStore::Onode> >, true>*)+0x53) [0x564fa0369503]",
        "(BlueStore::OnodeSpace::_remove(ghobject_t const&)+0x12c) [0x564fa02b371c]",
        "(LruOnodeCacheShard::_trim_to(unsigned long)+0xce) [0x564fa036a00e]",
        "(BlueStore::OnodeSpace::add(ghobject_t const&, boost::intrusive_ptr<BlueStore::Onode>&)+0x152) [0x564fa02b42d2]",
        "(BlueStore::Collection::get_onode(ghobject_t const&, bool, bool)+0x384) [0x564fa0306254]",
        "(BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x1d94) [0x564fa033a0d4]",
        "(BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x564fa033b50a]",
        "(non-virtual thunk to PrimaryLogPG::queue_transactions(std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<OpRequest>)+0x54) [0x564f9ffb1c84]",
        "(ECBackend::handle_sub_write(pg_shard_t, boost::intrusive_ptr<OpRequest>, ECSubWrite&, ZTracer::Trace const&)+0x9cd) [0x564fa018c95d]",
        "(ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x23d) [0x564fa01a4e1d]",
        "(PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x564f9fffce77]",
        "(PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x564f9ffa0b2d]",
        "(OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x564f9fe2494b]",
        "(ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x564fa007fde7]",
        "(OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x564f9fe421a9]",
        "(ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x564fa04959dc]",
        "(ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x564fa0498e40]",
        "(()+0x76db) [0x7f1efb21a6db]",
        "(clone()+0x3f) [0x7f1ef9fba61f]" 
    ],
    "ceph_version": "15.2.17",
    "crash_id": "2023-02-15T18:45:05.298372Z_0ef42246-8294-4737-8235-a796f904b518",
    "entity_name": "osd.74",
    "os_id": "ubuntu",
    "os_name": "Ubuntu",
    "os_version": "18.04.6 LTS (Bionic Beaver)",
    "os_version_id": "18.04",
    "process_name": "ceph-osd",
    "stack_sig": "10eb87ac0d54813d6b569a2eb0c293346476b3358d5ee188a0726bcfa9107e22",
    "timestamp": "2023-02-15T18:45:05.298372Z",
    "utsname_hostname": "ceph-osd05",
    "utsname_machine": "x86_64",
    "utsname_release": "5.4.0-107-generic",
    "utsname_sysname": "Linux",
    "utsname_version": "#121~18.04.1-Ubuntu SMP Thu Mar 24 17:21:33 UTC 2022" 
}

I know this is an EOL release, but I figured it might still be worthwhile to report a possible bug that may affect newer versions as well.

If you need additional debug info, please don't hesitate to ask; I'm happy to help!

Kind regards
Sven

Actions #1

Updated by Anonymous about 1 year ago

I also have the complete ceph-osd-74.log file, but it is 14 MB compressed as a tar.gz, so I can't upload it directly.

Here is a crash excerpt from it:

2829+0000
/build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: 3945: ceph_abort_msg("uh oh, missing shared_blob")

 ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
 1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xe1) [0x555eb1586ffe]
 2: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
 3: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
 4: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
 5: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
 6: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
 7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
 8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
 9: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
 10: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
 11: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
 12: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
 13: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
 14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
 16: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
 20: (()+0x76db) [0x7efecede06db]
 21: (clone()+0x3f) [0x7efecdb8061f]

     0> 2023-02-15T18:46:26.593+0000 7efeaf07d700 -1 *** Caught signal (Aborted) **
 in thread 7efeaf07d700 thread_name:tp_osd_tp

 ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
 1: (()+0x12980) [0x7efecedeb980]
 2: (gsignal()+0xc7) [0x7efecda9de87]
 3: (abort()+0x141) [0x7efecda9f7f1]
 4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1b2) [0x555eb15870cf]
 5: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
 6: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
 7: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
 8: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
 9: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
 10: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
 11: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
 12: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
 13: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
 14: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
 15: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
 16: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
 17: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
 18: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
 19: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
 20: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
 21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
 22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
 23: (()+0x76db) [0x7efecede06db]
 24: (clone()+0x3f) [0x7efecdb8061f]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

--- logging levels ---
   0/ 5 none
   0/ 1 lockdep
   0/ 1 context
   1/ 1 crush
   1/ 5 mds
   1/ 5 mds_balancer
   1/ 5 mds_locker
   1/ 5 mds_log
   1/ 5 mds_log_expire
   1/ 5 mds_migrator
   0/ 1 buffer
   0/ 1 timer
   0/ 1 filer
   0/ 1 striper
   0/ 1 objecter
   0/ 5 rados
   0/ 5 rbd
   0/ 5 rbd_mirror
   0/ 5 rbd_replay
   0/ 5 rbd_rwl
   0/ 5 journaler
   0/ 5 objectcacher
   0/ 5 immutable_obj_cache
   0/ 5 client
   1/ 5 osd
   0/ 5 optracker
   0/ 5 objclass
   1/ 3 filestore
   1/ 3 journal
   0/ 0 ms
   1/ 5 mon
   0/10 monc
   1/ 5 paxos
   0/ 5 tp
   1/ 5 auth
   1/ 5 crypto
   1/ 1 finisher
   1/ 1 reserver
   1/ 5 heartbeatmap
   1/ 5 perfcounter
   1/ 5 rgw
   1/ 5 rgw_sync
   1/10 civetweb
   1/ 5 javaclient
   1/ 5 asok
   1/ 1 throttle
   0/ 0 refs
   1/ 5 compressor
   1/ 5 bluestore
   1/ 5 bluefs
   1/ 3 bdev
   1/ 5 kstore
   4/ 5 rocksdb
   4/ 5 leveldb
   4/ 5 memdb
   1/ 5 fuse
   2/ 5 mgr
   1/ 5 mgrc
   1/ 5 dpdk
   1/ 5 eventtrace
   1/ 5 prioritycache
   0/ 5 test
  -2/-2 (syslog threshold)
  -1/-1 (stderr threshold)
--- pthread ID / name mapping for recent threads ---
  7efea706d700 / osd_srv_heartbt
  7efea786e700 / tp_osd_tp
  7efea806f700 / tp_osd_tp
  7efea8870700 / tp_osd_tp
  7efea9071700 / tp_osd_tp
  7efea9872700 / tp_osd_tp
  7efeaa073700 / tp_osd_tp
  7efeaa874700 / tp_osd_tp
  7efeab075700 / tp_osd_tp
  7efeab876700 / tp_osd_tp
  7efeac077700 / tp_osd_tp
  7efeac878700 / tp_osd_tp
  7efead079700 / tp_osd_tp
  7efead87a700 / tp_osd_tp
  7efeae07b700 / tp_osd_tp
  7efeae87c700 / tp_osd_tp
  7efeaf07d700 / tp_osd_tp
  7efeb808f700 / ms_dispatch
  7efeb9091700 / rocksdb:dump_st
  7efeb9e8d700 / fn_anonymous
  7efebae8f700 / cfin
  7efebc287700 / safe_timer
  7efebd289700 / ms_dispatch
  7efebfe99700 / bstore_mempool
  7efec20a3700 / rocksdb:low1
  7efec50a9700 / fn_anonymous
  7efec68ac700 / safe_timer
  7efec810b700 / safe_timer
  7efec990e700 / admin_socket
  7efeca10f700 / service
  7efeca910700 / msgr-worker-2
  7efecb111700 / msgr-worker-1
  7efecb912700 / msgr-worker-0
  7efed0c8bd80 / ceph-osd
  max_recent     10000
  max_new         1000
  log_file /var/log/ceph/ceph-osd.74.log
--- end dump of recent events ---

Actions #2

Updated by Igor Fedotov about 1 year ago

Could you please share a log from a failing OSD startup attempt as well?

And an fsck report too, please.

You might want to use the ceph-post-file tool (see https://docs.ceph.com/en/quincy/man/8/ceph-post-file/) to upload large logs.
Or (my personal preference) any other public storage, e.g. Google Drive.
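As a pointer for producing the requested fsck report: with the OSD stopped, ceph-bluestore-tool can run an offline fsck against the OSD's data directory. A minimal sketch, assuming the standard mount path for this cluster's osd.74 (the output file name and description string are illustrative; adjust to your environment):

```shell
# Stop the affected OSD so BlueStore is no longer held open by the daemon
systemctl stop ceph-osd@74

# Run an offline consistency check; a deep fsck additionally verifies object data
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-74 > /tmp/osd.74-fsck.log 2>&1

# Upload the resulting report (and any large logs) for this tracker issue
ceph-post-file -d "tracker 58747 osd.74 fsck report" /tmp/osd.74-fsck.log
```

These commands require a live deployment and root access on the OSD host, so treat this as a sketch rather than a verified recipe.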

Actions #3

Updated by Anonymous about 1 year ago

Here is the ceph-osd log: https://drive.google.com/file/d/1VoNbGab9U6qTBPK0tWfr9AWR1RZLmUng/view?usp=share_link

And here is an excerpt of journalctl -u --no-pager:

[...]
Feb 15 18:46:00 ceph-osd05 ceph-osd[631414]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Feb 15 18:46:00 ceph-osd05 systemd[1]: ceph-osd@74.service: Main process exited, code=killed, status=6/ABRT
Feb 15 18:46:00 ceph-osd05 systemd[1]: ceph-osd@74.service: Failed with result 'signal'.
Feb 15 18:46:10 ceph-osd05 systemd[1]: ceph-osd@74.service: Service hold-off time over, scheduling restart.
Feb 15 18:46:10 ceph-osd05 systemd[1]: ceph-osd@74.service: Scheduled restart job, restart counter is at 3.
Feb 15 18:46:10 ceph-osd05 systemd[1]: Stopped Ceph object storage daemon osd.74.
Feb 15 18:46:10 ceph-osd05 systemd[1]: Starting Ceph object storage daemon osd.74...
Feb 15 18:46:10 ceph-osd05 systemd[1]: Started Ceph object storage daemon osd.74.
Feb 15 18:46:20 ceph-osd05 ceph-osd[631689]: 2023-02-15T18:46:20.561+0000 7efed0c8bd80 -1 osd.74 46598 log_to_monitors {default=true}
Feb 15 18:46:20 ceph-osd05 ceph-osd[631689]: 2023-02-15T18:46:20.633+0000 7efec50a9700 -1 osd.74 46598 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: 2023-02-15T18:46:26.577+0000 7efeaf07d700 -1 bluestore(/var/lib/ceph/osd/ceph-74).collection(13.48s3_head 0x555ebcb010c0) load_shared_blob sbid 0x5e67e0f000 not found at key 0x0000005e67e0f000
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: /build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::Collection::load_shared_blob(BlueStore::SharedBlobRef)' thread 7efeaf07d700 time 2023-02-15T18:46:26.582829+0000
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: /build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: 3945: ceph_abort_msg("uh oh, missing shared_blob")
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xe1) [0x555eb1586ffe]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  2: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  3: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  4: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  5: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  6: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  9: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  10: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  11: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  12: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  13: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  16: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  20: (()+0x76db) [0x7efecede06db]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  21: (clone()+0x3f) [0x7efecdb8061f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: 2023-02-15T18:46:26.585+0000 7efeaf07d700 -1 /build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::Collection::load_shared_blob(BlueStore::SharedBlobRef)' thread 7efeaf07d700 time 2023-02-15T18:46:26.582829+0000
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: /build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: 3945: ceph_abort_msg("uh oh, missing shared_blob")
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xe1) [0x555eb1586ffe]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  2: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  3: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  4: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  5: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  6: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  9: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  10: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  11: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  12: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  13: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  16: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  20: (()+0x76db) [0x7efecede06db]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  21: (clone()+0x3f) [0x7efecdb8061f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: *** Caught signal (Aborted) **
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  in thread 7efeaf07d700 thread_name:tp_osd_tp
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  1: (()+0x12980) [0x7efecedeb980]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  2: (gsignal()+0xc7) [0x7efecda9de87]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  3: (abort()+0x141) [0x7efecda9f7f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1b2) [0x555eb15870cf]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  5: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  6: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  7: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  8: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  9: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  10: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  11: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  12: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  13: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  14: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  15: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  16: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  17: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  18: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  19: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  20: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  23: (()+0x76db) [0x7efecede06db]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  24: (clone()+0x3f) [0x7efecdb8061f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: 2023-02-15T18:46:26.593+0000 7efeaf07d700 -1 *** Caught signal (Aborted) **
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  in thread 7efeaf07d700 thread_name:tp_osd_tp
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  1: (()+0x12980) [0x7efecedeb980]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  2: (gsignal()+0xc7) [0x7efecda9de87]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  3: (abort()+0x141) [0x7efecda9f7f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1b2) [0x555eb15870cf]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  5: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  6: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  7: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  8: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  9: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  10: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  11: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  12: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  13: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  14: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  15: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  16: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  17: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  18: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  19: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  20: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  23: (()+0x76db) [0x7efecede06db]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  24: (clone()+0x3f) [0x7efecdb8061f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  -2280> 2023-02-15T18:46:20.561+0000 7efed0c8bd80 -1 osd.74 46598 log_to_monitors {default=true}
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  -1878> 2023-02-15T18:46:20.633+0000 7efec50a9700 -1 osd.74 46598 set_numa_affinity unable to identify public interface '' numa node: (2) No such file or directory
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:     -2> 2023-02-15T18:46:26.577+0000 7efeaf07d700 -1 bluestore(/var/lib/ceph/osd/ceph-74).collection(13.48s3_head 0x555ebcb010c0) load_shared_blob sbid 0x5e67e0f000 not found at key 0x0000005e67e0f000
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:     -1> 2023-02-15T18:46:26.585+0000 7efeaf07d700 -1 /build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::Collection::load_shared_blob(BlueStore::SharedBlobRef)' thread 7efeaf07d700 time 2023-02-15T18:46:26.582829+0000
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]: /build/ceph-15.2.17/src/os/bluestore/BlueStore.cc: 3945: ceph_abort_msg("uh oh, missing shared_blob")
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  1: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0xe1) [0x555eb1586ffe]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  2: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  3: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  4: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  5: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  6: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  8: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  9: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  10: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  11: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  12: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  13: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  16: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  20: (()+0x76db) [0x7efecede06db]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  21: (clone()+0x3f) [0x7efecdb8061f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:      0> 2023-02-15T18:46:26.593+0000 7efeaf07d700 -1 *** Caught signal (Aborted) **
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  in thread 7efeaf07d700 thread_name:tp_osd_tp
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  ceph version 15.2.17 (8a82819d84cf884bd39c17e3236e0632ac146dc4) octopus (stable)
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  1: (()+0x12980) [0x7efecedeb980]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  2: (gsignal()+0xc7) [0x7efecda9de87]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  3: (abort()+0x141) [0x7efecda9f7f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  4: (ceph::__ceph_abort(char const*, int, char const*, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)+0x1b2) [0x555eb15870cf]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  5: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x62c) [0x555eb1a941ac]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  6: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb05) [0x555eb1b01f45]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  7: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x1ec) [0x555eb1b0388c]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  8: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xd8) [0x555eb1b04338]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  9: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x85) [0x555eb1b05f95]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  10: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ceph::os::Transaction*)+0x11b1) [0x555eb1b0b4f1]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  11: (BlueStore::queue_transactions(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, std::vector<ceph::os::Transaction, std::allocator<ceph::os::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x2aa) [0x555eb1b0d50a]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  12: (ObjectStore::queue_transaction(boost::intrusive_ptr<ObjectStore::CollectionImpl>&, ceph::os::Transaction&&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x80) [0x555eb16511d0]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  13: (non-virtual thunk to PrimaryLogPG::queue_transaction(ceph::os::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x4f) [0x555eb179433f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  14: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xc10) [0x555eb1964270]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  15: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x156) [0x555eb1976d36]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  16: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x97) [0x555eb17cee77]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  17: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x6fd) [0x555eb1772b2d]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  18: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x17b) [0x555eb15f694b]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  19: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x555eb1851de7]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  20: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x8a9) [0x555eb16141a9]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  21: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x4ac) [0x555eb1c679dc]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  22: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x555eb1c6ae40]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  23: (()+0x76db) [0x7efecede06db]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  24: (clone()+0x3f) [0x7efecdb8061f]
Feb 15 18:46:26 ceph-osd05 ceph-osd[631689]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
Feb 15 18:46:26 ceph-osd05 systemd[1]: ceph-osd@74.service: Main process exited, code=killed, status=6/ABRT
Feb 15 18:46:26 ceph-osd05 systemd[1]: ceph-osd@74.service: Failed with result 'signal'.
Feb 15 18:46:37 ceph-osd05 systemd[1]: ceph-osd@74.service: Service hold-off time over, scheduling restart.
Feb 15 18:46:37 ceph-osd05 systemd[1]: ceph-osd@74.service: Scheduled restart job, restart counter is at 4.
Feb 15 18:46:37 ceph-osd05 systemd[1]: Stopped Ceph object storage daemon osd.74.
Feb 15 18:46:37 ceph-osd05 systemd[1]: ceph-osd@74.service: Start request repeated too quickly.
Feb 15 18:46:37 ceph-osd05 systemd[1]: ceph-osd@74.service: Failed with result 'signal'.
Feb 15 18:46:37 ceph-osd05 systemd[1]: Failed to start Ceph object storage daemon osd.74.

As far as I can see there are no other errors, e.g. in the kernel log (dmesg), but I'm still investigating.

Actions #4

Updated by Anonymous about 1 year ago

fsck:

root@ceph-osd05:~# ceph-bluestore-tool --path /var/lib/ceph/osd/ceph-74/ fsck
2023-02-16T14:02:59.136+0000 7f75a36bc0c0 -1 bluestore(/var/lib/ceph/osd/ceph-74) fsck error: 3#13:134aaace:::rbd_data.10.88e7b91cdf9bd9.00000000000010cb:head# blob blob([!~6000,0x5e67e29000~3000,!~1000] csum+shared crc32c/0x1000) sbid 405469720576 > blobid_max 383604436
2023-02-16T14:03:11.236+0000 7f75a36bc0c0 -1 bluestore(/var/lib/ceph/osd/ceph-74) fsck error: shared blob references aren't matching, at least 12 found
fsck status: remaining 13 error(s) and warning(s)
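As an aside, the decimal sbid printed by fsck corresponds to the hex shared-blob key reported in the crash log. A quick sanity check in Python, using the values copied from the outputs above (this is just arithmetic on the logged numbers, not Ceph tooling):

```python
# Cross-check the identifiers from the fsck output against the crash log.
sbid = 405469720576        # "sbid 405469720576 > blobid_max 383604436" (fsck)
blobid_max = 383604436
crash_key = 0x5e67e0f000   # "load_shared_blob sbid 0x5e67e0f000 not found" (crash)

assert sbid > blobid_max   # the inconsistency fsck complains about
assert sbid == crash_key   # fsck and the crash refer to the same shared blob
print(f"0x{sbid:016x}")    # prints 0x0000005e67e0f000, the key from the crash log
```

So the blob fsck flags is the same one whose missing shared_blob record triggered the `ceph_abort_msg`.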
Actions #5

Updated by Anonymous about 1 year ago

A deep fsck returned the same:

root@ceph-osd05:~# ceph-bluestore-tool --deep on --path /var/lib/ceph/osd/ceph-74/ fsck 
2023-02-16T15:19:57.023+0000 7f7acc4e20c0 -1 bluestore(/var/lib/ceph/osd/ceph-74) fsck error: 3#13:134aaace:::rbd_data.10.88e7b91cdf9bd9.00000000000010cb:head# blob blob([!~6000,0x5e67e29000~3000,!~1000] csum+shared crc32c/0x1000) sbid 405469720576 > blobid_max 383614676
2023-02-16T15:23:59.718+0000 7f7acc4e20c0 -1 bluestore(/var/lib/ceph/osd/ceph-74) fsck error: shared blob references aren't matching, at least 12 found
fsck status: remaining 13 error(s) and warning(s)
Actions #6

Updated by Anonymous about 1 year ago

If you need anything else, I'm happy to assist in debugging.

Actions #7

Updated by Anonymous about 1 year ago

I tried "repair" instead of fsck; here are the results:

root@ceph-osd05:~# ceph-bluestore-tool --deep on --path /var/lib/ceph/osd/ceph-74/ repair
2023-02-17T16:24:13.607+0000 7ff4ff43c0c0 -1 bluestore(/var/lib/ceph/osd/ceph-74) fsck error: 3#13:134aaace:::rbd_data.10.88e7b91cdf9bd9.00000000000010cb:head# blob blob([!~6000,0x5e67e29000~3000,!~1000] csum+shared crc32c/0x1000) sbid 405469720576 > blobid_max 383614676
2023-02-17T16:28:17.030+0000 7ff4ff43c0c0 -1 bluestore(/var/lib/ceph/osd/ceph-74) fsck error: shared blob references aren't matching, at least 12 found
repair status: remaining 1 error(s) and warning(s)
Actions #8

Updated by Anonymous about 1 year ago

I have now started the OSD again, and at least it has not crashed yet:

Feb 21 10:05:50 ceph-osd05 systemd[1]: Starting Ceph object storage daemon osd.74...
Feb 21 10:05:50 ceph-osd05 systemd[1]: Started Ceph object storage daemon osd.74.
Feb 21 10:05:58 ceph-osd05 ceph-osd[1669781]: 2023-02-21T10:05:58.988+0000 7f0456a62d80 -1 osd.74 46606 log_to_monitors {default=true}
Feb 21 10:05:59 ceph-osd05 ceph-osd[1669781]: 2023-02-21T10:05:59.952+0000 7f0443060700 -1 osd.74 46606 failed to load OSD map for epoch 48804, got 0 bytes
Feb 21 10:05:59 ceph-osd05 ceph-osd[1669781]: 2023-02-21T10:05:59.980+0000 7f0434e54700 -1 osd.74 48805 failed to load OSD map for epoch 46607, got 0 bytes
Feb 21 10:05:59 ceph-osd05 ceph-osd[1669781]: 2023-02-21T10:05:59.980+0000 7f0434e54700 -1 osd.74 48805 failed to load OSD map for epoch 46608, got 0 bytes
[...]

The "failed to load OSD map for epoch" message repeated many times, up to epoch 47649; after that, nothing more was logged.
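Assuming the failures cover the contiguous range from the first missing epoch (46607) up to 47649, which the elided repetition suggests but the excerpt does not prove, that works out to roughly:

```python
# Rough count of the "failed to load OSD map for epoch" messages,
# assuming the epoch range 46607..47649 is contiguous (the log excerpt elides it).
first_missing = 46607
last_missing = 47649
print(last_missing - first_missing + 1)  # → 1043 missing epochs
```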

The cluster is now rebalancing:

ceph -s 
[...]
  progress:
    Rebalancing after osd.74 marked in (4m)
      [===================.........] (remaining: 106s)

Actions #9

Updated by Anonymous about 1 year ago

I did spot these lines in dmesg, but I don't know whether they are related, as the dates do not match the OSD crash.

I include them just in case:

[Sat Dec 24 16:24:50 2022] traps: bstore_kv_final[1317887] general protection fault ip:7f95d10bb74d sp:7f95c1e34500 error:0 in libtcmalloc.so.4.3.0[7f95d1086000+45000]
[Sat Jan  7 05:03:43 2023] traps: tp_osd_tp[1320880] general protection fault ip:7fe7d95a60ad sp:7fe7b7501830 error:0 in libtcmalloc.so.4.3.0[7fe7d9573000+45000]
[Tue Feb  7 03:50:52 2023] tp_osd_tp[1319617]: segfault at 400000000 ip 0000556db7465ed0 sp 00007f355f8f4e18 error 6 in ceph-osd[556db654d000+1e04000]
[Tue Feb  7 03:50:52 2023] Code: c5 e8 94 7e f3 ff 48 83 3d dc 68 13 01 00 74 0a 48 8b 7c 24 28 e8 f0 9c a4 ff 48 89 ef e8 58 8e a4 ff 48 89 c5 eb df 0f 1f 00 <f0> 83 2f 01 74 0a f3 c3 0f 1f 84 00 00 00 00 00 55 53 48 89 fb 48

Actions #10

Updated by Anonymous about 1 year ago

The affected OSD started fine after the BlueStore repair.

Is there anything else you need?

Actions #11

Updated by Igor Fedotov about 1 year ago

Hi Sven,
thanks a lot for all the info. Unfortunately, it looks like the actual corruption happened during the first crash (or even before it), so the later logs/dumps are unlikely to help with the root-cause investigation.
There is some chance the original crash was caused by the same bug as https://tracker.ceph.com/issues/56382, though. The backtraces are quite similar, but I have never heard of on-disk data corruption resulting from that bug.

And as far as I understand, this has been a one-off issue for you so far, right?
In any case, have you observed other OSD crashes that resemble #56382 but did not cause persistent failures?

Actions #12

Updated by Anonymous about 1 year ago

> And as far as I understand, this has been a one-off issue for you so far, right?

yes

> In any case, have you observed other OSD crashes that resemble #56382 but did not cause persistent failures?

Yes, we have lots of those; see https://tracker.ceph.com/issues/58439

Actions #13

Updated by Igor Fedotov about 1 year ago

Ah, OK. Then the chances that this ticket is related are much higher. My recommendation would be to upgrade to Quincy once the next minor release is out.

Actions #14

Updated by Adam Kupczyk about 1 year ago

  • Status changed from New to Need More Info