Project

General

Profile

Actions

Bug #37090

closed

BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")

Added by Kjetil Joergensen over 5 years ago. Updated over 5 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Possibly a duplicate of #36303

What is slightly interesting, after setting the osd out and migrating off of it, it still claims to have one pg.

{
    "cluster_fsid": "8385cdd7-977d-41eb-9810-c1720580fe1b",
    "osd_fsid": "722f4440-13d2-4b18-93e7-b644e01e4ca8",
    "whoami": 901,
    "state": "active",
    "oldest_map": 1607919,
    "newest_map": 1608517,
    "num_pgs": 1
}
<pre>

The clustermap on the other hand seems to think it does not hold any PGs (ceph pg ls-by-osd 901 or for that matter ceph pg map 34.958).

Going by dump_pgstate_history this is the (shard? of) the PG it failed the assert on.
<pre>
{
    "pg": "34.958s2",
    "history": [
        {
            "epoch": "1608511",
            "state": "Initial",
            "enter": "2018-11-13 11:22:43.986455",
            "exit": "2018-11-13 11:22:47.022097",
            "state": "Reset",
            "enter": "2018-11-13 11:22:47.022098",
            "exit": "2018-11-13 11:22:47.028721" 
        },
        {
            "epoch": "1608512",
            "state": "Start",
            "enter": "2018-11-13 11:22:47.028729",
            "exit": "2018-11-13 11:22:47.028748",
            "state": "Started/Stray",
            "enter": "2018-11-13 11:22:47.028748",
            "exit": "2018-11-13 11:22:48.195070",
            "state": "Started",
            "enter": "2018-11-13 11:22:47.028722",
            "exit": "2018-11-13 11:22:48.195079",
            "state": "Reset",
            "enter": "2018-11-13 11:22:48.195080",
            "exit": "2018-11-13 11:22:48.195466" 
        },
        {
            "epoch": "1608514",
            "state": "Start",
            "enter": "2018-11-13 11:22:48.195474",
            "exit": "2018-11-13 11:22:48.195494" 
        }
    ]
}
<pre>

<pre>
2018-11-12 13:17:54.768375 7fb31a937700 -1 bluestore(/var/lib/ceph/osd/ceph-901).collection(34.958s2_head 0x5569008c3000) load_shared_blob sbid 0x5579b888ea80 not found at key 0x00005579b888ea80
2018-11-12 13:17:54.773521 7fb31a937700 -1 /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::Collection::load_shared_blob(BlueStore::SharedBlobRef)' thread 7fb31a937700 time 2018-11-12 13:17:54.768383
/build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")

 ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x5568ebda53b2]
 2: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x3fb) [0x5568ebbfc1eb]
 3: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x10a3) [0x5568ebc2d0f3]
 4: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x2e2) [0x5568ebc47682]
 5: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xc5) [0x5568ebc47e75]
 6: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x7b) [0x5568ebc4995b]
 7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1bbc) [0x5568ebc56e2c]
 8: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x52e) [0x5568ebc57f2e]
 9: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x17c) [0x5568eb7eaadc]
 10: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x58) [0x5568eb986b38]
 11: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xbb7) [0x5568ebac8607]
 12: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x25d) [0x5568ebad9efd]
 13: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x5568eb9b5f00]
 14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x543) [0x5568eb918ec3]
 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a9) [0x5568eb789999]
 16: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x5568eba3b577]
 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x1047) [0x5568eb7b7db7]
 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x5568ebdaa1a4]
 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5568ebdad1e0]
 20: (()+0x76ba) [0x7fb3362846ba]
 21: (clone()+0x6d) [0x7fb3352f741d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
</pre>

Files

ceph-osd.901.log.stub.xz (887 KB) ceph-osd.901.log.stub.xz Kjetil Joergensen, 11/13/2018 07:48 PM

Related issues 1 (0 open1 closed)

Related to bluestore - Bug #36526: segv in BlueStore::OldExtent::createResolved10/18/2018

Actions
Actions

Also available in: Atom PDF