Bug #37090
BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob") (closed)
Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Target version:
-
% Done:
0%
Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Possibly a duplicate of #36303
What is slightly interesting: after setting the OSD out and migrating data off of it, the OSD still claims to have one PG.
<pre>
{
    "cluster_fsid": "8385cdd7-977d-41eb-9810-c1720580fe1b",
    "osd_fsid": "722f4440-13d2-4b18-93e7-b644e01e4ca8",
    "whoami": 901,
    "state": "active",
    "oldest_map": 1607919,
    "newest_map": 1608517,
    "num_pgs": 1
}
</pre>

The cluster map, on the other hand, seems to think the OSD does not hold any PGs (ceph pg ls-by-osd 901 or, for that matter, ceph pg map 34.958). Going by dump_pgstate_history, this is the PG (or a shard of it?) that it failed the assert on.

<pre>
{
    "pg": "34.958s2",
    "history": [
        {
            "epoch": "1608511",
            "state": "Initial",
            "enter": "2018-11-13 11:22:43.986455",
            "exit": "2018-11-13 11:22:47.022097",
            "state": "Reset",
            "enter": "2018-11-13 11:22:47.022098",
            "exit": "2018-11-13 11:22:47.028721"
        },
        {
            "epoch": "1608512",
            "state": "Start",
            "enter": "2018-11-13 11:22:47.028729",
            "exit": "2018-11-13 11:22:47.028748",
            "state": "Started/Stray",
            "enter": "2018-11-13 11:22:47.028748",
            "exit": "2018-11-13 11:22:48.195070",
            "state": "Started",
            "enter": "2018-11-13 11:22:47.028722",
            "exit": "2018-11-13 11:22:48.195079",
            "state": "Reset",
            "enter": "2018-11-13 11:22:48.195080",
            "exit": "2018-11-13 11:22:48.195466"
        },
        {
            "epoch": "1608514",
            "state": "Start",
            "enter": "2018-11-13 11:22:48.195474",
            "exit": "2018-11-13 11:22:48.195494"
        }
    ]
}
</pre>

<pre>
2018-11-12 13:17:54.768375 7fb31a937700 -1 bluestore(/var/lib/ceph/osd/ceph-901).collection(34.958s2_head 0x5569008c3000) load_shared_blob sbid 0x5579b888ea80 not found at key 0x00005579b888ea80
2018-11-12 13:17:54.773521 7fb31a937700 -1 /build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: In function 'void BlueStore::Collection::load_shared_blob(BlueStore::SharedBlobRef)' thread 7fb31a937700 time 2018-11-12 13:17:54.768383
/build/ceph-12.2.9/src/os/bluestore/BlueStore.cc: 3099: FAILED assert(0 == "uh oh, missing shared_blob")

 ceph version 12.2.9 (9e300932ef8a8916fb3fda78c58691a6ab0f4217) luminous (stable)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x102) [0x5568ebda53b2]
 2: (BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x3fb) [0x5568ebbfc1eb]
 3: (BlueStore::_wctx_finish(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x10a3) [0x5568ebc2d0f3]
 4: (BlueStore::_do_truncate(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>, unsigned long, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0x2e2) [0x5568ebc47682]
 5: (BlueStore::_do_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>)+0xc5) [0x5568ebc47e75]
 6: (BlueStore::_remove(BlueStore::TransContext*, boost::intrusive_ptr<BlueStore::Collection>&, boost::intrusive_ptr<BlueStore::Onode>&)+0x7b) [0x5568ebc4995b]
 7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, ObjectStore::Transaction*)+0x1bbc) [0x5568ebc56e2c]
 8: (BlueStore::queue_transactions(ObjectStore::Sequencer*, std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x52e) [0x5568ebc57f2e]
 9: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, ObjectStore::Transaction&&, Context*, Context*, Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x17c) [0x5568eb7eaadc]
 10: (PrimaryLogPG::queue_transaction(ObjectStore::Transaction&&, boost::intrusive_ptr<OpRequest>)+0x58) [0x5568eb986b38]
 11: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0xbb7) [0x5568ebac8607]
 12: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x25d) [0x5568ebad9efd]
 13: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x50) [0x5568eb9b5f00]
 14: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x543) [0x5568eb918ec3]
 15: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x3a9) [0x5568eb789999]
 16: (PGQueueable::RunVis::operator()(boost::intrusive_ptr<OpRequest> const&)+0x57) [0x5568eba3b577]
 17: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x1047) [0x5568eb7b7db7]
 18: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x884) [0x5568ebdaa1a4]
 19: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x5568ebdad1e0]
 20: (()+0x76ba) [0x7fb3362846ba]
 21: (clone()+0x6d) [0x7fb3352f741d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
</pre>
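The mismatch described above (the OSD reports "num_pgs": 1 while ceph pg ls-by-osd 901 lists nothing) could be checked mechanically. The following is only a minimal sketch: the function name stale_pg_count is hypothetical, the status JSON is trimmed to the fields used, and the cluster-side PG list is assumed to have been collected separately and passed in as plain strings. It does not call any Ceph command itself.

```python
import json
from typing import List


def stale_pg_count(osd_status_json: str, cluster_pgs_for_osd: List[str]) -> int:
    """Return how many more PGs the OSD reports than the cluster map assigns to it.

    osd_status_json: the OSD's own status output (must contain "num_pgs").
    cluster_pgs_for_osd: PG ids the cluster map lists for this OSD
                         (assumed to be gathered by the caller).
    """
    status = json.loads(osd_status_json)
    return status["num_pgs"] - len(cluster_pgs_for_osd)


# Values taken from this report: the OSD says 1 PG, the cluster map lists none.
osd_status = '{"whoami": 901, "state": "active", "num_pgs": 1}'
cluster_view: List[str] = []

extra = stale_pg_count(osd_status, cluster_view)
if extra > 0:
    print(f"osd.901 reports {extra} PG(s) that the cluster map does not list")
```

A positive result only indicates that the two views disagree; it says nothing about which PG is left over or why. In this report the leftover PG was identified separately via dump_pgstate_history.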