Bug #24715

FAILED assert(0 == "put on missing extent (nothing before)")

Added by Radoslaw Zarzynski 6 months ago. Updated 3 months ago.

Status: Duplicate
Priority: High
Assignee: -
Target version:
Start date: 06/29/2018
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Reported on ceph-users by Dyweni (see http://lists.ceph.com/pipermail/ceph-users-ceph.com/2018-June/027632.html).


After removing roughly 20-some rbd snapshots, one of my OSDs has begun
flapping.

ERROR 1

2018-06-25 06:46:39.132257 a0ce2700 -1 osd.8 pg_epoch: 44738 pg[4.e8( v 
44721'485588 (44697'484015,44721'485588] local-lis/les=44593/44595 
n=2972 ec=9422/9422 lis/c 44593/44593 les/c/f 44595/44595/40729 
44593/44593/44593) [8,7,10] r=0 lpr=44593 crt=44721'485588 lcod 
44721'485586 mlcod 44721'485586 active+clean+snaptrim 
snaptrimq=[276~1,280~1,2af~1,2e8~4]] removing snap head
2018-06-25 06:46:41.314172 a1ce2700 -1 
/var/tmp/portage/sys-cluster/ceph-12.2.5/work/ceph-12.2.5/src/os/bluestore/bluestore_types.cc: 
In function 'void bluestore_extent_ref_map_t::put(uint64_t, uint32_t, 
PExtentVector*, bool*)' thread a1ce2700 time 2018-06-25 06:46:41.220388
/var/tmp/portage/sys-cluster/ceph-12.2.5/work/ceph-12.2.5/src/os/bluestore/bluestore_types.cc: 
217: FAILED assert(0 == "put on missing extent (nothing before)")

  ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous 
(stable)
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x1bc) [0x2a2c314]
  2: (bluestore_extent_ref_map_t::put(unsigned long long, unsigned int, 
std::vector<bluestore_pextent_t, 
mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> 
 >*, bool*)+0x128) [0x2893650]
  3: (BlueStore::SharedBlob::put_ref(unsigned long long, unsigned int, 
std::vector<bluestore_pextent_t, 
mempool::pool_allocator<(mempool::pool_index_t)4, bluestore_pextent_t> 
 >*, std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, std::allocator<BlueStore::SharedBlob*> >*)+0xb8) [0x2791bdc]
  4: (BlueStore::_wctx_finish(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, 
std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, 
std::allocator<BlueStore::SharedBlob*> >*)+0x5c8) [0x27f3254]
  5: (BlueStore::_do_truncate(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>, unsigned long long, 
std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, 
std::allocator<BlueStore::SharedBlob*> >*)+0x360) [0x27f7834]
  6: (BlueStore::_do_remove(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>)+0xb4) [0x27f81b4]
  7: (BlueStore::_remove(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>&)+0x1dc) [0x27f9638]
  8: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, 
ObjectStore::Transaction*)+0xe7c) [0x27e855c]
  9: (BlueStore::queue_transactions(ObjectStore::Sequencer*, 
std::vector<ObjectStore::Transaction, 
std::allocator<ObjectStore::Transaction> >&, 
boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x67c) 
[0x27e6f80]
  10: (ObjectStore::queue_transactions(ObjectStore::Sequencer*, 
std::vector<ObjectStore::Transaction, 
std::allocator<ObjectStore::Transaction> >&, Context*, Context*, 
Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x118) 
[0x1f9ce48]
  11: 
(PrimaryLogPG::queue_transactions(std::vector<ObjectStore::Transaction, 
std::allocator<ObjectStore::Transaction> >&, 
boost::intrusive_ptr<OpRequest>)+0x9c) [0x22dd754]
  12: (ReplicatedBackend::submit_transaction(hobject_t const&, 
object_stat_sum_t const&, eversion_t const&, 
std::unique_ptr<PGTransaction, std::default_delete<PGTransaction> >&&, 
eversion_t const&, eversion_t const&, std::vector<pg_log_entry_t, 
std::allocator<pg_log_entry_t> > const&, 
boost::optional<pg_hit_set_history_t>&, Context*, Context*, Context*, 
unsigned long long, osd_reqid_t, 
boost::intrusive_ptr<OpRequest>)+0x6f4) [0x25c0568]
  13: (PrimaryLogPG::issue_repop(PrimaryLogPG::RepGather*, 
PrimaryLogPG::OpContext*)+0x7f4) [0x228ac98]
  14: 
(PrimaryLogPG::simple_opc_submit(std::unique_ptr<PrimaryLogPG::OpContext, 
std::default_delete<PrimaryLogPG::OpContext> >)+0x1b8) [0x228bc54]
  15: (PrimaryLogPG::AwaitAsyncWork::react(PrimaryLogPG::DoSnapWork 
const&)+0x1970) [0x22c5d4c]
  16: (boost::statechart::detail::reaction_result 
boost::statechart::custom_reaction<PrimaryLogPG::DoSnapWork>::react<PrimaryLogPG::AwaitAsyncWork, 
boost::statechart::event_base, void 
const*>(PrimaryLogPG::AwaitAsyncWork&, boost::statechart::event_base 
const&, void const* const&)+0x58) [0x23b245c]
  17: (boost::statechart::detail::reaction_result 
boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, 
PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na>, 
(boost::statechart::history_mode)0>::local_react_impl_non_empty::local_react_impl<boost::mpl::list<boost::statechart::custom_reaction<PrimaryLogPG::DoSnapWork>, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na,
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, 
boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, 
PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na>, 
(boost::statechart::history_mode)0> 
 >(boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na
, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na>, (boost::statechart::history_mode)0>&, 
boost::statechart::event_base const&, void const*)+0x30) [0x23b0f04]
  18: (boost::statechart::detail::reaction_result 
boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, 
PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na>, 
(boost::statechart::history_mode)0>::local_react<boost::mpl::list<boost::statechart::custom_reaction<PrimaryLogPG::DoSnapWork>, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na> >(boost::statechart::event_base 
const&, void const*)+0x28) [0x23af7cc]
  19: (boost::statechart::simple_state<PrimaryLogPG::AwaitAsyncWork, 
PrimaryLogPG::Trimming, boost::mpl::list<mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, mpl_::na, 
mpl_::na, mpl_::na, mpl_::na>, 
(boost::statechart::history_mode)0>::react_impl(boost::statechart::event_base 
const&, void const*)+0x28) [0x23ad744]
  20: 
(boost::statechart::detail::send_function<boost::statechart::detail::state_base<std::allocator<void>, 
boost::statechart::detail::rtti_policy>, boost::statechart::event_base, 
void const*>::operator()()+0x40) [0x21b6000]
  21: (boost::statechart::detail::reaction_result 
boost::statechart::null_exception_translator::operator()<boost::statechart::detail::send_function<boost::statechart::detail::state_base<std::allocator<void>, 
boost::statechart::detail::rtti_policy>, boost::statechart::event_base, 
void const*>, 
boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, PrimaryLogPG::NotTrimming, 
std::allocator<void>, 
boost::statechart::null_exception_translator>::exception_event_handler>(boost::statechart::detail::send_function<boost::statechart::detail::state_base<std::allocator<void>, 
boost::statechart::detail::rtti_policy>, 
boost::statechart::event_base, void const*>, 
boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, 
PrimaryLogPG::NotTrimming, std::allocator<void>, 
boost::statechart::null_exception_translator>::exception_event_handler)+0x24) 
[0x233c4a4]
  22: (boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, 
PrimaryLogPG::NotTrimming, std::allocator<void>, 
boost::statechart::null_exception_translator>::send_event(boost::statechart::event_base 
const&)+0x13c) [0x2316328]
  23: (boost::statechart::state_machine<PrimaryLogPG::SnapTrimmer, 
PrimaryLogPG::NotTrimming, std::allocator<void>, 
boost::statechart::null_exception_translator>::process_event(boost::statechart::event_base 
const&)+0x38) [0x22f6404]
  24: (PrimaryLogPG::snap_trimmer(unsigned int)+0x1a8) [0x2258bb0]
  25: (PGQueueable::RunVis::operator()(PGSnapTrim const&)+0x40) 
[0x2520cb4]
  26: (boost::disable_if_c<(false)&&boost::is_same<PGSnapTrim&, 
PGSnapTrim&>::value, void>::type 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::internal_visit<PGSnapTrim&>(PGSnapTrim&, int)+0x2c) [0x209f6bc]
  27: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type 
boost::detail::variant::visitation_impl_invoke_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>, void*, PGSnapTrim>(int, 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>&, 
void*, PGSnapTrim*, mpl_::bool_<true>)+0x38) [0x2095a74]
  28: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type 
boost::detail::variant::visitation_impl_invoke<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>, void*, PGSnapTrim, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>::has_fallback_type_>(int, 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>&, void*, PGSnapTrim*, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>::has_fallback_type_, int)+0x4c) [0x2086f30]
  29: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type 
boost::detail::variant::visitation_impl<mpl_::int_<0>, 
boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<4l>, 
boost::intrusive_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<3l>, 
PGSnapTrim, boost::mpl::l_item<mpl_::long_<2l>, PGScrub, 
boost::mpl::l_item<mpl_::long_<1l>, PGRecovery, boost::mpl::l_end> > > > 
 >, boost::mpl::l_iter<boost::mpl::l_end> >, boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>, void*, boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>::has_fallback_type_>(int, int, 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>&, 
void*, mpl_::bool_<false>, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>::has_fallback_type_, mpl_::int_<0>*, 
boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<4l>, 
boost::intrusive_ptr<OpRequest>, 
boost::mpl::l_item<mpl_::long_<3l>, PGSnapTrim, 
boost::mpl::l_item<mpl_::long_<2l>, PGScrub, 
boost::mpl::l_item<mpl_::long_<1l>, PGRecovery, boost::mpl::l_end> > > > 
 >, boost::mpl::l_iter<boost::mpl::l_end> >*)+0xf0) [0x20710fc]
  30: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type boost::variant<boost::intrusive_ptr<OpRequest>, 
PGSnapTrim, PGScrub, 
PGRecovery>::internal_apply_visitor_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>, void*>(int, int, 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>&, void*)+0x60) [0x20513e8]
  31: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type boost::variant<boost::intrusive_ptr<OpRequest>, 
PGSnapTrim, PGScrub, 
PGRecovery>::internal_apply_visitor<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false> >(boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>&)+0x4c) [0x20254c4]
  32: (PGQueueable::RunVis::result_type 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>::apply_visitor<PGQueueable::RunVis>(PGQueueable::RunVis&) 
&+0x4c) [0x1ff7798]
  33: (PGQueueable::RunVis::result_type 
boost::apply_visitor<PGQueueable::RunVis, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>&>(PGQueueable::RunVis&, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>&)+0x2c) [0x1fcbb70]
  34: (PGQueueable::run(OSD*, boost::intrusive_ptr<PG>&, 
ThreadPool::TPHandle&)+0x5c) [0x1fa76c8]
  35: (OSD::ShardedOpWQ::_process(unsigned int, 
ceph::heartbeat_handle_d*)+0x23e4) [0x1f7f7bc]
  36: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x508) 
[0x2a30f2c]
  37: (ShardedThreadPool::WorkThreadSharded::entry()+0x2c) [0x2a32a68]
  38: (Thread::entry_wrapper()+0xf4) [0x2c39e34]
  39: (Thread::_entry_func(void*)+0x18) [0x2c39d28]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

ERROR 2 (repeats each time the OSD tries to restart)

2018-06-25 06:47:54.618917 a1cee700 -1 
bluestore(/var/lib/ceph/osd/ceph-8).collection(4.7d_head 0xad81ce0) 
load_shared_blob sbid 0x52044 not found at key 0x0000000000052044
2018-06-25 06:47:54.689880 a1cee700 -1 
/var/tmp/portage/sys-cluster/ceph-12.2.5/work/ceph-12.2.5/src/os/bluestore/BlueStore.cc: 
In function 'void 
BlueStore::Collection::load_shared_blob(BlueStore::SharedBlobRef)' 
thread a1cee700 time 2018-06-25 06:47:54.619013
/var/tmp/portage/sys-cluster/ceph-12.2.5/work/ceph-12.2.5/src/os/bluestore/BlueStore.cc: 
3158: FAILED assert(0 == "uh oh, missing shared_blob")

  ceph version 12.2.5 (cad919881333ac92274171586c827e01f554a70a) luminous 
(stable)
  1: (ceph::__ceph_assert_fail(char const*, char const*, int, char 
const*)+0x1bc) [0x2a20314]
  2: 
(BlueStore::Collection::load_shared_blob(boost::intrusive_ptr<BlueStore::SharedBlob>)+0x32c) 
[0x27937a8]
  3: (BlueStore::_wctx_finish(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>, BlueStore::WriteContext*, 
std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, 
std::allocator<BlueStore::SharedBlob*> >*)+0x4e0) [0x27e716c]
  4: (BlueStore::_do_truncate(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>, unsigned long long, 
std::set<BlueStore::SharedBlob*, std::less<BlueStore::SharedBlob*>, 
std::allocator<BlueStore::SharedBlob*> >*)+0x360) [0x27eb834]
  5: (BlueStore::_do_remove(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>)+0xb4) [0x27ec1b4]
  6: (BlueStore::_remove(BlueStore::TransContext*, 
boost::intrusive_ptr<BlueStore::Collection>&, 
boost::intrusive_ptr<BlueStore::Onode>&)+0x1dc) [0x27ed638]
  7: (BlueStore::_txc_add_transaction(BlueStore::TransContext*, 
ObjectStore::Transaction*)+0xe7c) [0x27dc55c]
  8: (BlueStore::queue_transactions(ObjectStore::Sequencer*, 
std::vector<ObjectStore::Transaction, 
std::allocator<ObjectStore::Transaction> >&, 
boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x67c) 
[0x27daf80]
  9: (ObjectStore::queue_transactions(ObjectStore::Sequencer*, 
std::vector<ObjectStore::Transaction, 
std::allocator<ObjectStore::Transaction> >&, Context*, Context*, 
Context*, boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0x118) 
[0x1f90e48]
  10: (ObjectStore::queue_transaction(ObjectStore::Sequencer*, 
ObjectStore::Transaction&&, Context*, Context*, Context*, 
boost::intrusive_ptr<TrackedOp>, ThreadPool::TPHandle*)+0xc0) 
[0x1f90cbc]
  11: (PrimaryLogPG::remove_missing_object(hobject_t const&, eversion_t, 
Context*)+0x3b0) [0x228adbc]
  12: (PrimaryLogPG::recover_missing(hobject_t const&, eversion_t, int, 
PGBackend::RecoveryHandle*)+0x348) [0x22899b4]
  13: (PrimaryLogPG::recover_primary(unsigned long long, 
ThreadPool::TPHandle&)+0x1488) [0x2296ba0]
  14: (PrimaryLogPG::start_recovery_ops(unsigned long long, 
ThreadPool::TPHandle&, unsigned long long*)+0x2b8) [0x2294054]
  15: (OSD::do_recovery(PG*, unsigned int, unsigned long long, 
ThreadPool::TPHandle&)+0x574) [0x1f6abe4]
  16: (PGQueueable::RunVis::operator()(PGRecovery const&)+0x60) 
[0x2514d68]
  17: (boost::disable_if_c<(false)&&boost::is_same<PGRecovery&, 
PGRecovery&>::value, void>::type 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::internal_visit<PGRecovery&>(PGRecovery&, int)+0x2c) [0x2093724]
  18: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type 
boost::detail::variant::visitation_impl_invoke_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>, void*, PGRecovery>(int, 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>&, 
void*, PGRecovery*, mpl_::bool_<true>)+0x38) [0x2089af4]
  19: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type 
boost::detail::variant::visitation_impl_invoke<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>, void*, PGRecovery, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>::has_fallback_type_>(int, 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>&, 
void*, PGRecovery*, boost::variant<boost::intrusive_ptr<OpRequest>, 
PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_, int)+0x4c) 
[0x207b020]
  20: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type 
boost::detail::variant::visitation_impl<mpl_::int_<0>, 
boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<4l>, 
boost::intrusive_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<3l>, 
PGSnapTrim, boost::mpl::l_item<mpl_::long_<2l>, PGScrub, 
boost::mpl::l_item<mpl_::long_<1l>, PGRecovery, boost::mpl::l_end> > > > 
 >, boost::mpl::l_iter<boost::mpl::l_end> >, boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>, void*, boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_>(int, int, boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>&, void*, mpl_::bool_<false>, boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, PGRecovery>::has_fallback_type_, mpl_::int_<0>*, boost::detail::variant::visitation_impl_step<boost::mpl::l_iter<boost::mpl::l_item<mpl_::long_<4l>, boost::intrusive_ptr<OpRequest>, boost::mpl::l_item<mpl_::long_<3l>, PGSnapTrim, boost::mpl::l_item<mpl_::long_<2l>, PGScrub, boost::mpl::l_item<mpl_::long_<1l>, PGRecovery, boost::mpl::l_end> > > > >, boost::mpl::l_iter<boost::mpl::l_end> >*)+0x140) [0x206514c]
  21: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type boost::variant<boost::intrusive_ptr<OpRequest>, 
PGSnapTrim, PGScrub, 
PGRecovery>::internal_apply_visitor_impl<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>, void*>(int, int, 
boost::detail::variant::invoke_visitor<PGQueueable::RunVis, false>&, 
void*)+0x60) [0x20453e8]
  22: (boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>::result_type boost::variant<boost::intrusive_ptr<OpRequest>, 
PGSnapTrim, PGScrub, 
PGRecovery>::internal_apply_visitor<boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false> >(boost::detail::variant::invoke_visitor<PGQueueable::RunVis, 
false>&)+0x4c) [0x20194c4]
  23: (PGQueueable::RunVis::result_type 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>::apply_visitor<PGQueueable::RunVis>(PGQueueable::RunVis&) 
&+0x4c) [0x1feb798]
  24: (PGQueueable::RunVis::result_type 
boost::apply_visitor<PGQueueable::RunVis, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>&>(PGQueueable::RunVis&, 
boost::variant<boost::intrusive_ptr<OpRequest>, PGSnapTrim, PGScrub, 
PGRecovery>&)+0x2c) [0x1fbfb70]
  25: (PGQueueable::run(OSD*, boost::intrusive_ptr<PG>&, 
ThreadPool::TPHandle&)+0x5c) [0x1f9b6c8]
  26: (OSD::ShardedOpWQ::_process(unsigned int, 
ceph::heartbeat_handle_d*)+0x23e4) [0x1f737bc]
  27: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x508) 
[0x2a24f2c]
  28: (ShardedThreadPool::WorkThreadSharded::entry()+0x2c) [0x2a26a68]
  29: (Thread::entry_wrapper()+0xf4) [0x2c2de34]
  30: (Thread::_entry_func(void*)+0x18) [0x2c2dd28]
  NOTE: a copy of the executable, or `objdump -rdS <executable>` is 
needed to interpret this.

Related issues

Related to bluestore - Bug #24211: SharedBlob::put() racy (Resolved, 05/21/2018)

History

#1 Updated by Radoslaw Zarzynski 6 months ago

  • Status changed from New to In Progress

#2 Updated by Radoslaw Zarzynski 6 months ago

Hmm, some time ago there was an issue with racy `SharedBlob::put()`. Looks worth checking.

#3 Updated by Radoslaw Zarzynski 5 months ago

These crashes could be explained by a combination of:
  • the race in SharedBlob::put() that was fixed in v12.2.6,
  • memory reuse.

Please consider the following scenario:

A: SharedBlob::put(foo) -> foo.nref := 0

B: SharedBlobSet::lookup(foo)
B: SharedBlob::get(foo) -> foo.nref := 1
B: SharedBlob::put(foo) -> foo.nref := 0

A: SharedBlobSet::try_remove(foo) -> returns true
A: delete foo

C: BlueStore::Collection::open_shared_blob(bar)
C: SharedBlobSet::lookup(bar) -> returns nullptr
C: new bar // oops, &bar == &foo as the memory has been reused
C: SharedBlob::get(bar) -> bar.nref := 1

B: SharedBlobSet::try_remove(foo) -> returns true
B: delete bar // &bar == &foo

C: uses the deleted Blob, with its ref_map probably cleared
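
The interleaving above can be replayed deterministically in a small toy model. This is only a sketch under stated assumptions, not BlueStore code: `ToyBlob`, `ToyHeap`, `ToySet` and `replay()` are hypothetical names, "addresses" are plain integers handed out by an allocator that reuses the most recently freed slot, and the toy `try_remove()` does not re-check nref. That last simplification exists only so the replay follows the step order listed above; it is not a claim about how the real SharedBlobSet behaves.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Toy model only; none of these types are BlueStore's.
struct ToyBlob {
  uint64_t sbid;
  int nref;
};

// "Addresses" are small integers. A freed address is handed out again by the
// next allocation, which models the allocator reusing memory.
struct ToyHeap {
  std::map<int, ToyBlob> live;   // address -> object
  std::vector<int> free_list;
  int next_addr = 1;

  int alloc(uint64_t sbid) {
    int addr;
    if (free_list.empty()) {
      addr = next_addr++;
    } else {
      addr = free_list.back();
      free_list.pop_back();
    }
    live[addr] = ToyBlob{sbid, 0};
    return addr;
  }
  void release(int addr) {
    live.erase(addr);
    free_list.push_back(addr);
  }
  bool is_live(int addr) const { return live.count(addr) != 0; }
};

struct ToySet {
  ToyHeap& heap;
  std::map<uint64_t, int> by_sbid;   // sbid -> address

  int lookup(uint64_t sbid) const {  // 0 stands in for nullptr
    auto it = by_sbid.find(sbid);
    return it == by_sbid.end() ? 0 : it->second;
  }
  // Assumption: no nref re-check; erases the entry of whatever object
  // currently sits at addr and reports success.
  bool try_remove(int addr) {
    by_sbid.erase(heap.live.at(addr).sbid);
    return true;
  }
};

struct ReplayResult {
  bool address_reused;     // &bar == &foo
  bool bar_freed_by_b;     // B's stale delete destroyed bar
  int c_refs_taken;        // references C believes it holds on bar
  bool bar_still_indexed;  // is bar still reachable through the set?
};

ReplayResult replay() {
  ToyHeap heap;
  ToySet set{heap, {}};
  const uint64_t FOO = 1, BAR = 2;   // arbitrary toy ids

  int foo = heap.alloc(FOO);
  set.by_sbid[FOO] = foo;
  heap.live.at(foo).nref = 1;        // A holds the only reference

  bool a_saw_zero = (--heap.live.at(foo).nref == 0);    // A: put(foo)

  int b_ptr = set.lookup(FOO);                          // B: lookup(foo)
  ++heap.live.at(b_ptr).nref;                           // B: get(foo)
  bool b_saw_zero = (--heap.live.at(b_ptr).nref == 0);  // B: put(foo)

  if (a_saw_zero && set.try_remove(foo)) heap.release(foo);  // A: delete foo

  int bar = set.lookup(BAR);         // C: lookup(bar) finds nothing
  if (bar == 0) {
    bar = heap.alloc(BAR);           // C: new bar, lands on foo's old address
    set.by_sbid[BAR] = bar;
  }
  ++heap.live.at(bar).nref;          // C: get(bar)
  int c_refs = heap.live.at(bar).nref;

  // B acts on its stale pointer and removes whatever lives there now.
  if (b_saw_zero && set.try_remove(b_ptr)) heap.release(b_ptr);

  return ReplayResult{bar == foo, !heap.is_live(bar), c_refs,
                      set.lookup(BAR) != 0};
}
```

Calling `replay()` in this model reports that bar reused foo's address, that B's second delete destroyed bar while C still counted one reference on it, and that bar is no longer indexed in the set. The last point is at least consistent with the "missing shared_blob" symptom in ERROR 2, though the toy proves nothing about the real code paths.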

#4 Updated by Radoslaw Zarzynski 3 months ago

  • Related to Bug #24211: SharedBlob::put() racy added

#5 Updated by Radoslaw Zarzynski 3 months ago

  • Status changed from In Progress to Duplicate
