Bug #64728

closed

OSD crashes when a single SeaStore-based crimson-osd hosts a large enough number of PGs

Added by Xuehan Xu 2 months ago. Updated about 2 months ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

#0  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139941722424064) at ./nptl/pthread_kill.c:44
#1  __pthread_kill_internal (signo=6, threadid=139941722424064) at ./nptl/pthread_kill.c:78
#2  __GI___pthread_kill (threadid=139941722424064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#3  0x00007f46b937d476 in __GI_raise (sig=6) at ../sysdeps/posix/raise.c:26
#4  0x0000561b577a8c57 in reraise_fatal (signum=6) at /home/xuxuehan/src/ceph/src/crimson/common/fatal_signal.cc:41
#5  0x0000561b577a7f5b in FatalSignal::signal_entry (signum=6, info=0x7ffd901faef0) at /home/xuxuehan/src/ceph/src/crimson/common/fatal_signal.cc:62
#6  <signal handler called>
#7  __pthread_kill_implementation (no_tid=0, signo=6, threadid=139941722424064) at ./nptl/pthread_kill.c:44
#8  __pthread_kill_internal (signo=6, threadid=139941722424064) at ./nptl/pthread_kill.c:78
#9  __GI___pthread_kill (threadid=139941722424064, signo=signo@entry=6) at ./nptl/pthread_kill.c:89
#10 0x00007f46b937d476 in __GI_raise (sig=sig@entry=6) at ../sysdeps/posix/raise.c:26
#11 0x00007f46b93637f3 in __GI_abort () at ./stdlib/abort.c:79
#12 0x00007f46b936371b in __assert_fail_base (fmt=0x7f46b9518130 "%s%s%s:%u: %s%sAssertion `%s' failed.\n%n", assertion=0x561b69354540 <str> "!delta_buffer.empty()",
    file=0x561b6723d4e2 "/home/xuxuehan/src/ceph/src/crimson/os/seastore/collection_manager/collection_flat_node.h", line=156, function=<optimized out>) at ./assert/assert.c:92
#13 0x00007f46b9374e96 in __GI___assert_fail (assertion=0x561b69354540 <str> "!delta_buffer.empty()", file=0x561b6723d4e2 "/home/xuxuehan/src/ceph/src/crimson/os/seastore/collection_manager/collection_flat_node.h", line=156,
    function=0x561b69354580 <__PRETTY_FUNCTION__._ZN7crimson2os8seastore18collection_manager14CollectionNode9get_deltaEv> "virtual ceph::bufferlist crimson::os::seastore::collection_manager::CollectionNode::get_delta()") at ./assert/assert.c:101
#14 0x0000561b5cf413fb in crimson::os::seastore::collection_manager::CollectionNode::get_delta (this=0x61400008ae40) at /home/xuxuehan/src/ceph/src/crimson/os/seastore/collection_manager/collection_flat_node.h:156
#15 0x0000561b5b212e09 in crimson::os::seastore::Cache::mark_transaction_conflicted (this=0x627000003900, t=..., conflicting_extent=...) at /home/xuxuehan/src/ceph/src/crimson/os/seastore/cache.cc:872
#16 0x0000561b5b20def1 in crimson::os::seastore::Cache::invalidate_extent (this=0x627000003900, t=..., extent=...) at /home/xuxuehan/src/ceph/src/crimson/os/seastore/cache.cc:828
#17 0x0000561b5b2113fe in crimson::os::seastore::Cache::commit_replace_extent (this=0x627000003900, t=..., next=..., prev=...) at /home/xuxuehan/src/ceph/src/crimson/os/seastore/cache.cc:805
#18 0x0000561b5b2200db in crimson::os::seastore::Cache::prepare_record (this=0x627000003900, t=..., journal_head=..., journal_dirty_tail=...) at /home/xuxuehan/src/ceph/src/crimson/os/seastore/cache.cc:1150
#19 0x0000561b5afc7478 in crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25::operator()() (
    this=0x6060004fc7c0) at /home/xuxuehan/src/ceph/src/crimson/os/seastore/transaction_manager.cc:353
#20 0x0000561b5afc690f in seastar::futurize<crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > > >::invoke<crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25>(crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&) (func=...)
    at /home/xuxuehan/src/ceph/src/crimson/common/interruptible_future.h:1627
#21 0x0000561b5afc67c0 in seastar::futurize_invoke<crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25>(crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&) (func=...)
    at /home/xuxuehan/src/ceph/src/seastar/include/seastar/core/future.hh:2055
#22 0x0000561b5afc64db in crimson::interruptible::internal::call_with_interruption_impl<crimson::os::seastore::TransactionConflictCondition, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25>(seastar::lw_shared_ptr<crimson::os::seastore::TransactionConflictCondition>, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&) (interrupt_condition=..., func=...) at /home/xuxuehan/src/ceph/src/crimson/common/interruptible_future.h:207
#23 0x0000561b5afc57bf in crimson::interruptible::call_with_interruption<crimson::os::seastore::TransactionConflictCondition, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25, crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > > >(seastar::lw_shared_ptr<crimson::os::seastore::TransactionConflictCondition>, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&) (interrupt_condition=..., func=...) at /home/xuxuehan/src/ceph/src/crimson/common/interruptible_future.h:277
#24 0x0000561b5afc51e3 in crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > >::safe_then_interruptible<true, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25, void, 0>(crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&)::{lambda()#1}::operator()() (
    this=0x6060004fc7c0) at /home/xuxuehan/src/ceph/src/crimson/common/interruptible_future.h:872
#25 0x0000561b5afc4eff in seastar::futurize<crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > > >::invoke<crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > >::safe_then_interruptible<true, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25, void, 0>(crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&)::{lambda()#1}>(crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > >::safe_then_interruptible<true, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25, void, 0>(crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&)::{lambda()#1}&&) (func=...)
    at /home/xuxuehan/src/ceph/src/crimson/common/interruptible_future.h:1627
#26 0x0000561b5afc4db0 in seastar::futurize_invoke<crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > >::safe_then_interruptible<true, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25, void, 0>(crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&)::{lambda()#1}>(crimson::interruptible::interruptible_future_detail<crimson::os::seastore::TransactionConflictCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<5> > >::_future<crimson::errorated_future_marker<void> > >::safe_then_interruptible<true, crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25, void, 0>(crimson::os::seastore::TransactionManager::do_submit_transaction(crimson::os::seastore::Transaction&, crimson::os::seastore::ExtentPlacementManager::dispatch_result_t, std::optional<crimson::os::seastore::journal_seq_t>)::$_25&&)::{lambda()#1}&&) (func=...)
    at /home/xuxuehan/src/ceph/src/seastar/include/seastar/core/future.hh:2055

The scenario is as follows:
1. A new mutable collection node is created in order to insert a new collection.
2. The node overflows, so a new collection node twice as large is created and the new collection is inserted into that node instead; the delta buffer of the mutable collection node is never touched.
3. The mutable collection node still resides in the transaction's mutated_block_list, so when it is invalidated, its "get_delta()" method hits the assert "!delta_buffer.empty()".

It seems that, in this case, we should always retire the newly created mutable collection node immediately (in the same continuation).
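
The scenario can be illustrated with a small standalone model. This is only a sketch under stated assumptions, not SeaStore code: ToyNode, ToyTransaction, fresh_block_list, retire(), safe_to_invalidate() and insert_with_overflow() are hypothetical names invented for the illustration, and the real Cache/Transaction logic is considerably more involved. The model only shows why an untouched mutable node left in mutated_block_list violates the "!delta_buffer.empty()" precondition, and why retiring it in the same step avoids that.

```cpp
#include <algorithm>
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a collection node (hypothetical, not the real CollectionNode).
struct ToyNode {
  std::vector<std::string> delta_buffer;  // mutations recorded against this node
  std::vector<std::string> entries;       // collections stored in this node
};

// Toy stand-in for a transaction (hypothetical, not the real Transaction).
struct ToyTransaction {
  std::vector<std::shared_ptr<ToyNode>> mutated_block_list;  // mutable copies
  std::vector<std::shared_ptr<ToyNode>> fresh_block_list;    // newly allocated nodes

  std::shared_ptr<ToyNode> make_mutable() {
    auto n = std::make_shared<ToyNode>();
    mutated_block_list.push_back(n);
    return n;
  }

  std::shared_ptr<ToyNode> alloc_fresh() {
    auto n = std::make_shared<ToyNode>();
    fresh_block_list.push_back(n);
    return n;
  }

  // Drop a node from the mutated list so that a later invalidation
  // never asks it for a delta.
  void retire(const std::shared_ptr<ToyNode>& n) {
    auto& l = mutated_block_list;
    l.erase(std::remove(l.begin(), l.end(), n), l.end());
  }
};

// True when every node still in the mutated list has a non-empty delta,
// i.e. when the precondition asserted in get_delta() would hold for all of them.
bool safe_to_invalidate(const ToyTransaction& t) {
  return std::all_of(
      t.mutated_block_list.begin(), t.mutated_block_list.end(),
      [](const std::shared_ptr<ToyNode>& n) { return !n->delta_buffer.empty(); });
}

// Simulates an insert that overflows the mutable node.
// retire_on_overflow == false reproduces the reported scenario.
ToyTransaction insert_with_overflow(bool retire_on_overflow) {
  ToyTransaction t;
  auto mut = t.make_mutable();   // step 1: mutable node, delta still empty
  auto big = t.alloc_fresh();    // step 2: overflow, use a larger fresh node
  big->entries.push_back("new-collection");
  if (retire_on_overflow) {
    t.retire(mut);               // proposed behaviour: retire immediately
  }
  return t;                      // step 3 would invalidate what is left in t
}
```

In this model, safe_to_invalidate(insert_with_overflow(false)) is false because the untouched mutable node is still tracked with an empty delta, while safe_to_invalidate(insert_with_overflow(true)) is true because the node was retired before any invalidation could reach it.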

Actions #1

Updated by Xuehan Xu 2 months ago

  • Description updated (diff)
Actions #2

Updated by Xuehan Xu 2 months ago

  • Description updated (diff)
Actions #3

Updated by Xuehan Xu 2 months ago

  • Pull request ID set to 55981
Actions #4

Updated by Xuehan Xu about 2 months ago

  • Status changed from New to Resolved