Bug #65696

open

osd crashes when recovering PGs that have unfound objects

Added by Xuehan Xu 21 days ago. Updated 21 days ago.

Status: New
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

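Crimson OSD does not handle unfound objects during recovery/backfill. When recovery reaches a PG that has unfound objects, ReplicatedRecoveryBackend::prepare_pull fails the assertion `m.contains(soid)` and the OSD aborts, as in the log below: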

INFO  2024-04-29 12:12:09,204 [shard 0:main] osd - start_primary_recovery_ops 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:4f item.need 211'429  (missing)
INFO  2024-04-29 12:12:09,204 [shard 0:main] osd - recover_missing 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:4f v 211'429
INFO  2024-04-29 12:12:09,204 [shard 0:main] osd - recover_missing 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:4f v 211'429, new recovery
DEBUG 2024-04-29 12:12:09,204 [shard 0:main] osd - recover_object: 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:4f, 211'429
DEBUG 2024-04-29 12:12:09,204 [shard 0:main] osd - maybe_pull_missing_obj: 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:4f, 211'429
DEBUG 2024-04-29 12:12:09,204 [shard 0:main] osd -  pg_epoch 228 pg[3.5( v 228'438 lc 208'428 (0'0,228'438] local-lis/les=227/228 n=77 ec=15/15 lis/c=227/198 les/c/f=228/199/0 sis=227) [1,3] r=0 lpr=227 pi=[198,227)/2 luod=228'439 lua=225'432 crt=228'439 mlcod 203'422 active+recovering+undersized+degraded  ObjectContextLoader::with_head_obc: object 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:head
DEBUG 2024-04-29 12:12:09,204 [shard 0:main] osd -  pg_epoch 228 pg[3.5( v 228'438 lc 208'428 (0'0,228'438] local-lis/les=227/228 n=77 ec=15/15 lis/c=227/198 les/c/f=228/199/0 sis=227) [1,3] r=0 lpr=227 pi=[198,227)/2 luod=228'439 lua=225'432 crt=228'439 mlcod 203'422 active+recovering+undersized+degraded  ObjectContextLoader::get_or_load_obc: cache hit on 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:head
DEBUG 2024-04-29 12:12:09,204 [shard 0:main] osd - prepare_pull: 3:ac11be1a:::scephqa03.cpp.bjat.qianxin-inc.cn584690-125:4f, 211'429
ceph-osd: /home/xuxuehan/nvme/rpmbuild/BUILD/ceph-19.0.0-3239-gc600a248ed3/src/crimson/osd/replicated_recovery_backend.cc:434: void ReplicatedRecoveryBackend::prepare_pull(const crimson::osd::ObjectContextRef&, PullOp&, RecoveryBackend::pull_info_t&, const hobject_t&, eversion_t): Assertion `m.contains(soid)' failed.
Aborting on shard 0.
Backtrace:
Reactor stalled for 208 ms on shard 0. Backtrace: 0x2f0bb8d 0x2ec04ad 0x2ec07f8 0x2ec0987 0x12cdf 0x169301 0x1d0c810 0x1d0da2c 0x1d0e747 0x1d0b4e6 0x1d0bec0 0x1d0c378 0x12cdf 0x4ea4e 0x21db4 0x21c88 0x473a5 0x1831d93 0x183273f 0x1832ba9 0x165782d 0x1657a53 0x1657c6e 0x165f163 0x165f5b0 0x165f7be 0x165fd23 0x16601f8 0x1667b63 0x180eb20 0x1810a46 0x17c9737 0x17caeb7 0x17cd5e8 0x1744eaa 0x174f55a 0x174f92d 0x17545df 0x1756ac6 0x2eb8357 0x2eb8773 0x2ef9da5 0x2efaaec 0x2e48c5c 0x2e494e4 0x13771ea 0x3aca2 0x13cc00d
kernel callstack:
 0# gsignal in /lib64/libc.so.6
 1# abort in /lib64/libc.so.6
 2# 0x00002B5097C69C89 in /lib64/libc.so.6
 3# 0x00002B5097C8F3A6 in /lib64/libc.so.6
 4# ReplicatedRecoveryBackend::prepare_pull(boost::intrusive_ptr<crimson::osd::ObjectContext> const&, PullOp&, RecoveryBackend::pull_info_t&, hobject_t const&, eversion_t) in ceph-osd
 5# 0x0000000001832740 in ceph-osd
 6# 0x0000000001832BAA in ceph-osd
 7# crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > seastar::futurize<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > >::invoke<crimson::osd::ObjectContextLoader::with_head_obc<(RWState::State)1>(boost::intrusive_ptr<crimson::osd::ObjectContext>, bool, std::function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > (boost::intrusive_ptr<crimson::osd::ObjectContext>, boost::intrusive_ptr<crimson::osd::ObjectContext>)>&&)::{lambda()#1}::operator()() const::{lambda(auto:1)#1}, boost::intrusive_ptr<crimson::osd::ObjectContext> >({lambda()#1}&&, boost::intrusive_ptr<crimson::osd::ObjectContext>&&) in ceph-osd
 8# auto crimson::interruptible::internal::call_with_interruption_impl<crimson::osd::IOInterruptCondition, crimson::osd::ObjectContextLoader::with_head_obc<(RWState::State)1>(boost::intrusive_ptr<crimson::osd::ObjectContext>, bool, std::function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > (boost::intrusive_ptr<crimson::osd::ObjectContext>, boost::intrusive_ptr<crimson::osd::ObjectContext>)>&&)::{lambda()#1}::operator()() const::{lambda(auto:1)#1}, boost::intrusive_ptr<crimson::osd::ObjectContext> >(seastar::lw_shared_ptr<{lambda()#1}>, crimson::osd::ObjectContextLoader::with_head_obc<(RWState::State)1>(boost::intrusive_ptr<crimson::osd::ObjectContext>, bool, std::function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > (boost::intrusive_ptr<crimson::osd::ObjectContext>, boost::intrusive_ptr<crimson::osd::ObjectContext>)>&&)::{lambda()#1}::operator()() const::{lambda(auto:1)#1}&&, boost::intrusive_ptr<crimson::osd::ObjectContext>&&) in ceph-osd
 9# auto crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<crimson::osd::ObjectContext> > >::safe_then<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<boost::intrusive_ptr<crimson::osd::ObjectContext> > > >::safe_then_interruptible<true, crimson::osd::ObjectContextLoader::with_head_obc<(RWState::State)1>(boost::intrusive_ptr<crimson::osd::ObjectContext>, bool, std::function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > (boost::intrusive_ptr<crimson::osd::ObjectContext>, boost::intrusive_ptr<crimson::osd::ObjectContext>)>&&)::{lambda()#1}::operator()() const::{lambda(auto:1)#1}, boost::intrusive_ptr<crimson::osd::ObjectContext>, 0>(crimson::osd::ObjectContextLoader::with_head_obc<(RWState::State)1>(boost::intrusive_ptr<crimson::osd::ObjectContext>, bool, std::function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > (boost::intrusive_ptr<crimson::osd::ObjectContext>, boost::intrusive_ptr<crimson::osd::ObjectContext>)>&&)::{lambda()#1}::operator()() const::{lambda(auto:1)#1}&&)::{lambda(boost::intrusive_ptr<crimson::osd::ObjectContext>&&)#1}, 
crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::pass_further>({lambda()#1}&&, crimson::osd::ObjectContextLoader::with_head_obc<(RWState::State)1>(boost::intrusive_ptr<crimson::osd::ObjectContext>, bool, std::function<crimson::interruptible::interruptible_future_detail<crimson::osd::IOInterruptCondition, crimson::errorator<crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<2> >, crimson::unthrowable_wrapper<std::error_code const&, crimson::ec<84> > >::_future<crimson::errorated_future_marker<void> > > (boost::intrusive_ptr<crimson::osd::ObjectContext>, boost::intrusive_ptr<crimson::osd::ObjectContext>)>&&)::{lambda()#1}::operator()() const::{lambda(auto:1)#1})::{lambda({lambda(boost::intrusive_ptr<crimson::osd::ObjectContext>&&)#1})#1}::operator()<seastar::future<boost::intrusive_ptr<crimson::osd::ObjectContext> > >({lambda(boost::intrusive_ptr<crimson::osd::ObjectContext>&&)#1}) in ceph-osd
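
For illustration only, here is a minimal sketch of how a pull-source lookup could report an unfound object to its caller instead of asserting. Everything in it is an assumption made for the example: `ObjectId`, `ShardId`, `MissingLoc` and `pick_pull_source` are hypothetical stand-ins, not Crimson's actual types or functions, and this is not necessarily the approach taken in the pull request attached to this issue.

```cpp
// Hypothetical, simplified model of the lookup that precedes a pull.
// None of these names exist in Ceph; they only illustrate the idea.
#include <map>
#include <optional>
#include <set>
#include <string>

using ObjectId = std::string;  // stand-in for hobject_t
using ShardId = int;           // stand-in for pg_shard_t

// Maps each missing object to the shards believed to hold a usable copy.
using MissingLoc = std::map<ObjectId, std::set<ShardId>>;

// Returns a shard to pull from, or std::nullopt when the object is unfound
// (no entry at all, or an entry with no source shards). The caller decides
// what to do with an unfound object; nothing is asserted here.
std::optional<ShardId> pick_pull_source(const MissingLoc& m,
                                        const ObjectId& soid) {
  const auto it = m.find(soid);
  if (it == m.end() || it->second.empty()) {
    return std::nullopt;  // unfound: no known location to pull from
  }
  return *it->second.begin();  // lowest-numbered shard with a copy
}
```

A caller in this model would check the returned optional and skip or defer the object when it is empty, rather than proceeding to build a pull operation for it.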
Actions #1

Updated by Xuehan Xu 21 days ago

  • Pull request ID set to 57147