Bug #51627
closed
FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soid) || (it_objects != recovery_state.get_pg_log().get_log().objects.end() && it_objects->second->op == pg_log_entry_t::LOST_REVERT))
Description
Spotted again:
2021-07-11T02:43:55.694+0000 7ffa80f0e700 -1 /home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-5874-g16eb42a1/rpm/el8/BUILD/ceph-17.0.0-5874-g16eb42a1/src/osd/PrimaryLogPG.cc: In function 'ObjectContextRef PrimaryLogPG::get_object_context(const hobject_t&, bool, const std::map<std::__cxx11::basic_string<char>, ceph::buffer::v15_2_0::list, std::less<void> >*)' thread 7ffa80f0e700 time 2021-07-11T02:43:55.690510+0000
/home/jenkins-build/build/workspace/ceph-dev-new-build/ARCH/x86_64/AVAILABLE_ARCH/x86_64/AVAILABLE_DIST/centos8/DIST/centos8/MACHINE_SIZE/gigantic/release/17.0.0-5874-g16eb42a1/rpm/el8/BUILD/ceph-17.0.0-5874-g16eb42a1/src/osd/PrimaryLogPG.cc: 11784: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soid) || (it_objects != recovery_state.get_pg_log().get_log().objects.end() && it_objects->second->op == pg_log_entry_t::LOST_REVERT))
 ceph version 17.0.0-5874-g16eb42a1 (16eb42a1d8cef5cf008b04b27d51e13dbd6ec495) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x152) [0x55f1f3750606]
 2: ceph-osd(+0x5bf827) [0x55f1f3750827]
 3: (PrimaryLogPG::get_object_context(hobject_t const&, bool, std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > const*)+0x22f) [0x55f1f39670df]
 4: (PrimaryLogPG::get_adjacent_clones(std::shared_ptr<ObjectContext>, std::shared_ptr<ObjectContext>&, std::shared_ptr<ObjectContext>&)+0xc5) [0x55f1f3968845]
 5: (PrimaryLogPG::inc_refcount_by_set(PrimaryLogPG::OpContext*, object_manifest_t&, OSDOp&)+0xd3) [0x55f1f396c4d3]
 6: (PrimaryLogPG::do_osd_ops(PrimaryLogPG::OpContext*, std::vector<OSDOp, std::allocator<OSDOp> >&)+0xe634) [0x55f1f39b4234]
 7: (PrimaryLogPG::prepare_transaction(PrimaryLogPG::OpContext*)+0x177) [0x55f1f39babd7]
 8: (PrimaryLogPG::execute_ctx(PrimaryLogPG::OpContext*)+0x31d) [0x55f1f39bccbd]
 9: (PrimaryLogPG::do_op(boost::intrusive_ptr<OpRequest>&)+0x2dbb) [0x55f1f39c674b]
 10: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0xd1c) [0x55f1f39cd93c]
 11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x309) [0x55f1f3856c99]
 12: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x68) [0x55f1f3ab9a18]
 13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xc28) [0x55f1f3873788]
 14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5c4) [0x55f1f3f105c4]
 15: (ShardedThreadPool::WorkThreadSharded::entry()+0x14) [0x55f1f3f11964]
 16: (Thread::_entry_func(void*)+0xd) [0x55f1f3ef768d]
 17: /lib64/libpthread.so.0(+0x814a) [0x7ffaa6d8a14a]
 18: clone()
/a/ksirivad-2021-07-11_01:45:00-rados-wip-pg-autoscaler-overlap-distro-basic-smithi
The branch being tested was based on 0509deb6a895a98e3e582cbb849606bc559b963c and included a fix in the mgr module. See https://github.com/ceph/ceph/pull/42036
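For readers unfamiliar with the asserted condition, here is a minimal sketch of its boolean shape. It is not the Ceph implementation: `PgLogModel`, `LogEntry`, `LogOp`, and `context_load_allowed` are hypothetical stand-ins invented for illustration. The only parts taken from the report are the three disjuncts of the `ceph_assert` and the fact that, in this trace, it fired while `get_adjacent_clones` was calling `get_object_context`.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical stand-ins for illustration only; not the real Ceph types.
enum class LogOp { MODIFY, LOST_REVERT };

struct LogEntry {
  LogOp op;
};

struct PgLogModel {
  std::set<std::string> missing;            // objects currently marked missing
  std::map<std::string, LogEntry> objects;  // latest log entry per object
};

// Mirrors the three disjuncts of the failed assertion: the lookup is
// acceptable if attrs were passed in, or the object is not missing, or
// the object's latest log entry is a LOST_REVERT.
bool context_load_allowed(const PgLogModel& log, const std::string& soid,
                          bool have_attrs) {
  if (have_attrs) {
    return true;
  }
  if (log.missing.count(soid) == 0) {
    return true;
  }
  auto it = log.objects.find(soid);
  return it != log.objects.end() && it->second.op == LogOp::LOST_REVERT;
}
```

Under this model, the assertion fails exactly when no attrs are supplied, the object is still marked missing, and its latest log entry (if any) is not a `LOST_REVERT`.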
Updated by Myoungwon Oh almost 3 years ago
Updated by Kefu Chai almost 3 years ago
- Status changed from New to Fix Under Review
- Assignee set to Myoungwon Oh
- Pull request ID set to 42279
Updated by Kamoltat (Junior) Sirivadhna almost 3 years ago
spotted again at ksirivad-2021-07-11_01:45:00-rados-wip-pg-autoscaler-overlap-distro-basic-smithi/6262857/
Updated by Neha Ojha over 2 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to pacific
Updated by Backport Bot over 2 years ago
- Copied to Backport #51952: pacific: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soid) || (it_objects != recovery_state.get_pg_log().get_log().objects.end() && it_objects->second->op == pg_log_entry_t::LOST_REVERT)) added
Updated by Loïc Dachary over 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
Updated by Laura Flores about 2 years ago
Happened again. Could this be a new occurrence?
/a/yuriw-2022-02-21_15:40:41-rados-wip-yuri4-testing-2022-02-18-0800-distro-default-smithi/6698453
Updated by Myoungwon Oh about 2 years ago
The error message looks similar to the earlier one, but the cause is different from the prior case.
Anyway, I posted the fix.
Updated by Neha Ojha about 2 years ago
Myoungwon Oh wrote:
The error message looks similar to the earlier one, but the cause is different from the prior case.
Anyway, I posted the fix.
Thanks for looking into it! I think we should open a different tracker issue for this new fix.
Updated by Myoungwon Oh about 2 years ago
Updated by Aishwarya Mathuria about 2 years ago
Saw the same assert failure here: /a/yuriw-2022-03-31_21:45:19-rados-wip-yuri5-testing-2022-03-31-1158-quincy-distro-default-smithi/6770156
Updated by Laura Flores 9 months ago
- Related to Bug #62167: FAILED ceph_assert(attrs || !recovery_state.get_pg_log().get_missing().is_missing(soid) || (it_objects != recovery_state.get_pg_log().get_log().objects.end() && it_objects->second->op == pg_log_entry_t::LOST_REVERT)) added