Bug #55662

closed

EC: Clay assert fail ../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset(after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))

Added by Nitzan Mordechai almost 2 years ago. Updated almost 2 years ago.

Status: Rejected
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset(after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))

 ceph version 17.0.0-12051-g74e57a6ab2e (74e57a6ab2e0ed6bf0e8752f482d4cc820548083) quincy (dev)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x162) [0x5598138b4d2e]
 2: ceph-osd(+0x5d0f37) [0x5598138b4f37]
 3: (ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)+0x1d5c) [0x559813d8e0fc]
 4: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::v15_2_0::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::v15_2_0::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, std::optional<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >, RecoveryMessages*)+0xb01) [0x559813d8ee11]
 5: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x70) [0x559813db2da0]
 6: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8b) [0x559813d817fb]
 7: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*, ZTracer::Trace const&)+0x130f) [0x559813d9ad5f]
 8: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x19b) [0x559813d9b46b]
 9: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x4b) [0x559813b8286b]
 10: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5c2) [0x559813b23f72]
 11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34b) [0x5598139ba3ab]
 12: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x559813c709e7]
 13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xf88) [0x5598139d6b88]
 14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5aa) [0x55981408d73a]
 15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x559814090440]
 16: /lib64/libpthread.so.0(+0x817a) [0x7fe6cf33317a]
 17: clone()
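
For readers unfamiliar with the failed check, here is a minimal stand-alone sketch of the invariant the assertion expresses. It is not the Ceph code: `StripeInfo` and `push_length_matches` are hypothetical names, and the sketch assumes that `aligned_logical_offset_to_chunk_offset` scales a stripe-aligned logical length by `chunk_size / stripe_width`.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical, simplified stand-in for the stripe geometry of an EC pool.
// It only models the arithmetic named in the assertion message.
struct StripeInfo {
  uint64_t stripe_width;  // logical bytes covered by one full stripe
  uint64_t chunk_size;    // bytes a single shard stores per stripe

  // Assumed behaviour: map a stripe-aligned logical length to the number
  // of bytes one shard holds for that range.
  uint64_t aligned_logical_offset_to_chunk_offset(uint64_t offset) const {
    assert(offset % stripe_width == 0);
    return (offset / stripe_width) * chunk_size;
  }
};

// True when a push payload of pushed_len bytes matches what one shard should
// receive for logical recovery progress moving from `before` to `after`.
inline bool push_length_matches(const StripeInfo& s, uint64_t before,
                                uint64_t after, uint64_t pushed_len) {
  return pushed_len == s.aligned_logical_offset_to_chunk_offset(after - before);
}
```

With a stripe width of 8192 and a chunk size of 4096, recovering 16384 logical bytes should push 8192 bytes per shard; any shorter or longer buffer would violate the check.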

Actions #1

Updated by Neha Ojha almost 2 years ago

Can you please add the test that helped you discover this issue? I believe the same test was passing with other EC plugins, so this seems to be a bug seen only with the clay plugin.

Actions #2

Updated by Nitzan Mordechai almost 2 years ago

I used /qa/standalone/erasure-code/test-erasure-eio.sh. The test that fails is TEST_ec_object_attr_read_error when I switch the plugin to Clay; with the other plugins, all the tests pass.

Actions #3

Updated by Nitzan Mordechai almost 2 years ago

The test needs osd_read_ec_check_for_errors to be set to true. When it is set, the EIO error is ignored and we can get the sub-chunks correctly.
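
As an illustration only, the option named above could be enabled with a ceph.conf fragment like the one below. Placing it in the `[osd]` section is an assumption; the standalone test script may pass the option in a different way.

```ini
# Assumed placement: applies the option to all OSDs.
[osd]
osd_read_ec_check_for_errors = true
```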

Actions #4

Updated by Nitzan Mordechai almost 2 years ago

  • Status changed from In Progress to Rejected