Bug #55662
closed
EC: Clay assert fail ../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
../src/osd/ECBackend.cc: 685: FAILED ceph_assert(pop.data.length() == sinfo.aligned_logical_offset_to_chunk_offset( after_progress.data_recovered_to - op.recovery_progress.data_recovered_to))
ceph version 17.0.0-12051-g74e57a6ab2e (74e57a6ab2e0ed6bf0e8752f482d4cc820548083) quincy (dev)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x162) [0x5598138b4d2e]
2: ceph-osd(+0x5d0f37) [0x5598138b4f37]
3: (ECBackend::continue_recovery_op(ECBackend::RecoveryOp&, RecoveryMessages*)+0x1d5c) [0x559813d8e0fc]
4: (ECBackend::handle_recovery_read_complete(hobject_t const&, boost::tuples::tuple<unsigned long, unsigned long, std::map<pg_shard_t, ceph::buffer::v15_2_0::list, std::less<pg_shard_t>, std::allocator<std::pair<pg_shard_t const, ceph::buffer::v15_2_0::list> > >, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type, boost::tuples::null_type>&, std::optional<std::map<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, ceph::buffer::v15_2_0::list, std::less<void>, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const, ceph::buffer::v15_2_0::list> > > >, RecoveryMessages*)+0xb01) [0x559813d8ee11]
5: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0x70) [0x559813db2da0]
6: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x8b) [0x559813d817fb]
7: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*, ZTracer::Trace const&)+0x130f) [0x559813d9ad5f]
8: (ECBackend::_handle_message(boost::intrusive_ptr<OpRequest>)+0x19b) [0x559813d9b46b]
9: (PGBackend::handle_message(boost::intrusive_ptr<OpRequest>)+0x4b) [0x559813b8286b]
10: (PrimaryLogPG::do_request(boost::intrusive_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x5c2) [0x559813b23f72]
11: (OSD::dequeue_op(boost::intrusive_ptr<PG>, boost::intrusive_ptr<OpRequest>, ThreadPool::TPHandle&)+0x34b) [0x5598139ba3ab]
12: (ceph::osd::scheduler::PGOpItem::run(OSD*, OSDShard*, boost::intrusive_ptr<PG>&, ThreadPool::TPHandle&)+0x67) [0x559813c709e7]
13: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0xf88) [0x5598139d6b88]
14: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x5aa) [0x55981408d73a]
15: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x559814090440]
16: /lib64/libpthread.so.0(+0x817a) [0x7fe6cf33317a]
17: clone()
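For readers unfamiliar with the failed assertion, the following is a minimal sketch of the invariant it expresses: the number of bytes pushed to a shard must equal the per-shard (chunk) length that corresponds to the logical range just recovered. `StripeInfo` and `push_length_matches` are hypothetical names introduced here for illustration, and the offset conversion is an assumption about how `sinfo.aligned_logical_offset_to_chunk_offset` behaves, not a copy of the Ceph source. The sketch does not claim to explain why the Clay plugin violates the invariant.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the stripe geometry object called `sinfo`
// in the assertion. Field names are assumptions for this sketch.
struct StripeInfo {
  std::uint64_t stripe_width;  // logical bytes covered by one full stripe
  std::uint64_t chunk_size;    // bytes each shard stores per stripe

  // Assumed behaviour: convert a stripe-aligned logical length into the
  // matching per-shard length.
  std::uint64_t aligned_logical_offset_to_chunk_offset(std::uint64_t offset) const {
    assert(offset % stripe_width == 0);  // caller must pass an aligned value
    return (offset / stripe_width) * chunk_size;
  }
};

// Returns true when a push of `pushed_len` bytes is consistent with the
// recovery progress moving from `recovered_before` to `recovered_after`
// (both logical offsets). This mirrors the shape of the failed assertion.
bool push_length_matches(const StripeInfo& sinfo,
                         std::uint64_t recovered_before,
                         std::uint64_t recovered_after,
                         std::uint64_t pushed_len) {
  return pushed_len ==
         sinfo.aligned_logical_offset_to_chunk_offset(recovered_after - recovered_before);
}
```

With an assumed geometry of four data chunks of 4096 bytes each (`StripeInfo{4 * 4096, 4096}`), advancing the logical progress by two stripes (32768 bytes) is consistent only with a push of 8192 bytes per shard. Any shorter buffer would trip the check.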
Updated by Neha Ojha almost 2 years ago
Can you please add the test that helped you discover this issue? I believe the same test was passing with other EC plugins, so this seems to be a bug seen only with the Clay plugin.
Updated by Nitzan Mordechai almost 2 years ago
I used /qa/standalone/erasure-code/test-erasure-eio.sh. The test that failed is TEST_ec_object_attr_read_error, when I switch the plugin to Clay. With the other plugins, all the tests pass.
Updated by Nitzan Mordechai almost 2 years ago
The test needed osd_read_ec_check_for_errors to be set to true. When it is set, the EIO error is ignored and we can get the sub-chunks correctly.
Updated by Nitzan Mordechai almost 2 years ago
- Status changed from In Progress to Rejected