Bug #13937
osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())
Status:
Resolved
Priority:
High
Assignee:
David Zafman
Category:
-
Target version:
-
% Done:
0%
Source:
other
Tags:
Backport:
jewel
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
I am getting the following error with Ceph v9.2.0 on two different OSDs. The error occurs shortly after startup.
osd/ECBackend.cc: 201: FAILED assert(res.errors.empty())
ceph version 9.2.0 (bb2ecea240f3a1d525bcb35670cb07bd1f0ca299)
1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x80) [0x55b5305f5e73]
2: (OnRecoveryReadComplete::finish(std::pair<RecoveryMessages*, ECBackend::read_result_t&>&)+0xe3) [0x55b5304987af]
3: (ECBackend::complete_read_op(ECBackend::ReadOp&, RecoveryMessages*)+0x59) [0x55b530484ad1]
4: (ECBackend::handle_sub_read_reply(pg_shard_t, ECSubReadReply&, RecoveryMessages*)+0x1008) [0x55b530485ea6]
5: (ECBackend::handle_message(std::shared_ptr<OpRequest>)+0x216) [0x55b530489c66]
6: (ReplicatedPG::do_request(std::shared_ptr<OpRequest>&, ThreadPool::TPHandle&)+0x1c0) [0x55b53029ea4e]
7: (OSD::dequeue_op(boost::intrusive_ptr<PG>, std::shared_ptr<OpRequest>, ThreadPool::TPHandle&)+0x40c) [0x55b5300ffc1e]
8: (PGQueueable::RunVis::operator()(std::shared_ptr<OpRequest>&)+0x52) [0x55b5300ffe52]
9: (OSD::ShardedOpWQ::_process(unsigned int, ceph::heartbeat_handle_d*)+0x61f) [0x55b5301172c7]
10: (ShardedThreadPool::shardedthreadpool_worker(unsigned int)+0x7ae) [0x55b5305e3b16]
11: (ShardedThreadPool::WorkThreadSharded::entry()+0x10) [0x55b5305e7060]
12: (Thread::entry_wrapper()+0x64) [0x55b5305d7c2c]
13: (()+0x7176) [0x7f4cc5cdd176]
14: (clone()+0x6d) [0x7f4cc3fb149d]
Configuration is rather simple:
3x mirror ("data")
3x mirror ("datahot", cache-tier for "dataec")
8+3 jerasure ("dataec")
pool 0 'data' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 2048 pgp_num 2048 last_change 72574 flags hashpspool crash_replay_interval 45 min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0
pool 1 'metadata' replicated size 4 min_size 2 crush_ruleset 1 object_hash rjenkins pg_num 256 pgp_num 256 last_change 72577 flags hashpspool min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0
pool 2 'rbd' replicated size 3 min_size 2 crush_ruleset 0 object_hash rjenkins pg_num 256 pgp_num 256 last_change 31 flags hashpspool min_read_recency_for_promote 1 min_write_recency_for_promote 1 stripe_width 0
pool 5 'datahot' replicated size 3 min_size 2 crush_ruleset 3 object_hash rjenkins pg_num 128 pgp_num 128 last_change 185334 flags hashpspool,incomplete_clones tier_of 6 cache_mode writeback target_bytes 1099511627776 hit_set bloom{false_positive_probability: 0.05, target_size: 0, seed: 0} 3600s x1 stripe_width 0
pool 6 'dataec' erasure size 11 min_size 8 crush_ruleset 4 object_hash rjenkins pg_num 1024 pgp_num 1024 last_change 185334 lfor 185334 flags hashpspool tiers 5 read_tier 5 write_tier 5 stripe_width 131072
data and datahot are used as data pools for CephFS. All pools share the same OSDs.