Bug #9209
osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)
Description
Using
$ ceph --version
ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752)
The following teuthology job is scheduled:
os_type: ubuntu
os_version: '14.04'
nuke-on-error: false
overrides:
  ceph:
    conf:
      global:
        osd heartbeat grace: 40
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - scrub mismatch
    - ScrubResult
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      branch: master
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
  - osd.3
- - mon.b
  - mon.c
  - osd.4
  - osd.5
  - osd.6
  - osd.7
- - client.0
  - osd.8
  - osd.9
  - osd.10
  - osd.11
  - osd.12
  - osd.13
  - osd.14
  - osd.15
  - osd.16
  - osd.17
suite_path: /home/loic/software/ceph/ceph-qa-suite
tasks:
- install:
    branch: master
- ceph:
    fs: xfs
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    min_in: 10
    timeout: 1200
- rados:
    clients: [client.0]
    ops: 4000
    objects: 500
    ec_pool: true
    erasure_code_profile:
      plugin: jerasure
      k: 6
      m: 2
      technique: reed_sol_van
      ruleset-failure-domain: osd
    op_weights:
      read: 45
      write: 0
      append: 45
      delete: 10
and crashes three OSDs with the following:
osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)
 ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xb6a24b]
 2: ceph-osd() [0x9ea323]
 3: (ECBackend::start_read_op(int, std::map<hobject_t, ECBackend::read_request_t, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, ECBackend::read_request_t> > >&, std::tr1::shared_ptr<OpRequest>)+0x1019) [0x9f3509]
 4: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0x624) [0x9f3d54]
 5: (ECBackend::run_recovery_op(PGBackend::RecoveryHandle*, int)+0x2d1) [0x9fb331]
 6: (ReplicatedPG::recover_primary(int, ThreadPool::TPHandle&)+0xaf9) [0x8538c9]
 7: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*, ThreadPool::TPHandle&, int*)+0x54b) [0x885acb]
 8: (OSD::do_recovery(PG*, ThreadPool::TPHandle&)+0x28b) [0x688a8b]
 9: (OSD::RecoveryWQ::_process(PG*, ThreadPool::TPHandle&)+0x17) [0x6e80e7]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa46) [0xb5b3d6]
 11: (ThreadPool::WorkThread::entry()+0x10) [0xb5c480]
 12: (()+0x8182) [0x7f9fd315f182]
 13: (clone()+0x6d) [0x7f9fd16cb38d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
This is presumably a side effect of a failure to get the required number of OSDs; the 2147483647 entries in the acting set below are CRUSH_ITEM_NONE, i.e. shards for which no OSD was mapped:
1.3 8 0 0 0 0 9691968 34 34 active+clean 2014-08-23 17:44:43.640373 45'34 45:117 [2147483647,15,10,13,6,2147483647,12,14] 15 [2147483647,15,10,13,6,2147483647,12,14] 15 0'0 2014-08-23 17:42:27.682870 0'0 2014-08-23 17:42:27.682870
using the generated ruleset
$ ceph osd crush rule dump unique_pool_0
{ "rule_id": 1,
  "rule_name": "unique_pool_0",
  "ruleset": 1,
  "type": 3,
  "min_size": 3,
  "max_size": 20,
  "steps": [
        { "op": "set_chooseleaf_tries",
          "num": 5},
        { "op": "take",
          "item": -1,
          "item_name": "default"},
        { "op": "choose_indep",
          "num": 0,
          "type": "osd"},
        { "op": "emit"}]}
for a pool of size 8
pool 1 'unique_pool_0' erasure size 8 min_size 6 crush_ruleset 1 object_hash rjenkins pg_num 26 pgp_num 16 last_change 18 flags hashpspool stripe_width 4224 max_osd 18
Associated revisions
common: ROUND_UP_TO accepts any rounding factor
The ROUND_UP_TO function was limited to rounding factors that are powers
of two. This saves a modulo but it is not used where it would make a
difference. The implementation is changed so it is generic.
http://tracker.ceph.com/issues/9209 Fixes: #9209
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
common: ROUND_UP_TO accepts any rounding factor
The ROUND_UP_TO function was limited to rounding factors that are powers
of two. This saves a modulo but it is not used where it would make a
difference. The implementation is changed so it is generic.
http://tracker.ceph.com/issues/9209 Fixes: #9209
Signed-off-by: Loic Dachary <loic-201408@dachary.org>
(cherry picked from commit 9449520b121fc6ce0c64948386d4ff77f46f4f5f)
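For reference, here is a sketch of the before/after behaviour the commit message describes. The macro bodies below are illustrative, not the exact upstream definitions; the point is that the bit-mask version is only correct when the rounding factor is a power of two:

#include <cassert>

// Mask-based rounding: only correct when d is a power of two.
#define ROUND_UP_TO_MASK(n, d) (((n) + (d) - 1) & ~((d) - 1))

// Generic rounding: one modulo, correct for any d > 0.
#define ROUND_UP_TO(n, d) ((n) % (d) ? ((n) + (d) - (n) % (d)) : (n))

int main() {
  // d = 4096, a power of two: both definitions agree.
  assert(ROUND_UP_TO_MASK(1048577, 4096) == 1052672);
  assert(ROUND_UP_TO(1048577, 4096) == 1052672);

  // d = 4224, the k=6,m=2 stripe width: the mask version returns
  // 1048704, which is not a multiple of 4224 (remainder 1152) --
  // the failure demonstrated by the perl one-liner in note #7.
  assert(ROUND_UP_TO_MASK(1048577, 4224) % 4224 != 0);

  // The generic version rounds correctly: 1051776 == 249 * 4224,
  // matching the extent_requested seen after the fix in note #9.
  assert(ROUND_UP_TO(1048577, 4224) == 1051776);
  return 0;
}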
History
#1 Updated by Loïc Dachary over 9 years ago
The same YAML file run against firefly 0.80.5-171-gca3ac90-1trusty instead of master succeeds.
#2 Updated by Loïc Dachary over 9 years ago
- Status changed from New to 12
The job above passes with k=2,m=1 (in hindsight, presumably because the resulting stripe width stays at the power-of-two default of 4096 = 2 * 2048):
erasure_code_profile:
  plugin: jerasure
  k: 2
  m: 1
  technique: reed_sol_van
  ruleset-failure-domain: osd
#3 Updated by Loïc Dachary over 9 years ago
- Assignee set to Loïc Dachary
The teuthology job reproducing the problem is running on teuthology.front.sepia.com in screen -x -r 17865.loic
#4 Updated by Loïc Dachary over 9 years ago
The stripe width for k=6,m=2 is 4224 instead of the 4096 default. It probably breaks a requirement somewhere.
pool 2 'pool-jerasure' erasure size 8 min_size 6 crush_ruleset 2 object_hash rjenkins pg_num 12 pgp_num 12 last_change 64 flags hashpspool stripe_width 4224
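One plausible derivation of that value (my assumption; the actual chunk-size alignment is computed inside the jerasure plugin and is not quoted in this ticket): the requested 4096-byte stripe is split across the k=6 data chunks, and each chunk is rounded up to the plugin's alignment, yielding 6 chunks of 704 bytes:

#include <cstdio>
#include <cstdint>

int main() {
  const uint64_t k = 6;                // data chunks in the k=6,m=2 profile
  const uint64_t desired_width = 4096; // default erasure-code stripe width
  const uint64_t alignment = 64;       // hypothetical per-chunk alignment

  uint64_t chunk = (desired_width + k - 1) / k;            // ceil(4096/6) = 683
  chunk = (chunk + alignment - 1) / alignment * alignment; // align up -> 704
  std::printf("stripe_width = %llu\n",                     // 6 * 704 = 4224
              (unsigned long long)(k * chunk));
  return 0;
}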
#5 Updated by Loïc Dachary over 9 years ago
2014-08-25 13:15:52.293417 7fa8e74b6700 10 osd.1 pg_epoch: 13 pg[1.7s0( v 13'9 lc 11'1 (0'0,13'9] local-les=13 n=4 ec=10 les/c 13/11 12/12/12) [1,6,11,16,4,2,12,9] r=0 lpr=12 pi=10-11/1 luod=11'7 rops=3 crt=11'7 lcod 0'0 mlcod 0'0 active+recovering+degraded m=3] start_read_op: starting ReadOp(tid=55, to_read={c04f7d07/vpm03011797-46/head//1=read_request_t(to_read=[0,1048576], need=2(5),4(4),6(1),11(2),12(6),16(3), want_attrs=1),3dbfd9e7/vpm03011797-31/head//1=read_request_t(to_read=[0,1048576], need=2(5),4(4),6(1),11(2),12(6),16(3), want_attrs=1),133a4af7/vpm03011797-30/head//1=read_request_t(to_read=[0,1048576], need=2(5),4(4),6(1),11(2),12(6),16(3), want_attrs=1)}, complete={}, priority=10, obj_to_source={}, source_to_obj={}, in_progress=)
to_read=[0,1048576] will call offset_len_to_stripe_bounds and fail, because 1048576 (= 1024*1024) is not a multiple of stripe_width = 4224.
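A minimal check of that arithmetic (a standalone sketch, not the ECUtil.h code itself):

#include <cstdio>
#include <cstdint>

int main() {
  const uint64_t stripe_width = 4224; // from the pool's stripe_width
  const uint64_t len = 1048576;       // the to_read extent, 1024 * 1024

  // 4224 * 248 = 1047552, so the remainder is 1024, not 0 --
  // the condition that makes assert(offset % stripe_width == 0) fail.
  std::printf("%llu %% %llu = %llu\n",
              (unsigned long long)len,
              (unsigned long long)stripe_width,
              (unsigned long long)(len % stripe_width));
  return 0;
}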
#6 Updated by Loïc Dachary over 9 years ago
In the logs, RecoveryOp::IDLE shows
-86> 2014-08-25 13:15:52.292969 7fa8e74b6700 10 osd.1 pg_epoch: 13 pg[1.7s0( v 13'9 lc 11'1 (0'0,13'9] local-les=13 n=4 ec=10 les/c 13/11 12/12/12) [1,6,11,16,4,2,12,9] r=0 lpr=12 pi=10-11/1 luod=11'7 rops=3 crt=11'7 lcod 0'0 mlcod 0'0 active+recovering+degraded m=3] continue_recovery_op: IDLE return RecoveryOp(hoid=3dbfd9e7/vpm03011797-31/head//1 v=11'5 missing_on=1(0) missing_on_shards=0 recovery_info=ObjectRecoveryInfo(3dbfd9e7/vpm03011797-31/head//1@11'5, copy_subset: [], clone_subset: {}) recovery_progress=ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:true) pending_read=0 obc refcount=0 state=READING waiting_on_pushes= extent_requested=0,1048576
with extent_requested=0,1048576 although
$ ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config get osd_recovery_max_chunk
{ "osd_recovery_max_chunk": "8388608"}
and
pool 1 'unique_pool_0' erasure size 8 min_size 6 crush_ruleset 1 object_hash rjenkins pg_num 16 pgp_num 16 last_change 10 flags hashpspool stripe_width 4224
#7 Updated by Loïc Dachary over 9 years ago
ROUND_UP_TO only works when the rounding factor is a power of two: the bit-mask ~((d)-1) clears the low-order bits correctly only in that case.
$ perl -e '$n = 1 * 1024 * 1024 + 1; $d = 4224; $v = ((($n)+($d)-1) & ~(($d)-1)); print "$n => $v\n" ; print $v % $d ; print "\n"'
1048577 => 1048704
1152
#8 Updated by Loïc Dachary over 9 years ago
Teuthology must have changed the default recovery chunk for the OSDs at runtime because
$ sudo ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config get osd_recovery_max_chunk
{ "osd_recovery_max_chunk": "1048576"}
#9 Updated by Loïc Dachary over 9 years ago
- Status changed from 12 to Fix Under Review
With the proposed patch the above workload passes. Inspection of the OSD logs shows that recovery progress lines such as
2014-08-25 16:22:58.010631 7fd15ef15700 10 osd.1 pg_epoch: 299 pg[1.9s0( v 166'114 (0'0,166'114] local-les=292 n=3 ec=8 les/c 292/236 291/291/259) [1,12,15,5,10,2,11,6] r=0 lpr=291 pi=8-290/17 rops=3 crt=166'114 mlcod 0'0 active+recovering+degraded] continue_recovery_op: WRITING continue RecoveryOp(hoid=11402609/vpm03019476-166/head//1 v=166'114 missing_on=12(1) missing_on_shards=1 recovery_info=ObjectRecoveryInfo(11402609/vpm03019476-166/head//1@166'114, copy_subset: [], clone_subset: {}) recovery_progress=ObjectRecoveryProgress(!first, data_recovered_to:2103552, data_complete:false, omap_recovered_to:, omap_complete:true) pending_read=0 obc refcount=2 state=IDLE waiting_on_pushes= extent_requested=1051776,1051776
now have extent_requested=1051776,1051776 (and other similar values), where 1051776 = 249 * 4224 is 1024*1024 rounded up to the next multiple of the stripe width.
#10 Updated by Loïc Dachary over 9 years ago
- % Done changed from 0 to 80
- Backport set to firefly
#11 Updated by Loïc Dachary over 9 years ago
- Category set to common
- Target version set to 0.85 cont.
#12 Updated by Loïc Dachary over 9 years ago
#13 Updated by Sage Weil over 9 years ago
- Status changed from Fix Under Review to Pending Backport
#14 Updated by Loïc Dachary over 9 years ago
#15 Updated by Loïc Dachary over 9 years ago
- Status changed from Pending Backport to Resolved
#16 Updated by Loïc Dachary over 9 years ago
- % Done changed from 80 to 100