Bug #9209

osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)

Added by Loïc Dachary over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
Category: common
% Done: 100%
Source: Development
Backport: firefly
Regression: No
Severity: 3 - minor

Description

Using

$ ceph --version
ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752)

The following teuthology job is scheduled:
os_type: ubuntu
os_version: '14.04'
nuke-on-error: false
overrides:
  ceph:
    conf:
      global:
        osd heartbeat grace: 40
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - scrub mismatch
    - ScrubResult
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      branch: master
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
  - osd.3
- - mon.b
  - mon.c
  - osd.4
  - osd.5
  - osd.6
  - osd.7
- - client.0
  - osd.8
  - osd.9
  - osd.10
  - osd.11
  - osd.12
  - osd.13
  - osd.14
  - osd.15
  - osd.16
  - osd.17
suite_path: /home/loic/software/ceph/ceph-qa-suite
tasks:
- install:
    branch: master
- ceph:
    fs: xfs
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    min_in: 10
    timeout: 1200
- rados:
    clients: [client.0]
    ops: 4000
    objects: 500
    ec_pool: true
    erasure_code_profile:
      plugin: jerasure
      k: 6
      m: 2
      technique: reed_sol_van
      ruleset-failure-domain: osd
    op_weights:
      read: 45
      write: 0
      append: 45
      delete: 10

It crashes three OSDs with the following assertion failure:
osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)

 ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xb6a24b]
 2: ceph-osd() [0x9ea323]
 3: (ECBackend::start_read_op(int, std::map<hobject_t, ECBackend::read_request_t, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, ECBackend::read_request_t> > >&, std::tr1::shared_ptr<OpRequest>)+0x1019) [0x9f3509]
 4: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0x624) [0x9f3d54]
 5: (ECBackend::run_recovery_op(PGBackend::RecoveryHandle*, int)+0x2d1) [0x9fb331]
 6: (ReplicatedPG::recover_primary(int, ThreadPool::TPHandle&)+0xaf9) [0x8538c9]
 7: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*, ThreadPool::TPHandle&, int*)+0x54b) [0x885acb]
 8: (OSD::do_recovery(PG*, ThreadPool::TPHandle&)+0x28b) [0x688a8b]
 9: (OSD::RecoveryWQ::_process(PG*, ThreadPool::TPHandle&)+0x17) [0x6e80e7]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa46) [0xb5b3d6]
 11: (ThreadPool::WorkThread::entry()+0x10) [0xb5c480]
 12: (()+0x8182) [0x7f9fd315f182]
 13: (clone()+0x6d) [0x7f9fd16cb38d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

This is presumably a side effect of CRUSH failing to map the required number of OSDs for the PG (2147483647 denotes a missing OSD in the up and acting sets):
1.3    8    0    0    0    0    9691968    34    34    active+clean    2014-08-23 17:44:43.640373    45'34    45:117    [2147483647,15,10,13,6,2147483647,12,14]    15    [2147483647,15,10,13,6,2147483647,12,14]    15    0'0    2014-08-23 17:42:27.682870    0'0    2014-08-23 17:42:27.682870

using the generated ruleset:
$ ceph osd crush rule dump unique_pool_0
{ "rule_id": 1,
  "rule_name": "unique_pool_0",
  "ruleset": 1,
  "type": 3,
  "min_size": 3,
  "max_size": 20,
  "steps": [
        { "op": "set_chooseleaf_tries",
          "num": 5},
        { "op": "take",
          "item": -1,
          "item_name": "default"},
        { "op": "choose_indep",
          "num": 0,
          "type": "osd"},
        { "op": "emit"}]}

for a pool of size 8:
pool 1 'unique_pool_0' erasure size 8 min_size 6 crush_ruleset 1 object_hash rjenkins pg_num 26 pgp_num 16 last_change 18 flags hashpspool stripe_width 4224
max_osd 18

Attachment: ceph-osd.1.log (10.3 MB), Loïc Dachary, 08/23/2014 11:23 AM


Related issues

Related to Ceph - Cleanup #9225: check that ROUND_UP_TO is not used with improper rounding values (Closed, 08/25/2014)

Associated revisions

Revision 9449520b (diff)
Added by Loic Dachary over 9 years ago

common: ROUND_UP_TO accepts any rounding factor

The ROUND_UP_TO function was limited to rounding factors that are powers
of two. This saves a modulo but it is not used where it would make a
difference. The implementation is changed so it is generic.

http://tracker.ceph.com/issues/9209 Fixes: #9209

Signed-off-by: Loic Dachary <>

Revision 87cd3a8f (diff)
Added by Loic Dachary over 9 years ago

common: ROUND_UP_TO accepts any rounding factor

The ROUND_UP_TO function was limited to rounding factors that are powers
of two. This saves a modulo but it is not used where it would make a
difference. The implementation is changed so it is generic.

http://tracker.ceph.com/issues/9209 Fixes: #9209

Signed-off-by: Loic Dachary <>
(cherry picked from commit 9449520b121fc6ce0c64948386d4ff77f46f4f5f)
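
For context, a minimal sketch of the two rounding strategies the commit message describes; the function names and the exact generic expression are illustrative, not the code from the commit.

#include <cstdint>

// Bitmask rounding: only correct when d is a power of two, because
// ~(d - 1) is a valid alignment mask only in that case.
static inline uint64_t round_up_pow2(uint64_t n, uint64_t d) {
  return (n + d - 1) & ~(d - 1);
}

// Generic rounding: costs a modulo but works for any non-zero d,
// e.g. the 4224-byte stripe width of a k=6, m=2 pool.
static inline uint64_t round_up_any(uint64_t n, uint64_t d) {
  return n % d ? n + d - n % d : n;
}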

History

#1 Updated by Loïc Dachary over 9 years ago

The same YAML file run against firefly 0.80.5-171-gca3ac90-1trusty instead of master succeeds.

#2 Updated by Loïc Dachary over 9 years ago

  • Status changed from New to 12

The job above with k=2, m=1 passes:

    erasure_code_profile:
      plugin: jerasure
      k: 2
      m: 1
      technique: reed_sol_van
      ruleset-failure-domain: osd

#3 Updated by Loïc Dachary over 9 years ago

  • Assignee set to Loïc Dachary

The teuthology job re-creating the problem is running on teuthology.front.sepia.com in screen -x -r 17865.loic

#4 Updated by Loïc Dachary over 9 years ago

The stripe width for k=6, m=2 is 4224 instead of the default 4096. It probably breaks an alignment requirement somewhere.

pool 2 'pool-jerasure' erasure size 8 min_size 6 crush_ruleset 2 object_hash rjenkins pg_num 12 pgp_num 12 last_change 64 flags hashpspool stripe_width 4224
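
For what it's worth, 4224 is consistent with taking the default 4096-byte stripe, padding it up to the jerasure alignment, and splitting it across k=6 chunks; the alignment formula below (k * w * sizeof(int) with w=8) is an assumption about the plugin, not something stated in this ticket.

#include <cstdio>

int main() {
  // Assumed jerasure reed_sol_van alignment: k * w * sizeof(int),
  // with k=6 data chunks and word size w=8.
  unsigned k = 6, w = 8;
  unsigned alignment = k * w * sizeof(int);                   // 192
  unsigned requested = 4096;                                  // default stripe width
  unsigned tail = requested % alignment;                      // 64
  unsigned padded = requested + (tail ? alignment - tail : 0);
  printf("stripe_width=%u chunk=%u\n", padded, padded / k);   // 4224 704
  return 0;
}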

#5 Updated by Loïc Dachary over 9 years ago

2014-08-25 13:15:52.293417 7fa8e74b6700 10 osd.1 pg_epoch: 13 pg[1.7s0( v 13'9 lc 11'1 (0'0,13'9] local-les=13 n=4 ec=10 les/c 13/11 12/12/12) [1,6,11,16,4,2,12,9] r=0 lpr=12 pi=10-11/1 luod=11'7 rops=3 crt=11'7 lcod 0'0 mlcod 0'0 active+recovering+degraded m=3] start_read_op: starting ReadOp(tid=55, to_read={c04f7d07/vpm03011797-46/head//1=read_request_t(to_read=[0,1048576], need=2(5),4(4),6(1),11(2),12(6),16(3), want_attrs=1),3dbfd9e7/vpm03011797-31/head//1=read_request_t(to_read=[0,1048576], need=2(5),4(4),6(1),11(2),12(6),16(3), want_attrs=1),133a4af7/vpm03011797-30/head//1=read_request_t(to_read=[0,1048576], need=2(5),4(4),6(1),11(2),12(6),16(3), want_attrs=1)}, complete={}, priority=10, obj_to_source={}, source_to_obj={}, in_progress=)

to_read=[0,1048576] will call offset_len_to_stripe_bounds and fail, because 1048576 (1024*1024) is not a multiple of stripe_width=4224: 1048576 % 4224 = 1024 != 0.
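
Concretely, the requested length misses a stripe boundary by 1024 bytes (a standalone check, not the Ceph code path):

#include <cassert>

int main() {
  const unsigned long long stripe_width = 4224;  // k=6, m=2 pool in this job
  const unsigned long long len = 1048576;        // 1024 * 1024 recovery read
  // 1048576 = 248 * 4224 + 1024, so a [0,1048576) extent is not stripe-aligned
  // and trips assert(offset % stripe_width == 0).
  assert(len % stripe_width == 1024);
  return 0;
}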

#6 Updated by Loïc Dachary over 9 years ago

In the logs, RecoveryOp::IDLE shows

   -86> 2014-08-25 13:15:52.292969 7fa8e74b6700 10 osd.1 pg_epoch: 13 pg[1.7s0( v 13'9 lc 11'1 (0'0,13'9] local-les=13 n=4 ec=10 les/c 13/11 12/12/12) [1,6,11,16,4,2,12,9] r=0 lpr=12 pi=10-11/1 luod=11'7 rops=3 crt=11'7 lcod 0'0 mlcod 0'0 active+recovering+degraded m=3] continue_recovery_op: IDLE return RecoveryOp(hoid=3dbfd9e7/vpm03011797-31/head//1 v=11'5 missing_on=1(0) missing_on_shards=0 recovery_info=ObjectRecoveryInfo(3dbfd9e7/vpm03011797-31/head//1@11'5, copy_subset: [], clone_subset: {}) recovery_progress=ObjectRecoveryProgress(first, data_recovered_to:0, data_complete:false, omap_recovered_to:, omap_complete:true) pending_read=0 obc refcount=0 state=READING waiting_on_pushes= extent_requested=0,1048576

with extent_requested=0,1048576 although
ceph --admin-daemon /var/run/ceph/ceph-mon.a.asok config get osd_recovery_max_chunk
{ "osd_recovery_max_chunk": "8388608"}

and
pool 1 'unique_pool_0' erasure size 8 min_size 6 crush_ruleset 1 object_hash rjenkins pg_num 16 pgp_num 16 last_change 10 flags hashpspool stripe_width 4224

#7 Updated by Loïc Dachary over 9 years ago

ROUND_UP_TO only works when the rounding factor is a power of 2:

$ perl -e '$n = 1 * 1024 * 1024 + 1; $d = 4224; $v = ((($n)+($d)-1) & ~(($d)-1)); print "$n => $v\n" ; print $v % $d ; print "\n"'
1048577 => 1048704
1152
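
Applying the generic rounding sketched above (under Associated revisions) to the same numbers gives a multiple of 4224, which is exactly the extent_requested value that shows up in note #9 below; both expressions here mirror that sketch and the Perl one-liner.

#include <cstdio>

int main() {
  unsigned long long n = 1048577, d = 4224;
  unsigned long long pow2 = (n + d - 1) & ~(d - 1);        // bitmask rounding
  unsigned long long generic = n % d ? n + d - n % d : n;  // generic rounding
  printf("%llu %llu\n", pow2, pow2 % d);                   // 1048704 1152
  printf("%llu %llu\n", generic, generic % d);             // 1051776 0
  return 0;
}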

#8 Updated by Loïc Dachary over 9 years ago

Teuthology must have changed the default recovery chunk size for the OSDs at runtime, because:

sudo ceph --admin-daemon /var/run/ceph/ceph-osd.2.asok config get osd_recovery_max_chunk
{ "osd_recovery_max_chunk": "1048576"}

#9 Updated by Loïc Dachary over 9 years ago

  • Status changed from 12 to Fix Under Review

With the proposed patch the above workload passes. An inspection of the OSD logs shows that recovery progress lines such as

2014-08-25 16:22:58.010631 7fd15ef15700 10 osd.1 pg_epoch: 299 pg[1.9s0( v 166'114 (0'0,166'114] local-les=292 n=3 ec=8 les/c 292/236 291/291/259) [1,12,15,5,10,2,11,6] r=0 lpr=291 pi=8-290/17 rops=3 crt=166'114 mlcod 0'0 active+recovering+degraded] continue_recovery_op: WRITING continue RecoveryOp(hoid=11402609/vpm03019476-166/head//1 v=166'114 missing_on=12(1) missing_on_shards=1 recovery_info=ObjectRecoveryInfo(11402609/vpm03019476-166/head//1@166'114, copy_subset: [], clone_subset: {}) recovery_progress=ObjectRecoveryProgress(!first, data_recovered_to:2103552, data_complete:false, omap_recovered_to:, omap_complete:true) pending_read=0 obc refcount=2 state=IDLE waiting_on_pushes= extent_requested=1051776,1051776

now carry extent_requested=1051776,1051776, where 1051776 is 1024*1024 rounded up to the next multiple of the 4224 stripe width (249 * 4224).

#10 Updated by Loïc Dachary over 9 years ago

  • % Done changed from 0 to 80
  • Backport set to firefly

#11 Updated by Loïc Dachary over 9 years ago

  • Category set to common
  • Target version set to 0.85 cont.

#13 Updated by Sage Weil over 9 years ago

  • Status changed from Fix Under Review to Pending Backport

#15 Updated by Loïc Dachary over 9 years ago

  • Status changed from Pending Backport to Resolved

#16 Updated by Loïc Dachary over 9 years ago

  • % Done changed from 80 to 100
