Bug #9209
osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)
Status: Closed
% Done: 100%
Source: Development
Tags:
Backport: firefly
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Using
$ ceph --version
ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752)
The following teuthology job is scheduled
os_type: ubuntu
os_version: '14.04'
nuke-on-error: false
overrides:
  ceph:
    conf:
      global:
        osd heartbeat grace: 40
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - scrub mismatch
    - ScrubResult
  ceph-deploy:
    branch:
      dev: next
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      branch: master
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
  - osd.3
- - mon.b
  - mon.c
  - osd.4
  - osd.5
  - osd.6
  - osd.7
- - client.0
  - osd.8
  - osd.9
  - osd.10
  - osd.11
  - osd.12
  - osd.13
  - osd.14
  - osd.15
  - osd.16
  - osd.17
suite_path: /home/loic/software/ceph/ceph-qa-suite
tasks:
- install:
    branch: master
- ceph:
    fs: xfs
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    min_in: 10
    timeout: 1200
- rados:
    clients: [client.0]
    ops: 4000
    objects: 500
    ec_pool: true
    erasure_code_profile:
      plugin: jerasure
      k: 6
      m: 2
      technique: reed_sol_van
      ruleset-failure-domain: osd
    op_weights:
      read: 45
      write: 0
      append: 45
      delete: 10
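For reference, the erasure_code_profile above (k=6, m=2) implies the pool geometry that shows up later in this report. A plain arithmetic sketch, not Ceph code, assuming the default min_size of k for an erasure-coded pool:

#include <cstdint>
#include <iostream>

// Plain arithmetic sketch (not Ceph code) of what the jerasure profile
// k=6, m=2 implies: each object is split into k data chunks plus m coding
// chunks, one chunk per OSD in the PG's acting set.
int main() {
  const uint32_t k = 6;  // data chunks, from the profile above
  const uint32_t m = 2;  // coding chunks, from the profile above
  std::cout << "pool size = " << (k + m) << "\n";  // 8 OSDs per PG
  std::cout << "min_size  = " << k << "\n";        // 6, matching the pool dump below
  std::cout << "tolerated OSD losses per PG = " << m << "\n";
}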
The job crashes three OSDs with the following backtrace
osd/ECUtil.h: 66: FAILED assert(offset % stripe_width == 0)
 ceph version 0.84-562-g8d40600 (8d406001d9b84d9809d181077c61ad9181934752)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0xb6a24b]
 2: ceph-osd() [0x9ea323]
 3: (ECBackend::start_read_op(int, std::map<hobject_t, ECBackend::read_request_t, std::less<hobject_t>, std::allocator<std::pair<hobject_t const, ECBackend::read_request_t> > >&, std::tr1::shared_ptr<OpRequest>)+0x1019) [0x9f3509]
 4: (ECBackend::dispatch_recovery_messages(RecoveryMessages&, int)+0x624) [0x9f3d54]
 5: (ECBackend::run_recovery_op(PGBackend::RecoveryHandle*, int)+0x2d1) [0x9fb331]
 6: (ReplicatedPG::recover_primary(int, ThreadPool::TPHandle&)+0xaf9) [0x8538c9]
 7: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*, ThreadPool::TPHandle&, int*)+0x54b) [0x885acb]
 8: (OSD::do_recovery(PG*, ThreadPool::TPHandle&)+0x28b) [0x688a8b]
 9: (OSD::RecoveryWQ::_process(PG*, ThreadPool::TPHandle&)+0x17) [0x6e80e7]
 10: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa46) [0xb5b3d6]
 11: (ThreadPool::WorkThread::entry()+0x10) [0xb5c480]
 12: (()+0x8182) [0x7f9fd315f182]
 13: (clone()+0x6d) [0x7f9fd16cb38d]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
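For context, the failing check is the stripe-alignment invariant in osd/ECUtil.h: logical offsets handed to the EC backend are expected to be multiples of the pool's stripe_width. A minimal stand-in sketch of that invariant (not the actual Ceph code), using this pool's stripe_width of 4224:

#include <cassert>
#include <cstdint>

// Minimal stand-in for the invariant asserted at osd/ECUtil.h:66 (a sketch,
// not the actual Ceph code): an offset passed in for an EC read must be
// aligned to stripe_width before it can be translated to a per-chunk offset.
struct stripe_info_t {
  uint64_t stripe_width;   // k * chunk_size; 4224 for this pool
  uint64_t chunk_size;     // per-OSD chunk size; 704 = 4224 / 6

  uint64_t logical_to_chunk_offset(uint64_t offset) const {
    assert(offset % stripe_width == 0);          // the assert that fires here
    return (offset / stripe_width) * chunk_size;
  }
};

int main() {
  stripe_info_t sinfo{4224, 704};
  sinfo.logical_to_chunk_offset(4224 * 3);  // stripe-aligned: fine
  sinfo.logical_to_chunk_offset(4096);      // not a multiple of 4224: aborts
}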
This is presumably a side effect of a failure to get the required number of OSDs for the PG
1.3 8 0 0 0 0 9691968 34 34 active+clean 2014-08-23 17:44:43.640373 45'34 45:117 [2147483647,15,10,13,6,2147483647,12,14] 15 [2147483647,15,10,13,6,2147483647,12,14] 15 0'0 2014-08-23 17:42:27.682870 0'0 2014-08-23 17:42:27.682870
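The 2147483647 entries in the up and acting sets are, assuming the usual CRUSH encoding, CRUSH_ITEM_NONE (0x7fffffff), i.e. slots CRUSH could not fill. A small sketch of the resulting shard count for the acting set shown above:

#include <cstdint>
#include <iostream>
#include <vector>

// Sketch only: 2147483647 is assumed to be CRUSH_ITEM_NONE, the placeholder
// CRUSH emits when it cannot find an OSD for a slot.
static const int32_t CRUSH_ITEM_NONE = 0x7fffffff;  // 2147483647

int main() {
  std::vector<int32_t> acting = {CRUSH_ITEM_NONE, 15, 10, 13, 6,
                                 CRUSH_ITEM_NONE, 12, 14};
  int mapped = 0;
  for (int32_t osd : acting)
    if (osd != CRUSH_ITEM_NONE)
      ++mapped;
  // Pool size is 8 (k=6, m=2) and min_size is 6: the PG is short two shards
  // but still meets min_size, so it stays active while recovery runs.
  std::cout << "mapped " << mapped << " of 8 shards\n";
}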
The PG is mapped using the generated ruleset
$ ceph osd crush rule dump unique_pool_0
{ "rule_id": 1,
  "rule_name": "unique_pool_0",
  "ruleset": 1,
  "type": 3,
  "min_size": 3,
  "max_size": 20,
  "steps": [
        { "op": "set_chooseleaf_tries",
          "num": 5},
        { "op": "take",
          "item": -1,
          "item_name": "default"},
        { "op": "choose_indep",
          "num": 0,
          "type": "osd"},
        { "op": "emit"}]}
for a pool of size 8
pool 1 'unique_pool_0' erasure size 8 min_size 6 crush_ruleset 1 object_hash rjenkins pg_num 26 pgp_num 16 last_change 18 flags hashpspool stripe_width 4224
max_osd 18
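For what it is worth, stripe_width 4224 is consistent with k=6 once the per-chunk size is rounded up for alignment. The sketch below assumes a 4096-byte target stripe width and a 64-byte chunk alignment required by the jerasure plugin; both values are assumptions, not taken from this log:

#include <cstdint>
#include <iostream>

// Rough sketch (assumed constants, not taken from this log) of where the
// pool's stripe_width 4224 could come from: the target stripe width is split
// across k data chunks and each chunk is rounded up to the plugin alignment.
int main() {
  const uint64_t k = 6;                // data chunks, from the EC profile
  const uint64_t target_width = 4096;  // assumed default target stripe width
  const uint64_t alignment = 64;       // assumed per-chunk alignment

  uint64_t chunk = (target_width + k - 1) / k;               // 683
  chunk = ((chunk + alignment - 1) / alignment) * alignment;  // round up: 704
  std::cout << "stripe_width = " << k * chunk << "\n";        // 4224
}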