Bug #4791: osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0) in scan_range

closed

Added by Sage Weil almost 11 years ago. Updated almost 11 years ago.

Status: Can't reproduce
Priority: High
Category: OSD
Source: Q/A
Severity: 3 - minor
Description

     0> 2013-04-22 15:11:45.138245 7f946bfff700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::scan_range(hobject_t, int, int, PG::BackfillInterval*)' thread 7f946bfff700 time 2013-04-22 15:11:45.136128
osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0)

 ceph version 0.60-598-g70e1e47 (70e1e47da0970a4ad5cdd311aaaebf45361d5dff)
 1: (ReplicatedPG::scan_range(hobject_t, int, int, PG::BackfillInterval*)+0xde1) [0x59d241]
 2: (ReplicatedPG::recover_backfill(int)+0x644) [0x5ce874]
 3: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*)+0x82d) [0x5d347d]
 4: (OSD::do_recovery(PG*)+0x3a5) [0x633f95]
 5: (OSD::RecoveryWQ::_process(PG*)+0x11) [0x672921]
 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x835cd6]
 7: (ThreadPool::WorkThread::entry()+0x10) [0x837b00]
 8: (()+0x7e9a) [0x7f949b67ce9a]
 9: (clone()+0x6d) [0x7f9499a7ccbd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

The job was:

ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-04-22_13:41:44-rados-next-testing-basic/111$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: e7fce312d4931de1896feb5392e69e2b0b6a5b92
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject delay max: 1
        ms inject delay probability: 0.005
        ms inject delay type: osd
        ms inject socket failures: 2500
    fs: ext4
    log-whitelist:
    - slow request
    sha1: 70e1e47da0970a4ad5cdd311aaaebf45361d5dff
  s3tests:
    branch: next
  workunit:
    sha1: 70e1e47da0970a4ad5cdd311aaaebf45361d5dff
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000

Updated by Samuel Just almost 11 years ago

Status changed from New to Need More Info
Priority changed from Urgent to High

This may be an ext4 bug; I suggest we ignore it until we see it again on xfs. I've removed ext4 from the rados and rbd thrashing tests for now, since it has caused other problems and is not actually recommended as a ceph-osd backend.
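
For anyone trying to re-run the failing job on xfs, the change amounts to swapping the `fs` value in the job's `overrides` section. The fragment below is only a sketch: it assumes the ceph task accepts `xfs` for the same `fs` key the original job set to `ext4`, and it is not taken from the actual commit that removed ext4 from the thrashing tests.

```yaml
# Hypothetical variant of the overrides section from the job config above.
# Only the fs value differs; everything else is copied from the original job.
overrides:
  ceph:
    conf:
      global:
        ms inject delay max: 1
        ms inject delay probability: 0.005
        ms inject delay type: osd
        ms inject socket failures: 2500
    fs: xfs    # the failing job used ext4 here
    log-whitelist:
    - slow request
```

If the assert in scan_range still fires with this setting, that would argue against an ext4-specific cause.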

Updated by Sage Weil almost 11 years ago

Status changed from Need More Info to Can't reproduce
