Bug #4791

closed

osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0) in scan_range

Added by Sage Weil almost 11 years ago. Updated almost 11 years ago.

Status:
Can't reproduce
Priority:
High
Assignee:
-
Category:
OSD
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

     0> 2013-04-22 15:11:45.138245 7f946bfff700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::scan_range(hobject_t, int, int, PG::BackfillInterval*)' thread 7f946bfff700 time 2013-04-22 15:11:45.136128
osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0)

 ceph version 0.60-598-g70e1e47 (70e1e47da0970a4ad5cdd311aaaebf45361d5dff)
 1: (ReplicatedPG::scan_range(hobject_t, int, int, PG::BackfillInterval*)+0xde1) [0x59d241]
 2: (ReplicatedPG::recover_backfill(int)+0x644) [0x5ce874]
 3: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*)+0x82d) [0x5d347d]
 4: (OSD::do_recovery(PG*)+0x3a5) [0x633f95]
 5: (OSD::RecoveryWQ::_process(PG*)+0x11) [0x672921]
 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x835cd6]
 7: (ThreadPool::WorkThread::entry()+0x10) [0x837b00]
 8: (()+0x7e9a) [0x7f949b67ce9a]
 9: (clone()+0x6d) [0x7f9499a7ccbd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

job was
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-04-22_13:41:44-rados-next-testing-basic/111$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: e7fce312d4931de1896feb5392e69e2b0b6a5b92
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject delay max: 1
        ms inject delay probability: 0.005
        ms inject delay type: osd
        ms inject socket failures: 2500
    fs: ext4
    log-whitelist:
    - slow request
    sha1: 70e1e47da0970a4ad5cdd311aaaebf45361d5dff
  s3tests:
    branch: next
  workunit:
    sha1: 70e1e47da0970a4ad5cdd311aaaebf45361d5dff
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000

Actions #1

Updated by Samuel Just almost 11 years ago

  • Status changed from New to Need More Info
  • Priority changed from Urgent to High

This may be an ext4 bug; I suggest we ignore it until we see it again on xfs. I've removed ext4 from the rados and rbd thrashing tests for now, since it has caused other problems and is not actually recommended as a ceph-osd backend.
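
For anyone trying to reproduce this on xfs, the only part of the job config above that should need to change is the filesystem override. A minimal sketch, assuming the rest of the job (roles, tasks, message injection settings) is kept exactly as in the original config and that the ceph task accepts `fs: xfs` in the same place it accepted `fs: ext4`:

```yaml
# Hypothetical override for a re-run on xfs; everything not shown
# here is assumed to stay identical to the original job config.
overrides:
  ceph:
    fs: xfs          # was: ext4 in the failing job
    log-whitelist:
    - slow request
```

This is not the change that was actually committed to the test suite, only an illustration of where the backend filesystem is selected.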

Actions #2

Updated by Sage Weil almost 11 years ago

  • Status changed from Need More Info to Can't reproduce