Bug #4791
osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0) in scan_range
Status: Can't reproduce
Priority: High
Assignee: -
Category: OSD
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
0> 2013-04-22 15:11:45.138245 7f946bfff700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::scan_range(hobject_t, int, int, PG::BackfillInterval*)' thread 7f946bfff700 time 2013-04-22 15:11:45.136128
osd/ReplicatedPG.cc: 7053: FAILED assert(r >= 0)

 ceph version 0.60-598-g70e1e47 (70e1e47da0970a4ad5cdd311aaaebf45361d5dff)
 1: (ReplicatedPG::scan_range(hobject_t, int, int, PG::BackfillInterval*)+0xde1) [0x59d241]
 2: (ReplicatedPG::recover_backfill(int)+0x644) [0x5ce874]
 3: (ReplicatedPG::start_recovery_ops(int, PG::RecoveryCtx*)+0x82d) [0x5d347d]
 4: (OSD::do_recovery(PG*)+0x3a5) [0x633f95]
 5: (OSD::RecoveryWQ::_process(PG*)+0x11) [0x672921]
 6: (ThreadPool::worker(ThreadPool::WorkThread*)+0x4e6) [0x835cd6]
 7: (ThreadPool::WorkThread::entry()+0x10) [0x837b00]
 8: (()+0x7e9a) [0x7f949b67ce9a]
 9: (clone()+0x6d) [0x7f9499a7ccbd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
job was
ubuntu@teuthology:/var/lib/teuthworker/archive/teuthology-2013-04-22_13:41:44-rados-next-testing-basic/111$ cat orig.config.yaml
kernel:
  kdb: true
  sha1: e7fce312d4931de1896feb5392e69e2b0b6a5b92
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject delay max: 1
        ms inject delay probability: 0.005
        ms inject delay type: osd
        ms inject socket failures: 2500
    fs: ext4
    log-whitelist:
    - slow request
    sha1: 70e1e47da0970a4ad5cdd311aaaebf45361d5dff
  s3tests:
    branch: next
  workunit:
    sha1: 70e1e47da0970a4ad5cdd311aaaebf45361d5dff
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
Updated by Samuel Just about 11 years ago
- Status changed from New to Need More Info
- Priority changed from Urgent to High
This may be an ext4 bug; I suggest we ignore it until we see it again on xfs. I've removed ext4 from the rados and rbd thrashing tests for now, since it has caused other problems and is not actually recommended as a ceph-osd backend.
Updated by Sage Weil almost 11 years ago
- Status changed from Need More Info to Can't reproduce