Bug #4254: osd: failure to recover before timeout on rados bench and thrashing; negative stats - Ceph - Ceph

closed

Added by Sage Weil about 11 years ago. Updated about 9 years ago.

Status: Resolved
Priority: High
Assignee: Guang Yang
Category: OSD
Source: Q/A
Severity: 3 - minor

Description

Snippet from the teuthology log (note the negative degraded count):

2013-02-23T10:50:34.510 INFO:teuthology.task.thrashosds.ceph_manager:   health HEALTH_WARN 1 pgs backfilling; 1 pgs stuck unclean; recovery -706/10675 degraded (-6.614%);  recovering 0 o/s, 1947KB/s
   monmap e1: 3 mons at {a=10.214.132.28:6789/0,b=10.214.132.27:6789/0,c=10.214.132.28:6790/0}, election epoch 6, quorum 0,1,2 a,b,c
   osdmap e170: 6 osds: 6 up, 5 in
    pgmap v2442: 72 pgs: 71 active+clean, 1 active+remapped+backfilling; 20592 MB data, 43484 MB used, 2746 GB / 2794 GB avail; 0B/s wr, 4op/s; -706/10675 degraded (-6.614%);  recovering 0 o/s, 1947KB/s
   mdsmap e5: 1/1/1 up {0=a=up:active}
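The negative degraded figure in the status output above can be detected mechanically. The following is a minimal sketch, not code from Ceph or teuthology: the helper names `parse_degraded` and `check_degraded` are hypothetical, and the regular expression assumes the `N/M degraded (P%)` wording shown in this log. It extracts the counts, flags a negative numerator, and checks that the reported percentage agrees with the counts.

```python
import re

# Matches the "<degraded>/<total> degraded (<pct>%)" fragment of a status line.
# The wording is taken from the log in this report; other releases may differ.
DEGRADED_RE = re.compile(r"(-?\d+)/(\d+) degraded \((-?\d+(?:\.\d+)?)%\)")


def parse_degraded(line):
    """Return (degraded, total, reported_pct), or None if the fragment is absent."""
    m = DEGRADED_RE.search(line)
    if m is None:
        return None
    return int(m.group(1)), int(m.group(2)), float(m.group(3))


def check_degraded(line):
    """Return a list of human-readable problems found in the degraded stats."""
    parsed = parse_degraded(line)
    if parsed is None:
        return []
    degraded, total, reported_pct = parsed
    problems = []
    if degraded < 0:
        # A degraded object count can never legitimately be below zero.
        problems.append("negative degraded count: %d" % degraded)
    if total > 0:
        expected_pct = round(100.0 * degraded / total, 3)
        if abs(expected_pct - reported_pct) > 0.0015:
            problems.append(
                "percentage mismatch: reported %s, expected %s"
                % (reported_pct, expected_pct)
            )
    return problems


# The pgmap line from this report.
line = (
    "pgmap v2442: 72 pgs: 71 active+clean, 1 active+remapped+backfilling; "
    "20592 MB data, 43484 MB used, 2746 GB / 2794 GB avail; 0B/s wr, 4op/s; "
    "-706/10675 degraded (-6.614%);  recovering 0 o/s, 1947KB/s"
)
print(check_degraded(line))
```

For the line in this report the percentage is internally consistent (-706 / 10675 is about -6.614%), so the only problem flagged is the negative count itself.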

The job was:

ubuntu@teuthology:/a/sage-2013-02-23_08:44:35-regression-master-testing-basic/10262$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: 92a49fb0f79f3300e6e50ddf56238e70678e4202
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      osd:
        osd op thread timeout: 60
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 704db850131643b26bafe6594946cacce483c171
  s3tests:
    branch: master
  workunit:
    sha1: 704db850131643b26bafe6594946cacce483c171
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- radosbench:
    clients:
    - client.0
    time: 1800
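The failure mode in the title, recovery not completing before a timeout, can be illustrated with a generic polling loop. This is only a sketch under stated assumptions: it is not teuthology's implementation, `wait_for_clean` and `get_pg_states` are hypothetical names, and it assumes the `timeout: 1200` value under `thrashosds` bounds, in seconds, how long the test waits for all PGs to reach `active+clean`. The clock and sleep functions are injectable so the loop can be exercised without real waiting.

```python
import time


def wait_for_clean(get_pg_states, timeout, interval=3.0,
                   clock=time.monotonic, sleep=time.sleep):
    """Poll until every PG reports exactly 'active+clean'.

    get_pg_states: hypothetical callable returning a list of PG state strings.
    timeout:       maximum number of seconds to wait.
    Returns True if the cluster became clean, False if the timeout expired.
    """
    deadline = clock() + timeout
    while True:
        states = get_pg_states()
        if states and all(state == "active+clean" for state in states):
            return True
        if clock() >= deadline:
            return False
        sleep(interval)


# Simulated time, so the example finishes instantly.
now = [0.0]


def fake_clock():
    return now[0]


def fake_sleep(seconds):
    now[0] += seconds


def stuck_cluster():
    # Mirrors the pgmap in this report: 71 clean PGs, 1 stuck backfilling.
    return ["active+clean"] * 71 + ["active+remapped+backfilling"]


recovered = wait_for_clean(stuck_cluster, timeout=1200,
                           clock=fake_clock, sleep=fake_sleep)
print(recovered, now[0])
```

With one PG permanently in `active+remapped+backfilling`, as in the pgmap above, the loop gives up once the simulated 1200 seconds have elapsed and returns `False`.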

Related issues 2 (0 open — 2 closed)

Updated by Ian Colle about 11 years ago

Updated by Tamilarasi muthamizhan about 11 years ago

Updated by Tamilarasi muthamizhan about 11 years ago

Updated by Tamilarasi muthamizhan about 11 years ago

Updated by Ian Colle about 11 years ago

Updated by Ian Colle about 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Samuel Just about 11 years ago

Updated by Noah Watkins over 10 years ago

Updated by Zhi Zhang over 9 years ago

Updated by Guang Yang about 9 years ago

Updated by Guang Yang about 9 years ago

Updated by Guang Yang about 9 years ago

Updated by Samuel Just about 9 years ago

Updated by Sage Weil about 9 years ago

Updated by Guang Yang about 9 years ago