Bug #4217

closed

osd: recovery hangs indefinitely

Added by Sage Weil about 11 years ago. Updated about 11 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
OSD
Target version:
-
% Done:
0%
Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This has been popping up in QA and has been mistakenly interpreted as merely slow recovery, but recovery is in fact blocking indefinitely.
The job is:

interactive-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
      osd:
        debug osd: 20
        debug filestore: 20
        debug ms: 1
    fs: ext4
    log-whitelist:
    - slow request
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
  - client.0
tasks:
- chef: null
- clock: null
- install: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- ceph-fuse: null
- workunit:
    clients:
      client.0:
      - rados/test.sh

and it is currently in the stuck state on plana18.
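
For triage, one rough way to tell "just slow" from "blocked indefinitely" is to sample a progress counter over time and check whether it has stopped moving. The sketch below is only an illustration: `is_stalled`, its `samples` input (a hypothetical series of "objects still to recover" counts, oldest first) and the `window` size are assumptions, not part of Ceph or teuthology.

```python
from typing import Sequence


def is_stalled(samples: Sequence[int], window: int) -> bool:
    """Return True if the last `window` samples show no progress at all.

    `samples` is a hypothetical series of remaining-work counts, oldest
    first. Slow recovery still lowers the count between samples; a hang
    leaves it flat at a non-zero value.
    """
    if window < 2 or len(samples) < window:
        return False  # not enough data to call it a hang
    recent = samples[-window:]
    # Flat at zero means recovery finished, not that it hung.
    return recent[0] > 0 and all(s == recent[0] for s in recent)
```

With a window of 3, the series `[100, 90, 90, 90]` would be flagged as stalled, while `[100, 95, 92, 91]` would be treated as slow but still progressing.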

Actions #1

Updated by Sage Weil about 11 years ago

  • Assignee set to Sage Weil

Actions #2

Updated by Sage Weil about 11 years ago

  • Status changed from 12 to Fix Under Review

wip-4217 now passes the test

Actions #3

Updated by Sage Weil about 11 years ago

  • Status changed from Fix Under Review to Resolved