Bug #12615: Repair of Erasure Coded pool with an unrepairable object causes pg state to lose clean state - RADOS - Ceph

Bug #12615

open

Repair of Erasure Coded pool with an unrepairable object causes pg state to lose clean state

Added by David Zafman almost 9 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee: David Zafman
Category: EC Pools
Target version:
% Done:
Source:
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On a 2 + 1 erasure coded pool in which 2 chunks of a single object are corrupted, running a repair that can't succeed causes the pg to lose its clean state. Once the pg is no longer clean, operations on it hang,
and trying to repair again just causes the scrub to requeue continuously. The EIO from rados get requires the wip-12000-12200 branch changes.

$ rados -p ecpool get foo dz.out3
error getting ecpool/foo: (5) Input/output error
$ ./ceph pg dump pgs | grep ^3.6
dumped pgs in format plain
3.6     1       0       0       0       0       1048576 1       1       active+clean    2015-08-04 16:14:41.607821      16'1    16:8    [0,1,2] 0       [0,1,2] 0       0'0     2015-08-04 16:14:40.526211      0'0     2015-08-04 16:14:40.526211
$ ceph pg repair 3.6
instructing pg 3.6 on osd.0 to repair
$ ceph pg dump pgs | grep ^3.6
dumped pgs in format plain
3.6     1       1       4       0       1       1048576 1       1       active  2015-08-04 16:15:39.659583      16'1    16:10   [0,1,2] 0       [0,1,2] 0       16'1    2015-08-04 16:15:39.659434      16'1    2015-08-04 16:15:39.659434
[~/ceph/src] (wip-12000-12200-new)
$ ./rados -p ecpool get foo dz.out3
^C

To get to active+clean, I removed the broken object from the filestore and restarted the osd.
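
The before/after state check in the transcript above could be scripted roughly as follows. This is only a sketch: the helper names `pg_state_from_dump` and `pg_is_clean` are made up for illustration, and the assumption that the state is the 10th column is taken from the `ceph pg dump pgs` output shown in this report; other Ceph releases may order the columns differently.

```shell
# Hypothetical helper: read `ceph pg dump pgs` plain output on stdin and
# print the state column for the given pgid.
# Assumption: the state is the 10th whitespace-separated field, as in the
# output pasted in this report.
pg_state_from_dump() {
    awk -v pg="$1" '$1 == pg { print $10 }'
}

# Hypothetical helper: succeed only if "clean" is one of the "+"-separated
# flags in the state string (e.g. "active+clean"), fail otherwise.
pg_is_clean() {
    case "$1" in
        clean|clean+*|*+clean|*+clean+*) return 0 ;;
        *) return 1 ;;
    esac
}
```

With these defined, something like `state=$(ceph pg dump pgs | pg_state_from_dump 3.6); pg_is_clean "$state" || echo "pg 3.6 is $state"` would flag the pg after the failed repair, where the state shown above dropped from active+clean to active.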

Related issues 1 (0 open — 1 closed)


Updated by David Zafman almost 9 years ago

Updated by David Zafman almost 9 years ago

Updated by Sage Weil over 8 years ago

Updated by David Zafman over 8 years ago

Updated by Greg Farnum almost 7 years ago

Updated by David Zafman almost 7 years ago

Updated by David Zafman over 5 years ago

Updated by David Zafman over 5 years ago

Updated by Patrick Donnelly over 4 years ago