Bug #12615

open

Repair of Erasure Coded pool with an unrepairable object causes pg to lose clean state

Added by David Zafman over 8 years ago. Updated over 4 years ago.

Status:
New
Priority:
Normal
Assignee:
David Zafman
Category:
EC Pools
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In an erasure coded pool (k=2, m=1) where 2 chunks of a single object are corrupted, running a repair that cannot succeed causes the pg to lose its clean state. Because the pg stays unclean, operations hang,
and trying to repair again just causes scrub to requeue continuously. The EIO from rados get requires the wip-12000-12200 branch changes.

$ rados -p ecpool get foo dz.out3
error getting ecpool/foo: (5) Input/output error
$ ./ceph pg dump pgs | grep ^3.6
dumped pgs in format plain
3.6     1       0       0       0       0       1048576 1       1       active+clean    2015-08-04 16:14:41.607821      16'1    16:8    [0,1,2] 0       [0,1,2] 0       0'0     2015-08-04 16:14:40.526211      0'0     2015-08-04 16:14:40.526211
$ ceph pg repair 3.6
instructing pg 3.6 on osd.0 to repair
$ ceph pg dump pgs | grep ^3.6
dumped pgs in format plain
3.6     1       1       4       0       1       1048576 1       1       active  2015-08-04 16:15:39.659583      16'1    16:10   [0,1,2] 0       [0,1,2] 0       16'1    2015-08-04 16:15:39.659434      16'1    2015-08-04 16:15:39.659434
[~/ceph/src] (wip-12000-12200-new)
$ ./rados -p ecpool get foo dz.out3
^C

To get to active+clean, I removed the broken object from the filestore and restarted the osd.
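
For reference, a rough reproduction sketch (assuming a test/vstart cluster with direct access to the OSD data directories; the profile name "ec21" and the file path are placeholders, and pg 3.6 is simply the pg from the transcript above):

$ ceph osd erasure-code-profile set ec21 k=2 m=1
$ ceph osd pool create ecpool 8 8 erasure ec21
$ rados -p ecpool put foo /path/to/1MB.file
# corrupt 2 of the 3 shards of "foo" directly in the OSD data directories
# (paths depend on the cluster layout), then locate and repair its pg:
$ ceph osd map ecpool foo
$ ceph pg deep-scrub 3.6
$ ceph pg repair 3.6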


Related issues 1 (0 open, 1 closed)

Related to RADOS - Bug #25084: Attempt to read object that can't be repaired loops forever (Resolved, David Zafman, 07/24/2018)

Actions #1

Updated by David Zafman over 8 years ago

  • Description updated (diff)
Actions #2

Updated by David Zafman over 8 years ago

  • Description updated (diff)
Actions #3

Updated by Sage Weil over 8 years ago

  • Assignee set to David Zafman
Actions #4

Updated by David Zafman over 8 years ago

  • Status changed from New to 12

The cause of this is as follows:

PG::scrub_process_inconsistent() clears PG_STATE_CLEAN during repair because a later DoRecovery will be initiated to fix the bad replicas/shards. This code works perfectly for replicated pools because we only consider objects that have an authoritative copy. With erasure coded pools we also need enough intact shards to repair. Another issue is that we count these objects as "fixed" even though we won't know until later whether the repair actually worked. I don't like this because, for example, an OSD could go down and the repair would not be able to proceed.

scrub_finish() outputs a message like "1.6 repair 2 errors, 2 fixed" even though nothing has been fixed yet. It initiates a DoRecovery() state change for the PG.

Normally the PG will be marked CLEAN at the end of the recovery process, which is when the repairs actually happen.
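
One way to see that the "fixed" count is premature (pg 3.6 is the pg from the transcript above; rados list-inconsistent-obj assumes a release that supports it and a fresh scrub) is to scrub again after the repair claims success:

$ ceph pg repair 3.6                    # logs "repair 2 errors, 2 fixed"
$ ceph pg deep-scrub 3.6
$ rados list-inconsistent-obj 3.6 --format=json-pretty   # expected to still show the inconsistency, since the repair can't actually succeed here
$ ceph pg 3.6 query | grep '"state"'    # pg is still not active+clean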

Actions #5

Updated by Greg Farnum almost 7 years ago

  • Project changed from Ceph to RADOS
  • Category set to EC Pools
  • Component(RADOS) OSD added

David, is this still an issue?

Actions #6

Updated by David Zafman almost 7 years ago

This will be fixed when we move repair out of the OSD. We shouldn't be using recovery to do repair anyway.

Actions #7

Updated by David Zafman over 5 years ago

  • Related to Bug #25084: Attempt to read object that can't be repaired loops forever added
Actions #8

Updated by David Zafman over 5 years ago

In a replicated case in which all copies are bad, rep_repair_primary_object() can cause loss of the clean state and leave the pg in recovery_unfound instead.
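
For that replicated variant, a rough way to inspect and then clear the stuck state would be something like the following (the pg id is illustrative, and mark_unfound_lost is destructive: the object's data is lost):

$ ceph health detail                     # shows the pg with unfound objects
$ ceph pg 3.6 list_unfound               # list the objects recovery is stuck on
$ ceph pg 3.6 mark_unfound_lost delete   # give up on the unfound object (data loss)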

Actions #9

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New