Feature #9943


osd: mark pg and use replica on EIO from client read

Added by Guang Yang over 9 years ago. Updated over 9 years ago.

Status: In Progress
Priority: Normal
Assignee: Wei Luo
Category: -
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

Copying the email thread below to track this enhancement.

Date: Wed, 29 Oct 2014 08:11:01 -0700
From: sage@newdream.net
To: yguang11@outlook.com
CC: ceph-devel@vger.kernel.org
Subject: Re: OSD crashed due to filestore EIO

On Wed, 29 Oct 2014, GuangYang wrote:
> Recently we observed an OSD crash due to file corruption in the
> filesystem, which leads to an assertion failure in FileStore::read, as
> EIO is not tolerated. Since file corruption is normal in large
> deployments, I am wondering whether that behavior is too aggressive,
> especially for an EC pool.
>
> After searching, I found a flag that might help: filestore_fail_eio,
> which can make the OSD survive an EIO failure; it is true by default,
> though. I haven't tested it yet.

 That will remove the immediate assert. Currently, for an object being
 read by a client, it will just pass EIO back to the client, though, which
 is clearly not what we want.
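 For reference, a minimal ceph.conf sketch of the flag mentioned above;
 setting it to false (the default is true) is an assumption about how one
 would suppress the immediate assert:

    [osd]
    # Assumed usage: with this set to false, FileStore returns EIO to the
    # caller instead of asserting (default: true, i.e. fail on EIO).
    filestore_fail_eio = false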

> Does it make sense to adjust the behavior a little bit: if the
> filestore read fails due to file corruption, return the failure and at
> the same time mark the PG as inconsistent? Due to the redundancy
> (replication or EC), the request can still be served, and at the same
> time we can get an alert saying there is an inconsistency and manually
> trigger a PG repair.

 That would be ideal, yeah. I think that initially it makes sense to do
 *just that read* via a replica but let the admin trigger the repair.
 This most closely mirrors what scrub currently does on EIO (mark
 inconsistent but let the admin repair). Later, when we support automatic
 repair, that option can affect both scrub and client-triggered EIOs.
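 For context, this is the admin-triggered flow that exists today when
 scrub finds an inconsistency; the PG id 2.5 below is a made-up
 placeholder:

    # The inconsistent PG shows up in health output, e.g.
    # "pg 2.5 is active+clean+inconsistent"
    ceph health detail

    # The admin triggers repair of that PG by hand.
    ceph pg repair 2.5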

 We just need to be careful that any EIO on *metadata* still triggers a
 failure as we need to be especially careful about handling that. IIRC
 there is a flag passed to read indicating whether EIO is okay; we should
 probably use that so that EIO-ok vs EIO-notok cases are still clearly
 annotated.
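Pulling the thread together, here is a minimal C++ sketch of the proposed
behavior; every name in it (do_read, PG::mark_inconsistent,
read_from_replica, allow_eio) is an illustrative stand-in, not actual Ceph
internals. A data read that hits EIO marks the PG inconsistent and is
served from a replica, while a metadata read (allow_eio == false) keeps
the hard failure:

    #include <cassert>
    #include <cerrno>
    #include <cstdint>
    #include <string>
    #include <vector>

    using Buffer = std::vector<uint8_t>;

    struct PG {
      bool inconsistent = false;
      // Marking raises an alert so the admin can trigger a repair.
      void mark_inconsistent() { inconsistent = true; }
    };

    // Stand-ins for the local object store and the replication layer.
    static int local_store_read(const std::string&, uint64_t, uint64_t,
                                Buffer*) {
      return -EIO;  // simulate reading a corrupted local object
    }
    static int read_from_replica(const std::string&, uint64_t, uint64_t len,
                                 Buffer* out) {
      out->assign(len, 0);  // pretend a replica returned the data
      return 0;
    }

    // allow_eio distinguishes object-data reads (EIO tolerable, serve via
    // redundancy) from metadata reads (EIO must still fail hard).
    static int do_read(PG* pg, const std::string& oid, uint64_t off,
                       uint64_t len, Buffer* out, bool allow_eio) {
      int r = local_store_read(oid, off, len, out);
      if (r == -EIO) {
        assert(allow_eio);        // metadata EIO: keep the hard failure
        pg->mark_inconsistent(); // flag the PG instead of crashing the OSD
        r = read_from_replica(oid, off, len, out);  // serve just this read
      }
      return r;
    }

    int main() {
      PG pg;
      Buffer buf;
      int r = do_read(&pg, "obj.0001", 0, 4096, &buf, /*allow_eio=*/true);
      return (r == 0 && pg.inconsistent) ? 0 : 1;
    }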


Related issues: 1 (0 open, 1 closed)

Related to Ceph - Bug #8588: In the erasure-coded pool, primary OSD will crash at decoding if any data chunk's size is changed (Duplicate, 06/11/2014)
