Bug #24875

OSD: still returning EIO instead of recovering objects on checksum errors

Added by Greg Farnum 5 months ago. Updated 4 months ago.

Status: Resolved
Priority: High
Assignee:
Category: Scrub/Repair
Target version: -
Start date: 07/11/2018
Due date:
% Done: 0%
Source:
Tags:
Backport: mimic, luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:

Description

A report came in on the mailing list about an MDS journal that couldn't be read and was throwing errors:

2018-07-11 15:49:14.913771 7efbee672700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.00000000:head

Sure enough, searching for that log message shows it is emitted by PrimaryLogPG::do_read() and do_sparse_read() (and also by struct FillInVerifyExtent). When it fires, the function returns -EIO, and do_osd_ops() (the only caller) passes that straight back to the client.
There's a comment "try repair later" which suggests the author expected the EIO to be turned into a read-repair, but tracing back through git history there's no indication of any work done to enable that in this path.
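In rough C++, the difference between the buggy path (return the checksum EIO straight to the client) and the intended path (attempt recovery of the object first) might be sketched like this. All names here (EIO_ERR, ReadResult, the *_stub functions) are illustrative stand-ins, not Ceph's actual signatures:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch only -- not Ceph's actual code.
constexpr int EIO_ERR = -5; // -EIO

struct ReadResult {
  int r;               // value ultimately handed back by do_osd_ops()
  bool repair_queued;  // did we kick off object recovery instead?
};

// stand-in for the full-object crc comparison in do_read()
int verify_full_object_crc(uint32_t stored_digest, uint32_t computed) {
  return stored_digest == computed ? 0 : EIO_ERR; // "try repair later"
}

// stand-in for rep_repair_primary_object(): queue recovery from a
// replica and requeue the op, or give up if nothing can be repaired
int rep_repair_primary_object_stub(bool can_repair) {
  return can_repair ? 0 : EIO_ERR;
}

ReadResult do_read_sketch(uint32_t stored, uint32_t computed,
                          bool can_repair) {
  int r = verify_full_object_crc(stored, computed);
  if (r == EIO_ERR) {
    // the buggy path returned r to the client right here;
    // the intended path tries repair first
    int rr = rep_repair_primary_object_stub(can_repair);
    return {rr, rr == 0};
  }
  return {r, false};
}
```

With a healthy replica available, the checksum EIO is absorbed by recovery and the client never sees it; only when repair itself fails does -EIO propagate.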


Related issues

Related to RADOS - Bug #25084: Attempt to read object that can't be repaired loops forever Resolved 07/24/2018
Copied to RADOS - Backport #25226: mimic: OSD: still returning EIO instead of recovering objects on checksum errors Resolved
Copied to RADOS - Backport #25227: luminous: OSD: still returning EIO instead of recovering objects on checksum errors Resolved

History

#1 Updated by David Zafman 5 months ago

The do_sparse_read() path doesn't attempt to repair a checksum error. Could that be the real issue?

The do_read() path looks fine since it calls rep_repair_primary_object() whether the EIO came from the disk or the crc check.

      if (oi.data_digest != crc) {
        osd->clog->error() << info.pgid << std::hex
                           << " full-object read crc 0x" << crc
                           << " != expected 0x" << oi.data_digest
                           << std::dec << " on " << soid;
        r = -EIO; // try repair later
      }
    }
    if (r == -EIO) {
      r = rep_repair_primary_object(soid, ctx->op);
    }

#2 Updated by Greg Farnum 5 months ago

Ah, the error was reported on luminous, which doesn't do the repair, and I guess I missed it on master. Sorry for the misdiagnosis.

(Looks like the MDS doesn't use sparse-read, but we should definitely still fix that path too!)

#3 Updated by Josh Durgin 5 months ago

  • Priority changed from Normal to High

#4 Updated by Dan van der Ster 5 months ago

Is this the relevant fix? https://github.com/ceph/ceph/commit/4667280f8afe6cd68dfffea61d7530581f3dd0eb

Alessandro's OSDs are bluestore, and he doesn't get any bluestore block checksum errors. So the crc can be wrong when the bluestore data is correct?

Will the above patch correct this type of crc error?

Also, we deep-scrubbed the PG and it didn't find any inconsistent objects.

#5 Updated by Dan van der Ster 5 months ago

FTR, this crc issue is probably due to an incomplete backport to 12.2.6 of the skip_digest changes for bluestore:

[12:56:59]  <dvanders> regarding the 12.2.6 cephfs crc errors, could it be `b519a0b1c1 osd/PrimaryLogPG: do not generate data digest for BlueStore by default`  or one of the other omap/data_digest changes that landed in 12.2.6 ?
[12:59:29]  <dvanders> Seems similar to https://tracker.ceph.com/issues/23871 ... which was fixed in mimic but not luminous: `fe5038c7f9 osd/PrimaryLogPG: clear data digest on WRITEFULL if skip_data_digest`
[14:10:16]  <sage>    dvanders: yeah does seem similar, but i'm not sure why it would manifest during a .5 to .6 upgrade.  looking...
[14:10:37]  <sage>    dvanders: it's definitely bluestore-only?
[14:11:20]  <dvanders>    i didn't try filestore, but all the clusters i've seen were bluestore
[14:14:56]  <sage>    ah, it's because the other skip_digest handling code was just backporting/changed
[14:14:59]  <sage>    backported/changed

This issue is related to https://tracker.ceph.com/issues/23871
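The stale-digest mechanism described above can be sketched as follows. The names (DIGEST_UNSET, write_full, check_read) are hypothetical; the point is only that a write path which skips computing the whole-object digest must also clear the stored one, or a later read will compare fresh data against a stale digest and report a bogus crc error:

```cpp
#include <cassert>
#include <cstdint>

// Illustrative sketch only -- not Ceph's actual object_info handling.
constexpr uint32_t DIGEST_UNSET = 0xffffffff; // sentinel: no digest recorded

struct ObjectInfo {
  uint32_t data_digest = DIGEST_UNSET;
};

// full-object write: either record the new digest, or (when digests
// are skipped, e.g. because the store checksums for us) clear it --
// the incomplete backport left the old digest in place instead
void write_full(ObjectInfo &oi, uint32_t new_digest, bool skip_data_digest) {
  if (skip_data_digest)
    oi.data_digest = DIGEST_UNSET; // the fix: clear, don't leave stale
  else
    oi.data_digest = new_digest;
}

// read-side check: only flag EIO when a digest is actually recorded
int check_read(const ObjectInfo &oi, uint32_t computed) {
  if (oi.data_digest != DIGEST_UNSET && oi.data_digest != computed)
    return -5; // -EIO, "full-object read crc ... != expected ..."
  return 0;
}
```

This also matches the symptom Dan saw: the on-disk data is fine (no BlueStore checksum errors, clean deep-scrub), yet reads fail the object-info digest comparison.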

#6 Updated by David Zafman 5 months ago

  • Related to Bug #25084: Attempt to read object that can't be repaired loops forever added

#7 Updated by David Zafman 5 months ago

  • Status changed from New to Verified
  • Assignee set to David Zafman

#8 Updated by David Zafman 5 months ago

  • Backport set to mimic, luminous

#9 Updated by David Zafman 5 months ago

  • Copied to Backport #25226: mimic: OSD: still returning EIO instead of recovering objects on checksum errors added

#10 Updated by David Zafman 5 months ago

  • Copied to Backport #25227: luminous: OSD: still returning EIO instead of recovering objects on checksum errors added

#12 Updated by David Zafman 5 months ago

  • Status changed from Verified to In Progress

#13 Updated by Kefu Chai 4 months ago

  • Status changed from In Progress to Pending Backport

#14 Updated by Nathan Cutler 4 months ago

  • Status changed from Pending Backport to Resolved
