Bug #24875
closed
OSD: still returning EIO instead of recovering objects on checksum errors
Added by Greg Farnum almost 6 years ago.
Updated over 5 years ago.
Description
A report came in on the mailing list of an MDS journal which couldn't be read and was throwing errors:
2018-07-11 15:49:14.913771 7efbee672700 -1 log_channel(cluster) log [ERR] : 10.14 full-object read crc 0x976aefc5 != expected 0x9ef2b41b on 10:292cf221:::200.00000000:head
And indeed, when you search for that log message it pops up in PrimaryLogPG::do_read() and do_sparse_read() (and also struct FillInVerifyExtent). When it pops up, the function returns -EIO, and do_osd_ops() (which is the only caller) turns that into a direct client return.
There's a comment "try repair later" which makes me think the author expected the EIO to get turned into a read-repair, but tracing back through git history there's no indication of any work done to enable that in this path.
The do_sparse_read() path doesn't attempt to repair a checksum error. Could that be the real issue?
The do_read() path looks fine since it calls rep_repair_primary_object() whether the EIO came from the disk or the crc check.
      if (oi.data_digest != crc) {
        osd->clog->error() << info.pgid << std::hex
                           << " full-object read crc 0x" << crc
                           << " != expected 0x" << oi.data_digest
                           << std::dec << " on " << soid;
        r = -EIO; // try repair later
      }
    }
    if (r == -EIO) {
      r = rep_repair_primary_object(soid, ctx->op);
    }
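For readers who want the control flow in isolation, here is a minimal, self-contained sketch of the pattern the snippet above relies on: a checksum mismatch is mapped to -EIO, and every -EIO (from the disk or from the check) goes through one repair hook. This is not the PrimaryLogPG code. `ObjectInfo`, `toy_checksum`, `read_with_repair`, `read_fn` and `repair_fn` are hypothetical names invented for illustration, and the checksum is FNV-1a rather than the crc the OSD actually uses.

```cpp
#include <cerrno>
#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-in for the object metadata; only the stored digest matters here.
struct ObjectInfo {
  uint32_t data_digest;
};

// Toy checksum (FNV-1a), an assumption standing in for the real crc.
inline uint32_t toy_checksum(const std::string& data) {
  uint32_t h = 2166136261u;
  for (unsigned char c : data) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// read_fn fills `out` and returns 0 on success or a negative errno.
// repair_fn is a hypothetical hook; its return value is passed back to the caller.
inline int read_with_repair(const ObjectInfo& oi,
                            const std::function<int(std::string&)>& read_fn,
                            const std::function<int()>& repair_fn,
                            std::string& out) {
  int r = read_fn(out);
  if (r == 0 && toy_checksum(out) != oi.data_digest) {
    r = -EIO;  // a digest mismatch is treated like a media error
  }
  if (r == -EIO) {
    r = repair_fn();  // single funnel: disk EIO and digest EIO both land here
  }
  return r;  // other errors pass through untouched
}
```

The point of the single `if (r == -EIO)` block is that a read path which returns right after the digest check, without reaching it, hands the raw -EIO to the client, which is the behaviour this ticket describes.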
Ah, the error was reported on luminous, which doesn't do the repair, and I guess I missed it on master. Sorry for the mis-diagnosis.
(Looks like the MDS doesn't use sparse-read, but we should definitely still fix that path too!)
- Priority changed from Normal to High
FTR, this crc issue is probably due to an incomplete backport to 12.2.6 of the skip_digest changes for bluestore:
[12:56:59] <dvanders> regarding the 12.2.6 cephfs crc errors, could it be `b519a0b1c1 osd/PrimaryLogPG: do not generate data digest for BlueStore by default` or one of the other omap/data_digest changes that landed in 12.2.6 ?
[12:59:29] <dvanders> Seems similar to https://tracker.ceph.com/issues/23871 ... which was fixed in mimic but not luminous: `fe5038c7f9 osd/PrimaryLogPG: clear data digest on WRITEFULL if skip_data_digest`
[14:10:16] <sage> dvanders: yeah does seem similar, but i'm not sure why it would manifest during a .5 to .6 upgrade. looking...
[14:10:37] <sage> dvanders: it's definitely bluestore-only?
[14:11:20] <dvanders> i didn't try filestore, but all the clusters i've seen were bluestore
[14:14:56] <sage> ah, it's because the other skip_digest handling code was just backporting/changed
[14:14:59] <sage> backported/changed
This issue is related to https://tracker.ceph.com/issues/23871
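To make the suspected mechanism concrete, here is a toy sketch of one plausible reading of the commit title quoted above ("clear data digest on WRITEFULL if skip_data_digest"): if a full overwrite skips computing a new digest but leaves the old one recorded, a later full-object read compares new data against a stale digest and fails. Everything here is an assumption for illustration. `ObjectState`, `write_full_stale`, `write_full_clearing` and `read_passes_check` are hypothetical names, the checksum is FNV-1a rather than the real crc, and none of this is taken from the actual luminous or mimic code.

```cpp
#include <cstdint>
#include <optional>
#include <string>

// Hypothetical object: payload plus an optional recorded digest.
struct ObjectState {
  std::string data;
  std::optional<uint32_t> data_digest;  // empty means "no digest recorded"
};

// Toy checksum (FNV-1a), an assumption standing in for the real crc.
inline uint32_t toy_checksum(const std::string& data) {
  uint32_t h = 2166136261u;
  for (unsigned char c : data) {
    h ^= c;
    h *= 16777619u;
  }
  return h;
}

// Assumed buggy variant: when skipping, the previous digest is left in place.
inline void write_full_stale(ObjectState& obj, const std::string& data,
                             bool skip_data_digest) {
  obj.data = data;
  if (!skip_data_digest) {
    obj.data_digest = toy_checksum(data);
  }
}

// Assumed fixed variant: when skipping, the recorded digest is cleared.
inline void write_full_clearing(ObjectState& obj, const std::string& data,
                                bool skip_data_digest) {
  obj.data = data;
  if (skip_data_digest) {
    obj.data_digest.reset();
  } else {
    obj.data_digest = toy_checksum(data);
  }
}

// A full-object read only verifies when a digest is actually recorded.
inline bool read_passes_check(const ObjectState& obj) {
  return !obj.data_digest || *obj.data_digest == toy_checksum(obj.data);
}
```

In this model an object written before the upgrade (digest recorded) and overwritten afterwards with skipping enabled fails the read check under the stale variant and passes under the clearing variant. That would match errors appearing only after the .5 to .6 upgrade, though the thread does not confirm the exact code path.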
- Related to Bug #25084: Attempt to read object that can't be repaired loops forever added
- Status changed from New to 12
- Assignee set to David Zafman
- Backport set to mimic, luminous
- Copied to Backport #25226: mimic: OSD: still returning EIO instead of recovering objects on checksum errors added
- Copied to Backport #25227: luminous: OSD: still returning EIO instead of recovering objects on checksum errors added
- Status changed from 12 to In Progress
- Status changed from In Progress to Pending Backport
- Status changed from Pending Backport to Resolved