Feature #47718

Introduce means to detect/work around spurious read errors in BlueFS

Added by Igor Fedotov 9 months ago. Updated 11 days ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version:
% Done: 0%
Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

We have seen such errors for user data on the main device and worked around them by retrying the read; see https://tracker.ceph.com/issues/22464

But the DB volume (and hence RocksDB) is still exposed to this issue, and it is hard for us to diagnose it properly.
Hence I suggest introducing a detector that triggers on read blocks consisting entirely of zeros and unconditionally retries such reads. If the all-zeros pattern does not persist on retry, this indicates the same kind of spurious read error. The detector would be disabled by default and turned on when suspicious read errors are observed.

Unfortunately, this looks like the only solution available so far that requires no updates to RocksDB...
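
The detector described above might look roughly like the following. This is only an illustrative sketch, not the BlueFS code: `ReadFn`, `ZeroReadResult`, `read_with_zero_detector`, the `detector_enabled` flag and the `max_retries` bound are all hypothetical names and assumptions introduced here (the description itself does not say how many times a read should be retried).

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <utility>
#include <vector>

// Hypothetical read callback: fills `out` with `len` bytes read at `offset`
// and returns 0 on success or a negative error code.
using ReadFn =
    std::function<int(uint64_t offset, size_t len, std::vector<uint8_t>& out)>;

struct ZeroReadResult {
  int rc = 0;             // return code of the last read attempt
  int retries = 0;        // number of extra reads issued by the detector
  bool spurious = false;  // true if an all-zeros read became non-zero on retry
};

inline bool is_all_zeros(const std::vector<uint8_t>& buf) {
  return std::all_of(buf.begin(), buf.end(),
                     [](uint8_t b) { return b == 0; });
}

// Reads a block; if the detector is enabled and the block is all zeros,
// re-reads it up to `max_retries` times (an assumed bound). A retry that
// returns non-zero data marks the first read as spurious.
inline ZeroReadResult read_with_zero_detector(const ReadFn& read,
                                              uint64_t offset, size_t len,
                                              std::vector<uint8_t>& out,
                                              bool detector_enabled = false,
                                              int max_retries = 3) {
  ZeroReadResult res;
  res.rc = read(offset, len, out);
  if (res.rc < 0 || !detector_enabled || out.empty() || !is_all_zeros(out)) {
    return res;  // read failed, detector is off, or data looks normal
  }
  while (res.retries < max_retries) {
    ++res.retries;
    std::vector<uint8_t> again;
    res.rc = read(offset, len, again);
    if (res.rc < 0) {
      return res;  // hard error on retry; report it to the caller
    }
    if (!is_all_zeros(again)) {
      out = std::move(again);  // zeros did not persist: spurious read error
      res.spurious = true;
      return res;
    }
  }
  return res;  // zeros persisted; the block is most likely genuinely zeroed
}
```

A caller would log or count the cases where `spurious` is true, since those are the events that are otherwise hard to diagnose. With `detector_enabled` left at its default of `false` the function performs a single plain read, matching the "disabled by default" behaviour suggested above.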


Related issues

Related to bluestore - Bug #47271: ceph version 14.2.10-OSD fails New

History

#1 Updated by Igor Fedotov 9 months ago

  • Related to Bug #47271: ceph version 14.2.10-OSD fails added

#2 Updated by Igor Fedotov 11 days ago

  • Status changed from New to Resolved
  • Target version set to v17.0.0
  • Pull request ID set to 39185
