Feature #47718
Introduce means to detect/work around spurious read errors in BlueFS
Status: closed
% Done: 0%
Source:
Tags:
Backport:
Description
We have seen spurious read errors for user data on the main device and worked around them by retrying the read; see https://tracker.ceph.com/issues/22464
However, the DB volume (and hence RocksDB) is still exposed to this issue, and it is hard to diagnose properly.
Hence I suggest introducing a detector that triggers on all-zero read blocks and unconditionally retries such reads. If the all-zeros pattern does not persist on retry, this indicates the same spurious read error. The detector is to be disabled by default and turned on when suspicious read errors are observed.
Unfortunately, this looks like the only solution available so far that requires no changes to RocksDB...
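To make the proposal concrete, here is a minimal sketch of the detect-and-retry logic in Python. It is not BlueFS code: the names `read_with_zero_retry`, `enabled`, `max_retries` and the `FlakyDevice` stand-in are all hypothetical, and the retry count is an assumption, since the description above does not specify one.

```python
from typing import Callable, Tuple


def is_all_zeros(buf: bytes) -> bool:
    """Return True if the buffer is non-empty and contains only zero bytes."""
    return len(buf) > 0 and buf.count(0) == len(buf)


def read_with_zero_retry(
    read_fn: Callable[[int, int], bytes],
    offset: int,
    length: int,
    enabled: bool = False,   # hypothetical switch; off by default as proposed
    max_retries: int = 3,    # assumed value, not taken from the ticket
) -> Tuple[bytes, bool]:
    """Read a block and, if it is all zeros, retry unconditionally.

    Returns (data, spurious). `spurious` is True when the first read was all
    zeros but a retry returned different data, which is the signature of the
    spurious read error described above.
    """
    buf = read_fn(offset, length)
    if not enabled or not is_all_zeros(buf):
        return buf, False
    for _ in range(max_retries):
        retry = read_fn(offset, length)
        if not is_all_zeros(retry):
            # The zeros did not persist: report a spurious read error.
            return retry, True
    # The zeros persisted across all retries: treat them as real data.
    return buf, False


class FlakyDevice:
    """Hypothetical stand-in for a device that returns zeros for its first reads."""

    def __init__(self, data: bytes, bad_reads: int) -> None:
        self.data = data
        self.bad_reads = bad_reads
        self.calls = 0

    def read(self, offset: int, length: int) -> bytes:
        self.calls += 1
        if self.bad_reads > 0:
            self.bad_reads -= 1
            return bytes(length)  # simulate a spurious all-zeros read
        return self.data[offset:offset + length]


if __name__ == "__main__":
    dev = FlakyDevice(b"\x01" * 4096, bad_reads=1)
    data, spurious = read_with_zero_retry(dev.read, 0, 4096, enabled=True)
    print("spurious read detected:", spurious, "after", dev.calls, "reads")
```

One cost of this approach is visible in the sketch: a block that legitimately contains only zeros is re-read `max_retries` times whenever the detector is on, which is one reason to keep it disabled by default.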