Feature #40021: Zero out pending sectors during PG repair - bluestore - Ceph

Feature #40021

open

Added by Dan van der Ster almost 5 years ago.

Status:

New

Priority:

Normal

Assignee:

Target version:

% Done:

Source:

Community (user)

Tags:

Backport:

Reviewed:

Affected Versions:

Pull request ID:

Description

In our production clusters we often see inconsistent objects that result from a SCSI Medium Error during a read from rotational media.

According to a WD engineer we checked with, these errors are usually a symptom of weak writes, caused by vibration etc. at the time the sector was written. The HDD is usually not bad per se, and the affected sectors may still be good and usable.

SMART reports the number of potentially bad sectors in the Current_Pending_Sector attribute (ID 197). When one of these pending sectors is written to again, the write may succeed (decrementing Current_Pending_Sector) or, if the sector really is bad, the drive will reallocate it (incrementing the Reallocated_Sector_Ct attribute).
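To illustrate how the two counters could be compared before and after a rewrite, here is a minimal sketch. It parses text in the table layout printed by `smartctl -A`. The sample values are made up, and `parse_smart_raw` and `classify_rewrite` are hypothetical helper names, not part of Ceph or smartmontools.

```python
import re

# Illustrative excerpt in the layout printed by `smartctl -A`; values are made up.
SAMPLE = """\
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  5 Reallocated_Sector_Ct   0x0033   100   100   010    Pre-fail  Always       -       0
197 Current_Pending_Sector  0x0012   100   100   000    Old_age   Always       -       8
"""

def parse_smart_raw(text, names=("Reallocated_Sector_Ct", "Current_Pending_Sector")):
    """Return {attribute_name: raw_value} for the requested attributes."""
    result = {}
    for line in text.splitlines():
        fields = line.split()
        # Attribute rows have at least 10 columns and start with a numeric ID.
        if len(fields) >= 10 and fields[0].isdigit() and fields[1] in names:
            # Take the leading integer of the RAW_VALUE column.
            match = re.match(r"\d+", fields[9])
            if match:
                result[fields[1]] = int(match.group())
    return result

def classify_rewrite(before, after):
    """Compare counters taken before and after rewriting pending sectors."""
    if after["Reallocated_Sector_Ct"] > before["Reallocated_Sector_Ct"]:
        return "reallocated"   # the sector really was bad
    if after["Current_Pending_Sector"] < before["Current_Pending_Sector"]:
        return "recovered"     # the rewrite succeeded in place
    return "unchanged"
```

For example, `parse_smart_raw(SAMPLE)` yields the two raw counters as integers, and passing a "before" and "after" snapshot to `classify_rewrite` tells which of the two outcomes described above occurred.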

The feature request is that the OSD should attempt to zero out the previous locations of an inconsistent object during PG repair. Based on a chat with Sage about this, bluestore should write the repaired object to a new location on disk (as it does now, using the bluestore allocator), and a new function could afterwards zero the ranges on disk where the weakly written copy of the object had been located.
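As a rough user-space illustration of the "zero the old ranges" step (not BlueStore code), the sketch below overwrites a list of byte extents with zeros. `zero_extents` and the `(offset, length)` extent format are assumptions made for this example; inside the OSD the old extents would come from bluestore's own metadata. It also assumes the extents are aligned to the device's sector size, since a partial-sector write would presumably need to read the weak sector first.

```python
import os

def zero_extents(path, extents, chunk_size=1 << 20):
    """Overwrite each (offset, length) byte range in `path` with zeros.

    DESTRUCTIVE: only pass ranges that no longer hold live data.
    Returns the total number of bytes zeroed.
    """
    zeros = bytes(chunk_size)
    total = 0
    fd = os.open(path, os.O_WRONLY)
    try:
        for offset, length in extents:
            pos, end = offset, offset + length
            while pos < end:
                n = min(chunk_size, end - pos)
                written = os.pwrite(fd, zeros[:n], pos)
                if written <= 0:
                    raise OSError("short write while zeroing extent")
                pos += written
                total += written
        # Flush so the writes actually reach the device.
        os.fsync(fd)
    finally:
        os.close(fd)
    return total
```

Called as `zero_extents("/dev/sdX", [(offset, length), ...])` after the repaired copy has been written elsewhere, this would rewrite only the suspect ranges instead of the whole disk.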

The benefit of this feature is that we would better understand the condition of an HDD and be better able to predict when a disk really is failing. It would also save a lot of time compared with the current procedure: inconsistent object -> pending sector -> drain OSDs -> `dd if=/dev/zero of=/dev/sdX` -> recreate OSD.
