Bug #19486: Rebalancing can propagate corrupt copy of replicated object - RADOS - Ceph

Added by Mark Houghton about 7 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee:
Category: Backfill/Recovery
Target version: Ceph - v10.2.7
% Done:
Source: Community (user)
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions: Ceph - v10.2.6
ceph-qa-suite:
Component(RADOS): OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

In a replicated pool with 4 OSDs and the replication count set to 3, I stored an object and found copies on osd0, osd1 and osd3.

I manually changed the primary copy (on osd0) to simulate corruption.

osd0 - corrupt copy (primary)
osd1 - good copy
osd2 - (no copy)
osd3 - good copy

I then ran "ceph osd out 3", taking out one of the good replicas, and waited for Ceph to rebalance. Afterwards I had copies on osd0, osd1 and osd2, as expected.

I had hoped that Ceph would choose a good copy as the canonical replica. Instead, it chose the corrupted primary copy and created the new copy from that, so I ended up with:

osd0 - corrupt copy
osd1 - good copy
osd2 - corrupt copy
osd3 - out

Now I have two corrupt copies when previously I had only one. If Ceph rebalances again before anyone notices the corruption and repairs it, I could well end up with 3 corrupt copies.
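
To make the sequence above concrete, here is a toy model of what I observed. It is only a sketch: the dictionary of replicas and the backfill_from_primary helper are hypothetical names for illustration, not Ceph internals, and the only assumption modelled is that the new copy is created from the primary without any integrity check.

```python
# Toy model of the observed behaviour; not Ceph code.
GOOD = b"original object data"
BAD = b"corrupted object data"


def backfill_from_primary(replicas, primary, new_osd):
    """Create the copy on new_osd from the primary's data, with no check."""
    updated = dict(replicas)
    updated[new_osd] = replicas[primary]
    return updated


# State after manually corrupting the primary copy on osd0.
before = {"osd0": BAD, "osd1": GOOD, "osd3": GOOD}

# "ceph osd out 3": osd3 no longer holds a replica for this object.
remaining = {osd: data for osd, data in before.items() if osd != "osd3"}

# Rebalancing creates the missing third copy on osd2 from the primary.
after = backfill_from_primary(remaining, primary="osd0", new_osd="osd2")

corrupt = sorted(osd for osd, data in after.items() if data != GOOD)
print(corrupt)  # ['osd0', 'osd2']
```

Repeating the same step with osd1 taken out would leave no good copy at all in this model.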

If I run a scrub and a repair, Ceph correctly identifies the corrupt copies (shown as "data_digest_mismatch" in the output of "rados list-inconsistent-obj") and restores them from the single good copy. Rebalancing should perform a similar integrity check on each copy before choosing one as the canonical copy.
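
As a rough illustration of the kind of check I mean, the sketch below compares a digest of every available copy and only picks a source that agrees with a strict majority. This is not how Ceph's scrub or backfill is implemented. The pick_source function is a hypothetical name, zlib.crc32 is just a stand-in digest, and I am assuming the copy on the OSD that was marked out can still be read while the data is being moved.

```python
import zlib
from collections import Counter


def pick_source(replicas):
    """Return an OSD whose copy matches the majority digest, or None.

    replicas maps an OSD name to the bytes of its copy. None means no
    digest is held by a strict majority, so no copy can be trusted
    automatically.
    """
    if not replicas:
        return None
    digests = {osd: zlib.crc32(data) for osd, data in replicas.items()}
    top_digest, top_count = Counter(digests.values()).most_common(1)[0]
    if top_count * 2 <= len(replicas):
        return None
    # Any OSD holding the majority digest will do; take the lowest name.
    return min(osd for osd, digest in digests.items() if digest == top_digest)


good = b"original object data"
bad = b"corrupted object data"

# All three existing copies are compared before osd2 is populated.
print(pick_source({"osd0": bad, "osd1": good, "osd3": good}))  # osd1

# With only two disagreeing copies there is no majority to trust.
print(pick_source({"osd0": bad, "osd1": good}))  # None
```

In the second case the sketch refuses to choose, which seems preferable to silently copying the primary; a per-object checksum stored alongside the data would be needed to tell the two copies apart.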

Updated by Sage Weil about 7 years ago

Status changed from New to 12

Yes. The new scrub tools (in progress) will give you more control over which copy is propagated. And bluestore's checksums will make it clear which one is bad. Until then, there isn't much to be done here!

Updated by Mark Houghton about 7 years ago

Thanks. I thought BlueStore might fix or improve this, but I haven't found a way to test it: I'm not sure how to simulate corrupting one copy of an object in BlueStore, since I can't just edit a file when there is no filesystem. Can you confirm what Ceph would do with BlueStore in this situation?

Are there any tickets I can track for the new scrub tools you mentioned?
