Attempt to read object that can't be repaired loops forever
If all replicas of an object are bad, an attempt to read it causes a loop of continuous recovery and calls to rep_repair_primary_object(). I've reproduced this by making the object's data_digest mismatch the object_info_t data_digest.
I saw this in qa/standalone/scrub/osd-scrub-repair.sh, in TEST_corrupt_scrub_replicated(), with ROBJ17, if you try to read it BEFORE the repair. The current code verifies the repair by reading it afterwards.
#2 Updated by David Zafman 11 months ago
- Status changed from New to In Progress
What I actually ran into is that when do_read() fails because of the CRC mismatch, the recovery repair can pull from an OSD that also has a mismatched CRC. The recovery doesn't appear to have failed, because the pushing peer and the primary don't check the CRC.
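The failure mode described above can be modeled in a few lines. This is a toy Python sketch, not Ceph code: zlib.crc32 stands in for Ceph's crc32c, and Replica, recover_from and read_object are hypothetical names invented for illustration. The max_attempts cap exists only so the example terminates; the point of this report is that the real loop has no such cap.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Replica:
    data: bytes
    recorded_digest: int  # stand-in for the object_info_t data_digest


def digest_ok(replica):
    # Read-side check: does the stored data match the recorded digest?
    return zlib.crc32(replica.data) == replica.recorded_digest


def recover_from(primary, peer):
    # Models an unchecked push: bytes are copied as-is and neither side
    # compares a CRC, so the recovery always appears to succeed.
    primary.data = peer.data


def read_object(primary, peer, max_attempts=5):
    """Return (data, attempts); data is None if the read still fails at the cap."""
    for attempt in range(1, max_attempts + 1):
        if digest_ok(primary):
            return primary.data, attempt
        recover_from(primary, peer)
    return None, max_attempts


# Every replica carries a digest that does not match its data.
bad_digest = zlib.crc32(b"payload") ^ 1
stuck = read_object(Replica(b"payload", bad_digest), Replica(b"payload", bad_digest))

# Only the primary's data is bad; one pull from a good peer fixes the read.
good_digest = zlib.crc32(b"payload")
fixed = read_object(Replica(b"garbage", good_digest), Replica(b"payload", good_digest))
```

In the first scenario every attempt fails the read check and triggers another "successful" recovery, which is the loop. In the second, the read succeeds on the attempt after the pull.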
One way to fix this is to check the CRC, when possible, in build_push_op(). If it doesn't match, return EIO.
Another fix would be to check the CRC at the primary when receiving the read data.
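As a rough illustration of the push-side option, here is a minimal Python sketch of the idea, not the actual build_push_op() change. build_push and pull_from_peers are hypothetical names, zlib.crc32 stands in for Ceph's crc32c, and a recorded digest of None models the "when possible" case where no whole-object digest is available to check against.

```python
import errno
import zlib


def build_push(data, recorded_digest):
    """Return (0, data) if the data passes the CRC check, else (-EIO, None)."""
    # Skip the check when no digest is recorded (assumed "not possible" case).
    if recorded_digest is not None and zlib.crc32(data) != recorded_digest:
        return -errno.EIO, None
    return 0, data


def pull_from_peers(peers):
    """Try each (data, recorded_digest) peer in order.

    Returns the first payload that passes the check, or None if every
    peer fails, which models the object ending up unfound.
    """
    for data, recorded_digest in peers:
        rc, payload = build_push(data, recorded_digest)
        if rc == 0:
            return payload
    return None
```

With a check like this, a peer whose copy fails the CRC is rejected instead of silently overwriting the primary, so recovery either finds a good copy or reports that none exists rather than retrying forever.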
My concern is the situation where the object info data_digest is inconsistent on all replicas: we would be unable to copy such an object off of the current set of OSDs. This inconsistency happened due to a bug in Luminous 10.2.5.
#4 Updated by David Zafman 11 months ago
- Backport set to See comment
I don't think we should backport this change. In Luminous, and possibly in clusters upgraded from it to Mimic, the object info data_digest can mismatch in a mixed filestore/bluestore cluster. In that case this change would cause a PG to go to the recovery_unfound state and not become clean after a read CRC error is found.