Bug #503
osd: query osds since last_epoch_clean before concluding objects lost?
Status: Closed
% Done: 0%
Description
We currently query prior_set osds through last_epoch_started. This gives us the latest log and version. But if we are missing objects and prior_set_down is empty, we conclude they're lost. That's not quite right. Peering could have completed at last_epoch_started while recovery did not, so some osds from before that epoch may still have the objects in question. If they are temporarily down, or slow sending their stray Info during peering, we could incorrectly "give up" and conclude the objects are gone.
We probably need to query them in that lower part of peer(). And/or add them to prior_set_down if they are down at that point? Or maybe they should just be part of the prior_set, as that makes all the prior_set_affected etc. checks apply.
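A rough sketch of the proposed check, to make the idea concrete. This is not the actual PG code: Interval, QueryPlan, plan_queries() and may_conclude_lost() are hypothetical names, and the model assumes we have a list of past acting-set intervals and know which osds are currently up. The point is only that the candidate set reaches back to last_epoch_clean rather than last_epoch_started, and that a down or silent candidate blocks the "lost" conclusion.

```cpp
#include <cassert>
#include <set>
#include <vector>

// Hypothetical, simplified model; none of these types are Ceph's real ones.
using osd_id = int;
using epoch_t = unsigned;

struct Interval {
  epoch_t first, last;        // epochs covered by this acting set
  std::set<osd_id> acting;    // osds that served the PG during the interval
};

struct QueryPlan {
  std::set<osd_id> to_query;  // up osds that may still hold the objects
  std::set<osd_id> down;      // candidates we cannot reach right now
};

// Collect every osd that was acting in any interval ending at or after
// last_epoch_clean, split by whether it is currently up.
QueryPlan plan_queries(const std::vector<Interval>& past,
                       epoch_t last_epoch_clean,
                       const std::set<osd_id>& up_now) {
  QueryPlan plan;
  for (const auto& i : past) {
    if (i.last < last_epoch_clean) continue;  // ended before the last clean point
    for (osd_id o : i.acting)
      (up_now.count(o) ? plan.to_query : plan.down).insert(o);
  }
  return plan;
}

// Objects may be declared lost only when no candidate is down and every
// up candidate has already replied.
bool may_conclude_lost(const QueryPlan& plan,
                       const std::set<osd_id>& replied) {
  if (!plan.down.empty()) return false;
  for (osd_id o : plan.to_query)
    if (!replied.count(o)) return false;
  return true;
}
```

With intervals {0,1}, {1,2}, {2,3} and last_epoch_clean falling in the second one, osd 1 becomes a candidate even though it is not in the newest acting set. If osd 1 is down or has not yet sent its Info, may_conclude_lost() returns false instead of giving up.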
Updated by Sage Weil over 13 years ago
- Target version changed from v0.23 to v0.24
Updated by Sage Weil over 13 years ago
- Estimated time set to 3:00 h
- Source set to 1