Bug #62487
rgw-multisite: few objects are duplicated on archive zone intermittently
Description
I haven't been able to reproduce this reliably, but the archive zone still creates multiple versions of an object with the same mtime.
Recently we merged https://github.com/ceph/ceph/pull/50841, which fixes the most basic case of duplication.
This one needs more investigation.
Updated by Shilpa MJ 6 months ago
1. Create three zones, one of them being an archive zone.
2. Stop the rgw service on the archive zone.
3. Create a bucket and upload objects on one of the other two zones.
4. Let sync catch up.
5. Bring up the rgw service on the archive zone.
6. Wait for sync status to show as caught up.
The archive zone ends up with a few objects that have two versions.
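One way to spot the affected objects is to group a version listing of the archive zone bucket by key and mtime. The following is a minimal sketch, not part of any Ceph tooling: `find_duplicate_versions` is a hypothetical helper, the sample entries are made up, and the entry fields (`Key`, `VersionId`, `LastModified`) are assumed to follow the shape of an S3 `ListObjectVersions` result.

```python
from collections import defaultdict
from datetime import datetime, timezone


def find_duplicate_versions(versions):
    """Return {(key, mtime): [version_id, ...]} for every key that has
    more than one version sharing the same mtime.

    `versions` is an iterable of dicts with "Key", "VersionId" and
    "LastModified" fields (assumed shape, see the note above).
    """
    by_key_mtime = defaultdict(list)
    for entry in versions:
        by_key_mtime[(entry["Key"], entry["LastModified"])].append(entry["VersionId"])
    # Keep only the (key, mtime) groups that contain two or more versions.
    return {group: ids for group, ids in by_key_mtime.items() if len(ids) > 1}


# Made-up sample data: obj1 has two versions with an identical mtime.
mtime = datetime(2023, 8, 17, 12, 0, 0, tzinfo=timezone.utc)
sample = [
    {"Key": "obj1", "VersionId": "v1", "LastModified": mtime},
    {"Key": "obj1", "VersionId": "v2", "LastModified": mtime},
    {"Key": "obj2", "VersionId": "v3", "LastModified": mtime},
]
duplicates = find_duplicate_versions(sample)
print(duplicates)
```

With a real bucket, the input would be the version entries listed from the archive zone endpoint; only keys reported here would be candidates for this bug.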
From my analysis, the difference between the objects that end up with two versions and the ones that don't is timing. After a full sync bucket listing, archive zone C fetches the object from both zone A and zone B. If the object coming from zone A is still being written, and we haven't yet created an instance and updated the OLH head object as part of write_meta(), then the second request (the fetch from zone B) still reads the old object state and goes on to create a new instance.
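The timing dependence described above is a check-then-act race, and a toy model may make the interleaving easier to see. This is only a sketch under stated assumptions: `ArchiveZoneModel`, `read_state`, `write_instance` and `fetch` are hypothetical names, and the "skip if the head already records this mtime" rule is a simplification for illustration, not a description of what RGW or write_meta() actually does.

```python
MTIME = 1700000000.0  # arbitrary mtime shared by the copies on zone A and zone B


class ArchiveZoneModel:
    """Toy model of one archive zone bucket (hypothetical, not RGW code)."""

    def __init__(self):
        self.head = {}      # key -> mtime recorded in the modelled OLH head
        self.versions = {}  # key -> list of (instance_id, mtime)
        self._next_id = 0

    def read_state(self, key):
        # Step 1 of a fetch: read the current head state for the key.
        return self.head.get(key)

    def write_instance(self, key, mtime, seen_state):
        # Step 2 of a fetch: decide using the state read in step 1,
        # which may be stale by the time this runs.
        if seen_state == mtime:
            return False  # this mtime is already archived, skip
        self._next_id += 1
        self.versions.setdefault(key, []).append((f"inst{self._next_id}", mtime))
        self.head[key] = mtime
        return True


def fetch(zone, key, mtime):
    # A fetch whose read and write are not interleaved with another fetch.
    return zone.write_instance(key, mtime, zone.read_state(key))


# Racy interleaving: both fetches read the state before either one writes.
racy = ArchiveZoneModel()
state_a = racy.read_state("obj")  # fetch from zone A sees no head entry
state_b = racy.read_state("obj")  # fetch from zone B also sees no head entry
racy.write_instance("obj", MTIME, state_a)
racy.write_instance("obj", MTIME, state_b)  # stale state, second instance created

# Serialized fetches: the second fetch sees the updated head and skips.
serial = ArchiveZoneModel()
fetch(serial, "obj", MTIME)
fetch(serial, "obj", MTIME)

print(len(racy.versions["obj"]), len(serial.versions["obj"]))
```

In the model, the racy interleaving leaves two instances with the same mtime while the serialized one leaves a single instance, which matches the observation that only objects hit at the wrong moment are duplicated.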