Bug #62487

open

rgw-multisite: few objects are duplicated on archive zone intermittently

Added by Shilpa MJ 9 months ago. Updated 5 months ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
Target version:
-
% Done:
0%

Source:
Tags:
rgw-multisite-backlog
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I haven't been able to reproduce this reliably, but the archive zone still creates multiple versions of an object with the same mtime.
We recently merged https://github.com/ceph/ceph/pull/50841, which fixes the most basic case of duplication.
This one needs more investigation.

Actions #1

Updated by Shilpa MJ 9 months ago

  • Subject changed from few objects are duplicated on archive zone intermittently to rgw-multisite: few objects are duplicated on archive zone intermittently
Actions #2

Updated by Shilpa MJ 6 months ago

1. create three zones, one of them being an archive zone
2. stop the rgw service on the archive zone
3. create a bucket and upload objects on one of the other two zones
4. let sync catch up between the two non-archive zones
5. bring the rgw service back up on the archive zone
6. wait for sync status to show as caught up

the archive zone ends up with a few objects having two versions.
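
One way to spot the symptom is to group a version listing by key and mtime. This is a minimal sketch, assuming the listing from the archive zone has already been loaded into dicts with the hypothetical fields `key`, `version_id` and `mtime`. The field names and the sample data are illustrative only and do not come from a real cluster.

```python
from collections import defaultdict


def find_duplicate_versions(entries):
    """Return {key: [version_id, ...]} for keys that have more than one
    version sharing the same mtime."""
    by_key_mtime = defaultdict(list)
    for entry in entries:
        by_key_mtime[(entry["key"], entry["mtime"])].append(entry["version_id"])

    duplicates = {}
    for (key, _mtime), version_ids in by_key_mtime.items():
        if len(version_ids) > 1:
            duplicates.setdefault(key, []).extend(version_ids)
    return duplicates


# Illustrative listing (made-up values): obj1 has two versions with one mtime.
listing = [
    {"key": "obj1", "version_id": "v1", "mtime": "2023-08-01T10:00:00Z"},
    {"key": "obj1", "version_id": "v2", "mtime": "2023-08-01T10:00:00Z"},
    {"key": "obj2", "version_id": "v3", "mtime": "2023-08-01T10:00:05Z"},
]

print(find_duplicate_versions(listing))
```

Objects that were legitimately overwritten have versions with different mtimes, so they are not reported.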

From my analysis, the difference between the objects that end up with two versions and the ones that don't is timing. Archive zone C fetches each object from both zone A and zone B after a full-sync bucket listing. If the copy coming from zone A is still being written, and we haven't yet created an instance and updated the OLH head object as part of write_meta(), then the second request still reads the old object state and goes on to create a new instance.
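
The timing dependence described above is a check-then-act race. The toy model below illustrates it; it is not RGW code. The `ArchiveZone` class, its methods, and the mtime-based skip check are all assumptions made for the illustration. Two fetches of the same object (from zones A and B) each read the current state and then write. If the second read happens before the first write is visible, both create an instance.

```python
class ArchiveZone:
    """Toy stand-in for the archive zone (hypothetical, not RGW's data model).
    Each key maps to a list of (mtime, source_zone) instances."""

    def __init__(self):
        self.versions = {}

    def read_state(self, key):
        # Snapshot of the mtimes already recorded for this key.
        return {mtime for mtime, _ in self.versions.get(key, [])}

    def write_instance(self, key, mtime, source, seen):
        # Assumed check: skip the write if the snapshot already had this mtime.
        if mtime in seen:
            return False
        self.versions.setdefault(key, []).append((mtime, source))
        return True


def serialized(zone, key, mtime):
    # Each fetch reads state only after the previous fetch has finished writing.
    for source in ("A", "B"):
        seen = zone.read_state(key)
        zone.write_instance(key, mtime, source, seen)


def interleaved(zone, key, mtime):
    # Both fetches read state before either write is visible.
    seen_a = zone.read_state(key)
    seen_b = zone.read_state(key)  # stale: the write from A has not landed yet
    zone.write_instance(key, mtime, "A", seen_a)
    zone.write_instance(key, mtime, "B", seen_b)


ok = ArchiveZone()
serialized(ok, "obj", 100)
print(len(ok.versions["obj"]))   # 1 instance

racy = ArchiveZone()
interleaved(racy, "obj", 100)
print(len(racy.versions["obj"]))  # 2 instances with the same mtime
```

In this model the duplicate only appears when the two reads overlap, which would be consistent with the issue showing up for a few objects rather than all of them.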

Actions #3

Updated by Shilpa MJ 5 months ago

  • Status changed from New to Fix Under Review
  • Pull request ID set to 54908