Bug #58911
multisite: Race condition in replication causes objects that should be deleted to persist
Status: Closed
Description
A race condition in multisite replication can allow objects that should have been deleted to be copied back from another site, leaving the zones in an inconsistent state. The end result is that the zone receiving the workload is left with some objects that should have been deleted. I've tested this with (active-active) multisite replication between two zones.
The most reliable method of reproducing this I've found so far is the following warp command:
warp mixed --host=<endpoint> --access-key=<access key> --secret-key=<secret key> --noclear --objects 250 --put-distrib 50 --delete-distrib 50 --get-distrib 0 --stat-distrib 0 --concurrent 10 --duration 60s
As you can see, this restricts the workload to only PUTs and DELETEs on a single bucket. After running this and waiting for the bucket sync to finish, the zone receiving the workload has (on this example run) 384 objects, versus 232 in the zone acting as a secondary. Call the zone targeted by the workload A and its peer B. Tracing the logs for a single object present in A but missing from B, the following sequence of events occurred:
1. Object was PUT into A
2. Replication begins A->B
3. Object is replicated A->B by Full Sync
4. Object is deleted in A
5. Replication begins B->A
6. Object is replicated back B->A by Full Sync
7. Delete is replicated A->B by Incremental Sync
8. Delete tries to replicate B->A, but is skipped as A is present in the zone trace
At this point, A contains an object that should have been deleted. Admittedly, the kind of workload where this can happen is a little unusual, in that creation and deletion must happen in quick succession, but I think it's still a real problem. Note that steps (4) and (5) may be swapped with the same result: we can still hit the problem if a full sync B->A is already in progress when the object is deleted, provided the delete happens before we attempt to replicate the object back over.
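The sequence above can be reproduced in a small toy model. This is purely illustrative, not RGW's actual sync code: the `Zone` class, the `full_sync`/`incremental_sync` functions, and the set-based zone trace are my own simplifications of the behaviour described in the steps, where a change is skipped at any zone already present in its trace.

```python
# Toy model of the race: full sync copies live objects without any
# trace, while incremental sync replays logged ops with trace-based
# loop prevention. (Illustrative only; not the real RGW data model.)

class Zone:
    def __init__(self, name):
        self.name = name
        self.objects = {}   # key -> value (live objects)
        self.log = []       # (op, key, value, zone_trace)

    def put(self, key, value):
        self.objects[key] = value
        self.log.append(("put", key, value, {self.name}))

    def delete(self, key):
        self.objects.pop(key, None)
        self.log.append(("delete", key, None, {self.name}))

def full_sync(src, dst):
    # Copies everything currently live in src; a delete that already
    # happened is invisible to a full sync.
    for key, value in src.objects.items():
        dst.objects[key] = value

def incremental_sync(src, dst):
    # Replay src's log at dst, skipping any entry whose trace already
    # contains dst (loop prevention), and extending the trace.
    for op, key, value, trace in src.log:
        if dst.name in trace:
            continue
        if op == "put":
            dst.objects[key] = value
        else:
            dst.objects.pop(key, None)
        dst.log.append((op, key, value, trace | {dst.name}))

a, b = Zone("A"), Zone("B")
a.put("obj", "data")        # 1. object PUT into A
full_sync(a, b)             # 2-3. full sync A->B copies it over
a.delete("obj")             # 4. object deleted in A
full_sync(b, a)             # 5-6. full sync B->A copies it back
incremental_sync(a, b)      # 7. delete replicated A->B
incremental_sync(b, a)      # 8. delete skipped B->A: A is in the trace
# A retains "obj" even though it was deleted; B does not have it.
```

The key asymmetry is that the full sync in steps 5-6 carries no trace information about the delete, while the loop-prevention check in step 8 stops the delete from ever reaching the resurrected copy in A.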
I've only tested this on the main branch, but I don't see any reason it would be limited to it.