Bug #58911

closed

multisite: Race condition in replication causes objects that should be deleted to persist

Added by Tom Coldrick about 1 year ago. Updated 10 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags: multisite backport_processed
Backport: reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A race condition in multisite replication can allow objects that should be deleted to be copied back from another site, resulting in inconsistent state between zones. The end result is that the zone which receives the workload still holds some objects that should have been deleted. I've tested this for (active-active) multisite replication between two zones.

The most reliable method of reproducing this I've found thus far is the following warp command:

warp mixed --host=<endpoint> --access-key=<access key> --secret-key=<secret key> --noclear --objects 250 --put-distrib 50 --delete-distrib 50 --get-distrib 0 --stat-distrib 0 --concurrent 10 --duration 60s

As you can see, this restricts the workload to PUTs and DELETEs on a single bucket. After running it and waiting for the bucket sync to finish, I can see that the zone targeted by the workload has (on this example run) 384 objects, as opposed to 232 in the zone acting as a secondary. Let's call the zone targeted by the workload A, and its peer B. Looking through the logs for a single object present in A but missing in B, we can see that the following events occurred:

1. Object was PUT into A
2. Replication begins A->B
3. Object is replicated A->B by Full Sync
4. Object is deleted in A
5. Replication begins B->A
6. Object is replicated back B->A by Full Sync
7. Delete is replicated A->B by Incremental Sync
8. Delete tries to replicate B->A, but is skipped as A is present in the zone trace

At this point, A contains an object that should be deleted. Of course, the type of workload where this could happen is a little strange, in that creation and deletion must happen in quick succession, but I think this is still a problem. Note that in the above steps, (4) and (5) may be reversed with the same result -- we can still hit the problem even if there's already a full sync going on B->A when the object is deleted, provided the delete happens before we attempt to replicate the object back over.
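To make the ordering easier to follow, here is a minimal toy model of steps 1-8 in Python. It is only a sketch and not RGW code: the Zone class, full_sync_copy, replicate_delete and replay_race are hypothetical names made up for illustration, and the model assumes that full sync copies whatever the source zone holds at that moment, while a replicated delete is skipped whenever the destination zone is already in that delete's zone trace.

```python
from dataclasses import dataclass, field


@dataclass
class Zone:
    """Hypothetical stand-in for a zone: a name plus the object keys it holds."""
    name: str
    keys: set = field(default_factory=set)


def full_sync_copy(src: Zone, dst: Zone, key: str) -> bool:
    """Toy full sync: copy key to dst if src holds it right now."""
    if key in src.keys:
        dst.keys.add(key)
        return True
    return False


def replicate_delete(dst: Zone, key: str, trace: set) -> bool:
    """Toy incremental sync of a delete: skip if dst is already in the trace."""
    if dst.name in trace:
        return False  # skipped, dst has supposedly seen this delete already
    dst.keys.discard(key)
    trace.add(dst.name)
    return True


def replay_race(key: str = "obj"):
    """Replay the eight steps from the report and return (keys in A, keys in B)."""
    a, b = Zone("A"), Zone("B")
    a.keys.add(key)                         # 1. object is PUT into A
    full_sync_copy(a, b, key)               # 2-3. full sync A->B
    a.keys.discard(key)                     # 4. object is deleted in A
    delete_trace = {"A"}                    # the delete originated in A
    full_sync_copy(b, a, key)               # 5-6. full sync B->A copies it back
    replicate_delete(b, key, delete_trace)  # 7. delete applied on B
    replicate_delete(a, key, delete_trace)  # 8. skipped: A is in the trace
    return a.keys, b.keys
```

Under these assumptions, calling replay_race() leaves the key present in A and absent from B, which matches the mismatch described above.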

I've only tested this on the main branch, but I don't think there's any reason it need be limited to it.


Related issues 1 (0 open, 1 closed)

Copied to rgw - Backport #61630: reef: multisite: Race condition in replication causes objects that should be deleted to persist (Resolved)