Bug #45201

open

multisite: buckets deleted on secondary remain on master

Added by Michael B about 4 years ago. Updated almost 4 years ago.

Status:
New
Priority:
Normal
Assignee:
Shilpa MJ
Target version:
-
% Done:
0%

Source:
Tags:
multisite
Backport:
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Testing multisite with one secondary, the following is as expected:

create bucket on primary; bucket appears on secondary
create bucket on secondary; bucket appears on primary
delete bucket on primary; bucket disappears on secondary

BUT

delete bucket on secondary; bucket remains on primary

It doesn't matter whether the bucket being deleted was originally
created on the primary or the secondary.
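
For concreteness, the behaviour can be reproduced with plain s3cmd along these
lines (the endpoints and bucket name are placeholders for my setup, not the
exact commands I ran; credentials come from ~/.s3cfg):

  # Placeholder endpoints for the RGW instance in each zone.
  P="store-mb1.example.com:8080"
  S="store-mb2.example.com:8080"

  # Create a bucket on the primary; it appears on the secondary (expected).
  s3cmd --no-ssl --host=$P --host-bucket=$P mb s3://test-bucket
  s3cmd --no-ssl --host=$S --host-bucket=$S ls

  # Delete it on the secondary; it remains on the primary (the bug).
  s3cmd --no-ssl --host=$S --host-bucket=$S rb s3://test-bucket
  s3cmd --no-ssl --host=$P --host-bucket=$P ls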

This is not what I expect with Active-Active replication.

This could result in substantial confusion or perhaps data loss.
(Q: What happens if a bucket with the same name as the deleted bucket is
then created on the secondary? A: The creation is allowed.
Does this empty the contents of the bucket that remains on the primary?
I haven't checked.)

Is it a known limitation of the implementation? If so, where is
it described in the docs?

It should have nothing to do with the "sync policy" feature
as I haven't defined one:
radosgw-admin sync policy get
ERROR: failed to get policy: (22) Invalid argument

(I don't find radosgw-admin's error messages very helpful.)

The output of sync status on both sides indicates no problem.

radosgw-admin sync status
          realm 114f1a32-cfbb-4531-94fb-1e0e3106dc03 (_b076afbb-8824-470d-aaaf-2ec1cb3b3eab)
      zonegroup b076afbb-8824-470d-aaaf-2ec1cb3b3eab (_298a10d5-6785-4bab-ac68-c3d1f371a771)
           zone 298a10d5-6785-4bab-ac68-c3d1f371a771 (store-mb1)
  metadata sync no sync (zone is master)
      data sync source: af88fd26-bdb3-4efe-8338-e96a26377922 (store-mb2)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

radosgw-admin sync status
          realm 114f1a32-cfbb-4531-94fb-1e0e3106dc03 (_b076afbb-8824-470d-aaaf-2ec1cb3b3eab)
      zonegroup b076afbb-8824-470d-aaaf-2ec1cb3b3eab (_298a10d5-6785-4bab-ac68-c3d1f371a771)
           zone af88fd26-bdb3-4efe-8338-e96a26377922 (store-mb2)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 298a10d5-6785-4bab-ac68-c3d1f371a771 (store-mb1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

Actions #1

Updated by Michael B about 4 years ago

Based on https://docs.ceph.com/docs/master/radosgw/multisite/
"Important You must execute metadata operations, such as user creation,
on a host within the master zone. The master zone and the secondary zone
can receive bucket operations, but the secondary zone redirects bucket
operations to the master zone. If the master zone is down, bucket
operations will fail."
it looks like bucket create operations are redirected to the master,
but a bucket delete is not, so the delete takes effect only locally on the secondary.
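
One way to confirm which zone still has the bucket is to compare the admin view
on a node in each zone, e.g. (sketch; the bucket name is a placeholder):

  radosgw-admin bucket list
  radosgw-admin metadata get bucket:test-bucket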

ceph --version
ceph version 14.2.8 (2d095e947a02261ce61424021bb43bd3022d35cb) nautilus (stable)

Actions #2

Updated by Michael B about 4 years ago

I experimented further today to confirm a data loss scenario.

The buckets I experimented with yesterday were empty.
It is possible to delete a bucket only if it is empty, so in the scenario
where a bucket is deleted on the secondary but remains on the primary,
recreating a bucket with the same name on the secondary cannot lead
immediately to data loss.
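
For instance (a sketch; the endpoint variable is as in the earlier example and
the bucket name is a placeholder), an attempt like

  s3cmd --no-ssl --host=$S --host-bucket=$S rb s3://some-bucket

is refused with a BucketNotEmpty error as long as the bucket still contains objects.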

However, I tried this:

  • create bucket1 through dashboard of primary
  • create bucket2 through dashboard of secondary
  • delete both through dashboard of secondary; both still visible on primary
    (as previously described - bad, but no data loss yet...)
  • create duplicate bucket2 through dashboard of secondary
  • use s3cmd to add an object to bucket2 through the secondary (see the sketch after this list)
    - the object is copied to the original bucket2 on the primary (good)
  • use s3cmd to add an object to bucket1 through the primary
    - bucket1 is not recreated on the secondary (very bad)
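
For reference, the object puts in the last two steps were roughly as follows
(the endpoints and file name are placeholders; the bucket creates/deletes above
were done through the dashboards, not s3cmd):

  # Placeholder endpoints for the two RGW instances, as before.
  P="store-mb1.example.com:8080"
  S="store-mb2.example.com:8080"
  echo "one line" > note.txt

  # Put an object into the recreated bucket2 via the secondary;
  # it appears in the original bucket2 on the primary (good).
  s3cmd --no-ssl --host=$S --host-bucket=$S put note.txt s3://bucket2/note.txt
  s3cmd --no-ssl --host=$P --host-bucket=$P ls s3://bucket2

  # Put an object into bucket1 via the primary;
  # bucket1 is not recreated on the secondary (very bad).
  s3cmd --no-ssl --host=$P --host-bucket=$P put note.txt s3://bucket1/note.txt
  s3cmd --no-ssl --host=$S --host-bucket=$S ls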

We now have a silent failure to replicate bucket2 from the primary to the secondary.
Note that the object put into the bucket in each case was a one-line text file.
After a few minutes I did (on the secondary; the primary said the same):

radosgw-admin sync status
          realm 46f6fa65-3cc1-47c4-8184-f0f8a7441097 (_89175832-1dec-45f8-8544-c0b2bfe8f707)
      zonegroup 89175832-1dec-45f8-8544-c0b2bfe8f707 (_89175832-1dec-45f8-8544-c0b2bfe8f707)
           zone 203de9cc-284a-4102-aa87-f0b04ad66130 (store-mb2)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 581307c1-6f65-44c7-9a88-b4e364dcad22 (store-mb1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        1 shards are recovering
                        recovering shards: [0]

A little later, I repeated it and got:

radosgw-admin sync status
          realm 46f6fa65-3cc1-47c4-8184-f0f8a7441097 (_89175832-1dec-45f8-8544-c0b2bfe8f707)
      zonegroup 89175832-1dec-45f8-8544-c0b2bfe8f707 (_89175832-1dec-45f8-8544-c0b2bfe8f707)
           zone 203de9cc-284a-4102-aa87-f0b04ad66130 (store-mb2)
  metadata sync syncing
                full sync: 0/64 shards
                incremental sync: 64/64 shards
                metadata is caught up with master
      data sync source: 581307c1-6f65-44c7-9a88-b4e364dcad22 (store-mb1)
                        syncing
                        full sync: 0/128 shards
                        incremental sync: 128/128 shards
                        data is caught up with source

But there was still no copy of bucket1 on the secondary.

Actions #3

Updated by Michael B about 4 years ago

"We now have a silent failure to replicate bucket2 from the primary to the secondary."

should of course read "failure to replicate bucket1".

Actions #4

Updated by Greg Farnum almost 4 years ago

  • Project changed from Ceph to rgw
Actions #5

Updated by Casey Bodley almost 4 years ago

  • Assignee set to Shilpa MJ
  • Tags set to multisite