Bug #43805

bucket lifecycle breaks down when master-zone changed or period gets updated

Added by Mikhail Kharchenko 6 months ago. Updated 5 months ago.

Status: Triaged
Priority: Normal
Assignee: -
Target version:
% Done: 0%
Source: Community (user)
Tags: lifecycle multisite
Backport:
Regression:
Severity: 3 - minor
Reviewed: 01/24/2020
Affected Versions:
ceph-qa-suite: rgw
Pull request ID:
Crash signature:

Description

If the multisite metadata master moves to another zone (or the period gets updated), the lifecycle policy completely stops working. No objects expire anywhere in the cluster anymore, and radosgw-admin lc list returns an empty list. The workaround is to set the bucket lifecycle policy again for each bucket, even though the policy is still present and can be retrieved.

Found in version 12.2.12

How to reproduce:

1. Deploy two Ceph clusters and set up two multisite zones, e.g. master zone A and secondary zone B
2. Create a bucket on zone A
3. Put a bucket lifecycle policy
4. Run radosgw-admin lc list on metadata master zone A
5. See the bucket in the list # Lifecycle working
6. Change the metadata master zone from A to B
7. Change the metadata master zone back from B to A
8. Wait a few days and see an empty list # Lifecycle not working
9. Put exactly the same bucket lifecycle policy again
10. See the bucket in the list # Lifecycle working
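
The checks in steps 5, 8 and 10 could be scripted along these lines. This is only a sketch: it assumes radosgw-admin lc list prints a JSON array whose entries have a "bucket" field of the form :<bucket name>:<instance id>, which may differ between releases. The helper names bucket_in_lc_list and run_lc_list are hypothetical.

```python
import json
import subprocess


def bucket_in_lc_list(lc_list_json: str, bucket_name: str) -> bool:
    """Return True if bucket_name appears in the output of `radosgw-admin lc list`."""
    entries = json.loads(lc_list_json)
    for entry in entries:
        # Assumed entry shape: {"bucket": ":<name>:<instance id>", "status": "..."}
        parts = entry.get("bucket", "").split(":")
        if bucket_name in parts:
            return True
    return False


def run_lc_list() -> str:
    """Run `radosgw-admin lc list` on the local node and return its stdout."""
    result = subprocess.run(
        ["radosgw-admin", "lc", "list"],
        check=True,
        capture_output=True,
        text=True,
    )
    return result.stdout
```

On the metadata master zone, bucket_in_lc_list(run_lc_list(), "mybucket") would be expected to be true at steps 5 and 10 and false at step 8.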

History

#1 Updated by Casey Bodley 5 months ago

Need to verify whether metadata sync of a bucket instance checks for a lifecycle policy and adds the bucket to the lifecycle queue if necessary.

#2 Updated by Or Friedmann 5 months ago

Hi, it looks like this problem does not reproduce on the master branch (lc stays on the master zone after changing the metadata master zone).

Can you please share a reproducer for master?

Thank you

#3 Updated by Casey Bodley 5 months ago

  • Status changed from New to Triaged
  • Tags set to lifecycle multisite

Hey Or,

Looking at RGWLC::set_bucket_config(), it first calls set_bucket_instance_attrs() to store the lifecycle policy (RGW_ATTR_LC) in the bucket instance metadata, and then calls cls_rgw_lc_set_entry() to add this bucket to the lifecycle processing queue.

In multisite, metadata sync will only replicate the changes to the bucket instance metadata. We need an extra step in metadata sync that updates the lifecycle processing queue accordingly.
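
To illustrate the gap, here is a toy Python model, not the RGW C++ code. The Zone class, the metadata_sync function and the string value of RGW_ATTR_LC are stand-ins invented for this sketch. It only mirrors the two steps described above and the fact that metadata sync carries just the bucket instance metadata.

```python
RGW_ATTR_LC = "RGW_ATTR_LC"  # stand-in key for the lifecycle attribute


class Zone:
    """Toy zone: bucket instance attrs plus a separate lifecycle queue."""

    def __init__(self):
        self.instance_attrs = {}  # bucket -> {attr name: value}
        self.lc_queue = set()     # buckets queued for lifecycle processing

    def set_bucket_config(self, bucket, policy):
        # Step 1: store the policy in the bucket instance metadata.
        self.instance_attrs.setdefault(bucket, {})[RGW_ATTR_LC] = policy
        # Step 2: add the bucket to the lifecycle processing queue.
        self.lc_queue.add(bucket)


def metadata_sync(source, target):
    """Replicate bucket instance metadata only; the queue is not touched."""
    for bucket, attrs in source.instance_attrs.items():
        target.instance_attrs[bucket] = dict(attrs)


zone_a, zone_b = Zone(), Zone()
zone_a.set_bucket_config("mybucket", "placeholder-policy")
metadata_sync(zone_a, zone_b)

# The policy is retrievable on zone B, but B never queued the bucket.
policy_visible_on_b = RGW_ATTR_LC in zone_b.instance_attrs["mybucket"]
queued_on_b = "mybucket" in zone_b.lc_queue
```

In this model the policy is visible on zone B while its lifecycle queue stays empty, which matches the symptom of a policy that can be fetched but is never processed.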

We have an RGWMetadataHandlerPut_BucketInstance that processes writes to bucket instance metadata (whether via set_bucket_instance_attrs() or metadata sync). We should be able to add some logic there that detects when RGW_ATTR_LC is added or removed, and updates the lifecycle processing queue accordingly.
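
A rough sketch of that detection logic, again as a toy Python model rather than the actual handler. The put_bucket_instance function, the dict and set used as stores, and the RGW_ATTR_LC string are all hypothetical. The idea is to compare the old and new attrs on every write and adjust the queue when the lifecycle attribute appears or disappears.

```python
RGW_ATTR_LC = "RGW_ATTR_LC"  # stand-in key for the lifecycle attribute


def put_bucket_instance(attrs_store, lc_queue, bucket, new_attrs):
    """Write bucket instance attrs and keep the lifecycle queue in step.

    attrs_store: dict mapping bucket -> attrs dict (toy metadata store)
    lc_queue:    set of buckets queued for lifecycle processing
    """
    old_attrs = attrs_store.get(bucket, {})
    had_lc = RGW_ATTR_LC in old_attrs
    has_lc = RGW_ATTR_LC in new_attrs

    # The write itself, as it happens today.
    attrs_store[bucket] = dict(new_attrs)

    if has_lc and not had_lc:
        # Lifecycle policy was added: queue the bucket.
        lc_queue.add(bucket)
    elif had_lc and not has_lc:
        # Lifecycle policy was removed: drop the bucket from the queue.
        lc_queue.discard(bucket)
```

Because both local writes and metadata sync would go through the same put path, a single check there would cover both cases. Whether it also needs to handle a policy that is present both before and after the write is left open here.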
