Bug #43805
bucket lifecycle breaks down when master-zone changed or period gets updated
Description
If the multisite metadata master moves to another zone (or the period gets updated), the lifecycle policy completely stops working. No objects expire anymore in the entire cluster, and radosgw-admin lc list returns an empty list.
The workaround is to put the bucket lifecycle policy again for each bucket, even though the policy is already present and can still be retrieved.
Found in version 12.2.12
How to reproduce:
1. Deploy 2 Ceph clusters and set up 2 multisite zones, e.g. master zone A and secondary zone B
2. Create bucket on zone A
3. Put bucket lifecycle
4. Run radosgw-admin lc list on metadata master zone A
5. See bucket in the list # Lifecycle working
6. Change metadata master zone from A to B
7. Change metadata master zone back from B to A
8. Wait a few days and see empty list # Lifecycle not working
9. Put exactly the same bucket lifecycle policy again
10. See bucket in the list # Lifecycle working
Updated by Casey Bodley about 4 years ago
We need to verify whether metadata sync of a bucket instance checks for a lifecycle policy and adds the bucket to the lifecycle queue if necessary.
Updated by Or Friedmann about 4 years ago
Hi, it looks like the master branch does not reproduce this problem (lc stays on the master zone after changing the metadata master zone).
Can you please share a reproducer for master?
Thank you
Updated by Casey Bodley about 4 years ago
- Status changed from New to Triaged
- Tags set to lifecycle multisite
Hey Or,
Looking at RGWLC::set_bucket_config(), it first calls set_bucket_instance_attrs() to store the lifecycle policy (RGW_ATTR_LC) in the bucket instance metadata, and then calls cls_rgw_lc_set_entry() to add this bucket to the lifecycle processing queue.
In multisite, metadata sync will only replicate the changes to the bucket instance metadata. We need an extra step in metadata sync that updates the lifecycle processing queue accordingly.
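To make the gap concrete, here is a minimal toy model of the two write paths described above. It is not Ceph code: ZoneState, kAttrLc, and the function bodies are hypothetical stand-ins, and only the names set_bucket_config and "metadata sync" come from this thread. It only illustrates how a zone can end up holding the policy attribute while its lifecycle queue stays empty.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical placeholder for the RGW_ATTR_LC attribute key.
const std::string kAttrLc = "RGW_ATTR_LC";

// Hypothetical per-zone state: bucket instance attrs plus the lifecycle queue.
struct ZoneState {
    std::map<std::string, std::map<std::string, std::string>> instance_attrs;
    std::set<std::string> lc_queue;
};

// Toy version of the local write path: store the policy in the bucket
// instance metadata, then add the bucket to the lifecycle processing queue.
void set_bucket_config(ZoneState& zone, const std::string& bucket,
                       const std::string& policy) {
    zone.instance_attrs[bucket][kAttrLc] = policy;
    zone.lc_queue.insert(bucket);
}

// Toy version of metadata sync as described: only bucket instance metadata
// is replicated, and the lifecycle queue on the destination is not touched.
void metadata_sync(const ZoneState& src, ZoneState& dst) {
    dst.instance_attrs = src.instance_attrs;
}
```

In this model, after set_bucket_config() on zone A and metadata_sync() from A to B, zone B has the policy attribute for the bucket but an empty lc_queue, which matches the symptom of an existing policy that never shows up in lc list.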
We have a RGWMetadataHandlerPut_BucketInstance that processes writes to bucket instance metadata (whether via set_bucket_instance_attrs() or metadata sync). We should be able to add some logic there that detects when RGW_ATTR_LC is added or removed, and update the lifecycle processing queue accordingly.
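A rough sketch of that detection logic, under stated assumptions: Attrs, kAttrLc, LcQueueAction, and both functions are hypothetical and do not reflect the real RGWMetadataHandlerPut_BucketInstance interface. The idea is simply to compare the attribute sets before and after the put and derive a queue update from the difference.

```cpp
#include <map>
#include <set>
#include <string>

// Hypothetical attr map type and placeholder for the RGW_ATTR_LC key.
using Attrs = std::map<std::string, std::string>;
const std::string kAttrLc = "RGW_ATTR_LC";

enum class LcQueueAction { None, Add, Remove };

// Decide how the lifecycle queue should change for a metadata put, based on
// whether the lifecycle attr was present before and is present after.
LcQueueAction lc_action_for_put(const Attrs& old_attrs, const Attrs& new_attrs) {
    const bool had = old_attrs.count(kAttrLc) > 0;
    const bool has = new_attrs.count(kAttrLc) > 0;
    if (!had && has) return LcQueueAction::Add;
    if (had && !has) return LcQueueAction::Remove;
    return LcQueueAction::None;
}

// Apply the decision to a toy lifecycle queue (a set of bucket names).
// Insert and erase on a set are idempotent, so repeating an action is safe.
void apply_lc_action(std::set<std::string>& lc_queue, const std::string& bucket,
                     LcQueueAction action) {
    if (action == LcQueueAction::Add) {
        lc_queue.insert(bucket);
    } else if (action == LcQueueAction::Remove) {
        lc_queue.erase(bucket);
    }
}
```

Note that this sketch only reacts to the attribute being added or removed, as proposed above. A put where the attribute is present both before and after yields None, so a zone that already holds the attribute but has no queue entry would not be repaired by this check alone.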
Updated by Casey Bodley over 2 years ago
- Related to Bug #44268: multisite/lc: lc doesn't run in the slave added