Bug #20280
rgw: multi-site replication: switching master/secondary creates additional pool
Description
We are testing multi-site replication between two clusters, plk041 and plk045, both on the same Ceph version 10.2.4 (9411351cc8ce9ee03fbd46225102fe3d28ddf611).
We have separate realm (replication), zonegroup (replication), zones (plk041-replication on plk041, plk045-replication on plk045), configured according to this doc: https://access.redhat.com/documentation/en-us/red_hat_ceph_storage/2/html-single/object_gateway_guide_for_red_hat_enterprise_linux/#multi_site
To test switching master/secondary we are doing the following:
1. turning off master
2. setting secondary cluster as master
# radosgw-admin --cluster=plk045 --rgw-realm=replication zone modify --rgw-zone=plk045-replication --master
# radosgw-admin --cluster=plk045 --rgw-realm=replication period update --commit
3. creating new bucket using new master (plk045 here)
After that we can see a new empty pool on the plk045 cluster named plk041-replication.rgw.buckets.index:
# ceph -c /etc/ceph/plk045.conf osd pool ls detail
...
pool 129 'plk041-replication.rgw.buckets.index' replicated size 3 min_size 1 crush_ruleset 0 object_hash rjenkins pg_num 8 pgp_num 8 last_change 18861 flags hashpspool stripe_width 0
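A quick way to spot such stray pools after a switchover is to grep the pool listing for the other zone's prefix. A minimal sketch (no cluster needed here; the sample listing mimics the report, and on a real cluster it would come from `ceph -c /etc/ceph/plk045.conf osd pool ls`):

```shell
# Sample pool listing for the plk045 cluster (zone names from the report).
pools='plk045-replication.rgw.buckets.index
plk045-replication.rgw.buckets.data
plk041-replication.rgw.buckets.index'

# Any pool prefixed with the *other* zone's name should not exist locally.
stray=$(printf '%s\n' "$pools" | grep '^plk041-replication\.')
echo "stray pools: $stray"
```

Run against live output, an empty result means no foreign-zone pools were created.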
Unfortunately, this issue does not reproduce reliably, so we need some advice on debugging/reproducing it.
History
#1 Updated by shasha lu almost 7 years ago
Here is how to reproduce:
1. Configure the multisite env with two zonegroups.
2. Put some buckets and objects in these two zonegroups using the same user.
3. After 3600*24 s (one day), the additional index pool will be created in the secondary zonegroup.
UserSyncThread runs every 3600*24 s; the additional pool is created in this thread.
Buckets and users sync to the peer zonegroup. Creating a bucket in the secondary zonegroup is forwarded to the master zone, so the cls_user_bucket_entry's index pool is always the master zone's index pool. UserSyncThread uses that index pool, so it gets created in the secondary zonegroup. rgw_bucket_sync_user_stats should use BucketInfo.bucket instead of the cls_user_bucket_entry's bucket.
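The mix-up described above can be illustrated with a toy model (plain Python, not actual Ceph code; struct and field names are simplified stand-ins for cls_user_bucket_entry and RGWBucketInfo):

```python
# Toy model of the reported bug: a user-bucket entry forwarded from the
# master zone records the *master* zone's index pool, while the locally
# resolved bucket info records the *local* zone's pool.
from dataclasses import dataclass

@dataclass
class ClsUserBucketEntry:   # stand-in for cls_user_bucket_entry
    name: str
    index_pool: str         # carries the master zone's index pool

@dataclass
class BucketInfo:           # stand-in for RGWBucketInfo
    name: str
    index_pool: str         # carries the local zone's index pool

def sync_user_stats(entry, bucket_info, use_bucket_info):
    """Return the pool the stats update would target -- and hence the pool
    that would be auto-created on the local cluster if it does not exist."""
    return bucket_info.index_pool if use_bucket_info else entry.index_pool

entry = ClsUserBucketEntry("test-bucket", "plk041-replication.rgw.buckets.index")
info  = BucketInfo("test-bucket", "plk045-replication.rgw.buckets.index")

# Buggy behaviour: the master zone's pool leaks onto the secondary cluster.
print(sync_user_stats(entry, info, use_bucket_info=False))
# Suggested fix: resolve the pool from the bucket info instead.
print(sync_user_stats(entry, info, use_bucket_info=True))
```

With the fix, the stats update lands in the local zone's own index pool, so no foreign-zone pool is ever created.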
#2 Updated by Andrey Tyurin almost 7 years ago
Thanks for the explanation of what is happening. However, we are using a single-zonegroup configuration, and we were able to reproduce the additional index pool creation twice within half an hour (though not on every switch), without waiting for 3600*24 s.