Bug #20211
openrgw: bucket index not syncing when the rados cluster is in an abnormal state
Description
A bucket index syncing issue occurs in our environment.
I think it can be reproduced with the following steps:
1. deploy multisite and configure data sync
2. create a bucket
3. upload objects continuously to one site
4. make the rados cluster on the other site abnormal so that the bucket index add op fails while an object is syncing
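For step 3, a minimal sketch of a continuous upload loop is shown here. It assumes s3cmd is already configured against the source site; the function name upload_objects, the obj-N naming, and the S3CMD override variable are illustrative only and are not part of the original report.

```shell
#!/bin/sh
# Hypothetical helper for step 3: upload COUNT small objects to one site.
# S3CMD may be overridden (for example S3CMD=echo for a dry run); it defaults to s3cmd.
S3CMD=${S3CMD:-s3cmd}

# usage: upload_objects BUCKET COUNT
upload_objects() {
    bucket=$1
    count=$2
    # One small payload file is reused for every upload.
    payload=$(mktemp) || return 1
    echo "sync test payload" > "$payload"
    i=1
    while [ "$i" -le "$count" ]; do
        # Keep going on failure so the load stays continuous.
        "$S3CMD" put "$payload" "s3://$bucket/obj-$i" \
            || echo "upload of obj-$i failed" >&2
        i=$((i + 1))
    done
    rm -f "$payload"
}
```

For example, upload_objects <bucket name> 1000 could be left running on the source site while the rados cluster on the other site is disturbed as in step 4.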
Once the issue is reproduced, the following can be observed (after syncing completes and sync status shows the sites are caught up):
1. run radosgw-admin bucket stats on both sites; they will report different object/usage stats
2. inspect the bucket index rados object on both sites using:
rados listomapkeys -p <bucket index pool> <bucket index shard object>
you will see that the destination site has fewer omap keys/vals
3. use s3cmd ls s3://<bucket name>/ to list the objects on both sites
you will see that the destination site has fewer objects
4. pick one object that is in the source site's list but not in the destination site's list from step 3
strangely, you can download it from both sites using:
s3cmd get s3://<bucket name>/<object name>
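
To find the entries that are missing on the destination site (checks 2 and 3 above), the two listings can be saved to files and compared. The following is only a sketch: it assumes each site's output of rados listomapkeys (or s3cmd ls) was redirected to a plain-text file with one entry per line, and the function name missing_on_destination and the file names are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: print entries present in the source listing
# but absent from the destination listing.
# usage: missing_on_destination SOURCE_FILE DEST_FILE
missing_on_destination() {
    src_sorted=$(mktemp) || return 1
    dst_sorted=$(mktemp) || { rm -f "$src_sorted"; return 1; }
    # comm needs sorted input; use the C locale so sort and comm agree on ordering.
    LC_ALL=C sort -u "$1" > "$src_sorted"
    LC_ALL=C sort -u "$2" > "$dst_sorted"
    # -23 suppresses lines unique to the second file and lines common to both,
    # leaving only lines unique to the first (source) file.
    LC_ALL=C comm -23 "$src_sorted" "$dst_sorted"
    rm -f "$src_sorted" "$dst_sorted"
}
```

For example, after saving the rados listomapkeys output of each site to source-keys.txt and dest-keys.txt, missing_on_destination source-keys.txt dest-keys.txt prints the index entries that did not sync. Keys containing newlines or binary prefixes are not handled by this line-based comparison.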