Bug #20211

open

rgw: bucket index not syncing when the rados cluster is abnormal

Added by fang yuxiang almost 7 years ago. Updated 21 days ago.

Status: Fix Under Review
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A bucket index syncing issue occurred in our environment.

I think it can be reproduced with these steps:

1. deploy multisite and configure data sync
2. create a bucket
3. upload objects continuously to one site
4. make the rados cluster in the other site abnormal, so that the bucket index add op fails while syncing objects

Once the issue is reproduced (after syncing completes and sync status shows caught up), one can observe the following:

1. Run radosgw-admin bucket stats on both sites; they will report different object/usage stats.

2. Inspect the bucket index rados object on both sites using:
rados listomapkeys -p <bucket index pool> <bucket index shard object>
The destination site will have fewer omap key/vals.

3. Use s3cmd ls s3://<bucket name>/ to list the objects on both sites.
The destination site will show fewer objects.

4. Pick an object that appears in the source site's listing but not in the destination's listing from step 3.
Strangely, it can be downloaded from both sites using:
s3cmd get s3://<bucket name>/<object name>

Actions #2

Updated by Nathan Cutler almost 7 years ago

  • Status changed from New to Fix Under Review
Actions #3

Updated by Yehuda Sadeh almost 7 years ago

I don't understand what step (4) is. What do you mean by 'fail bucket index add op'?

Actions #4

Updated by fang yuxiang almost 7 years ago

Yehuda Sadeh wrote:

I don't understand what step (4) is. What do you mean by 'fail bucket index add op'?

1. The rados cluster abnormality caused the bucket index add op to fail while fetching the remote object, but the object's data was put successfully.

2. Because of step 1, that object can be downloaded, but it will not appear in s3 ListBucket results.

3. That object will never appear in s3 ListBucket results if it is not modified later, even if we trigger a new full sync.

Of course, that object will appear in s3 ListBucket results if it is modified later.

Actions #5

Updated by fang yuxiang almost 7 years ago

Yehuda Sadeh wrote:

I don't understand what step (4) is. What do you mean by 'fail bucket index add op'?

I consider rados cluster abnormalities to be quite common. Lots of objects will be left in a "half synced" state (they have complete data but no index entry), and they will stay that way if they are not modified on the source site, even if we trigger a new full sync using radosgw-admin. Why? Let's have a look at the code:

if (copy_if_newer) {  // always true when syncing an object
  /* need to get mtime for destination */
  ret = get_obj_state(&obj_ctx, dest_obj, &dest_state, false);  // succeeds because the object's data is whole
  if (ret < 0)
    goto set_err_state;
  if (!real_clock::is_zero(dest_state->mtime)) {
    dest_mtime_weight.init(dest_state);
    pmod = &dest_mtime_weight.mtime;
  }
}
ret = conn->get_obj(user_id, info, src_obj, pmod, unmod_ptr,
                    dest_mtime_weight.zone_short_id, dest_mtime_weight.pg_ver,
                    true /* prepend_meta */, true /* GET */, false /* rgwx-stat */,
                    &cb, &in_stream_req);
if (ret < 0) {
  // fails with ERR_NOT_MODIFIED: the object was not modified on the
  // source site and we only copy if newer
  goto set_err_state;
}

In our environment, radosgw-admin bucket stats showed that the slave site had only half the objects of the master site (nearly 50,000 objects looked lost) after the rados cluster in the slave site became abnormal, which alarmed me because re-triggering a full sync didn't work.

Then I ran ListBucket on both sites and picked an object present on the master site but not on the slave site.
I found that the object could be downloaded from the slave site even though it was not in the slave site's ListBucket results.

So my fix is to make full sync repair such objects after this happens.

Actions #6

Updated by Casey Bodley almost 7 years ago

fang yuxiang wrote:

Yehuda Sadeh wrote:

I don't understand what step (4) is. What do you mean by 'fail bucket index add op'?

1. The rados cluster abnormality caused the bucket index add op to fail while fetching the remote object, but the object's data was put successfully.

Can you tell what errors the bucket index op was returning?

Either way, I tend to agree that the copy_if_newer check should be based on the bucket index's timestamp, if at all possible.

Actions #7

Updated by fang yuxiang almost 7 years ago

Casey Bodley wrote:

fang yuxiang wrote:

Yehuda Sadeh wrote:

I don't understand what step (4) is. What do you mean by 'fail bucket index add op'?

1. The rados cluster abnormality caused the bucket index add op to fail while fetching the remote object, but the object's data was put successfully.

Can you tell what errors the bucket index op was returning?

Either way, I tend to agree that the copy_if_newer check should be based on the bucket index's timestamp, if at all possible.

-110 (ETIMEDOUT)

Actions #8

Updated by fang yuxiang almost 7 years ago

Casey Bodley wrote:

fang yuxiang wrote:

Yehuda Sadeh wrote:

I don't understand what step (4) is. What do you mean by 'fail bucket index add op'?

1. The rados cluster abnormality caused the bucket index add op to fail while fetching the remote object, but the object's data was put successfully.

Can you tell what errors the bucket index op was returning?

Either way, I tend to agree that the copy_if_newer check should be based on the bucket index's timestamp, if at all possible.

Considering that the bucket index add op is two-phase (prepare and complete), we only know that the return code from do_complete was -110.

Actions #9

Updated by Konstantin Shalygin 21 days ago

  • Assignee set to fang yuxiang
  • Source set to Community (user)
  • Pull request ID set to 15545