Bug #10365

radosgw-agent: buckets to retry logged by full sync won't be fully synced if they were uploaded before radosgw logging was enabled in the source zone

Added by Josh Durgin over 9 years ago. Updated about 8 years ago.

Status: Closed
Priority: High
Assignee:
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When incremental sync handles buckets to retry, it uses the bucket index log to figure out which objects within each bucket need syncing. That log does not include objects uploaded before logging was enabled in the zone, so those objects are never synced.

One fix would be to store items to retry as tuples "[full|incremental], $item", and assume existing items to retry are from full sync for maximum coverage.
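A minimal sketch of that tuple encoding, assuming retry entries are stored as strings and that JSON is an acceptable serialization (both are assumptions, as are the names `encode_retry` and `decode_retry`). Any entry that does not parse as a tagged pair is treated as a legacy bare item and assumed to come from full sync:

```python
import json

FULL = "full"
INCREMENTAL = "incremental"


def encode_retry(sync_type, item):
    """Serialize a retry entry as a JSON pair [sync_type, item]."""
    if sync_type not in (FULL, INCREMENTAL):
        raise ValueError("unknown sync type: %r" % (sync_type,))
    return json.dumps([sync_type, item])


def decode_retry(raw):
    """Return (sync_type, item) for a stored retry entry.

    Entries written before the tuple format existed are bare item
    names; they are assumed to be from full sync for maximum coverage.
    """
    try:
        value = json.loads(raw)
    except ValueError:
        # Not JSON at all, so this is a legacy bare item.
        return FULL, raw
    if (isinstance(value, list) and len(value) == 2
            and value[0] in (FULL, INCREMENTAL)):
        return value[0], value[1]
    # Valid JSON but not a tagged pair (e.g. a bucket named "123").
    return FULL, raw
```

Defaulting to full sync costs extra listing work for legacy entries, but it cannot miss objects that predate logging.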

Suggested scheme:

sync initialization
-------------------

for each shard:

  • read marker and buckets to retry for shard’s replica log
    - if no marker, check shard_id -> buckets index
      - if that doesn’t exist yet, list all buckets and create shard_id -> buckets index
  • for each bucket in current shard
    - if bucket doesn’t have entry in replica log: add to replica log, mark for full sync
    - if bucket exists in replica log, go to (incremental sync -> check type)
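The initialization steps could be sketched roughly as follows. Plain dictionaries stand in for the replica logs and the shard_id -> buckets index, and `init_shard` and `build_index` are hypothetical names, not radosgw-agent APIs. The sketch also assumes that when a marker already exists, the index from an earlier run is available:

```python
def init_shard(shard_id, shard_log, bucket_log, shard_index, build_index):
    """Classify the buckets of one shard as needing full or incremental sync.

    shard_log:   {shard_id: {"marker": str, "retries": list}}
    bucket_log:  {bucket: {"sync_type": str, "marker": str, "retries": list}}
    shard_index: {shard_id: [bucket, ...]}, created lazily
    build_index: callable that lists all buckets and returns a complete
                 {shard_id: [bucket, ...]} mapping (hypothetical helper)
    """
    state = shard_log.get(shard_id, {})
    if not state.get("marker") and shard_id not in shard_index:
        # No marker and no index yet: list all buckets and create it.
        shard_index.update(build_index())
        shard_index.setdefault(shard_id, [])

    full, incremental = [], []
    for bucket in shard_index.get(shard_id, []):
        if bucket not in bucket_log:
            # First time this bucket is seen: record it, mark for full sync.
            bucket_log[bucket] = {"sync_type": "full", "marker": "",
                                  "retries": []}
            full.append(bucket)
        else:
            # Already tracked: incremental sync checks its sync type.
            incremental.append(bucket)
    return full, incremental
```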

incremental sync
----------------

  • look up data log shard in replica log to get buckets that need retrying, and current marker
  • check type of sync for any buckets that need to be retried, and generate the list of objects based on that
    - full -> list all objects in bucket
    - incremental -> read data log as usual
  • for each bucket
    - list objects in bucket that need to be synced based on sync type, from full list, or data log
    - read the bucket instance log for the bucket starting at marker from replica log
    - sync object using the same method as full sync
    - if syncing an object fails, add it to a list to retry
    - update bucket instance replica log with last marker read and list of objects to retry
  • once data log shard is done, update replica log for that shard with
    - new marker
    - list of bucket instances to retry
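A rough sketch of the per-bucket part of this loop, with callables standing in for the gateway operations. All names (`sync_bucket`, `list_all_objects`, `read_bucket_log`, `sync_object`) are hypothetical. Switching a bucket to incremental after a full pass is an assumption not stated above; it relies on failed objects being carried in the retry list:

```python
def sync_bucket(entry, list_all_objects, read_bucket_log, sync_object):
    """Sync one bucket and update its replica log entry in place.

    entry:            {"sync_type": str, "marker": str, "retries": list}
    list_all_objects: callable returning every object in the bucket
    read_bucket_log:  callable(marker) -> (objects, last_marker_read)
    sync_object:      callable(obj) that raises on failure
    """
    # Read the bucket instance log starting at the stored marker.
    logged, last_marker = read_bucket_log(entry["marker"])

    if entry["sync_type"] == "full":
        # Full: the log may predate some objects, so list everything.
        candidates = list_all_objects()
    else:
        # Incremental: earlier failures plus whatever the log reports.
        candidates = entry["retries"] + logged

    failed = []
    for obj in dict.fromkeys(candidates):  # drop duplicates, keep order
        try:
            sync_object(obj)
        except Exception:
            failed.append(obj)  # retried on the next pass

    entry["marker"] = last_marker
    entry["retries"] = failed
    # Assumption: after one full pass, retries cover anything left over.
    entry["sync_type"] = "incremental"
    return entry
```

The shard-level update (new marker plus the list of bucket instances to retry) would wrap this function in the same way.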

Care may be needed when updating with an empty marker, e.g. if lots of objects were uploaded before data logging was enabled. Perhaps use a single space (' ') as the marker in that case, since it sorts before all markers the gateway will generate.
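As a small illustration of the placeholder idea, assuming markers are compared as plain strings and always begin with a printable, non-space character (the marker values below are made up for the example, and `effective_marker` is a hypothetical helper):

```python
BEFORE_ALL_MARKERS = " "  # a single space, as suggested above


def effective_marker(last_marker_read):
    """Return the marker to store, substituting the placeholder when the
    log yielded no marker at all (e.g. nothing had been logged yet)."""
    return last_marker_read or BEFORE_ALL_MARKERS


# A space (0x20) compares lower than any string starting with a
# printable non-space character, so reading "after" the placeholder
# still returns every real log entry.
example_markers = ["00000000001.1.2", "1_1419000000.123_45", "a"]
all_after = all(BEFORE_ALL_MARKERS < m for m in example_markers)
```

Unlike an empty string, the placeholder is a non-empty value, so a stored marker can be told apart from a missing one.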

History

#1 Updated by Alfredo Deza about 9 years ago

  • Description updated (diff)

#2 Updated by Alfredo Deza about 9 years ago

  • Description updated (diff)

#3 Updated by Yehuda Sadeh almost 9 years ago

  • Assignee set to Yehuda Sadeh

#4 Updated by Yehuda Sadeh about 8 years ago

  • Status changed from New to Closed

Fixed in v2
