Bug #38486

rgw: abort multipart in lifecycle when enable index shard

Added by Eric Ivancich 6 months ago. Updated about 1 month ago.

Status:
Pending Backport
Priority:
Normal
Assignee:
Target version:
Start date:
02/26/2019
Due date:
% Done:

0%

Source:
Tags:
Backport:
mimic,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

The multipart lifecycle rule is not able to clean up all entries because it skips over some bucket index shards.


Related issues

Copied to rgw - Backport #38711: mimic: rgw: abort multipart in lifecycle when enable index shard Need More Info
Copied to rgw - Backport #38712: luminous: rgw: abort multipart in lifecycle when enable index shard Need More Info

History

#1 Updated by Eric Ivancich 6 months ago

This issue was originally identified in a PR by yuliyang (GitHub user joke-lee). The solution proposed in that PR was too specific; a more general solution is preferable. See:

https://github.com/ceph/ceph/pull/26480

#2 Updated by Eric Ivancich 6 months ago

Reproduction steps as described by yuliyang:

rgw_override_bucket_index_max_shards = 128
rgw_enable_lc_threads = false
rgw_lifecycle_work_time = "00:00-24:00"
rgw_lc_debug_interval = 1

This sets the number of bucket index shards to 128. Then start a multipart upload and leave it incomplete:

#!/usr/bin/env python
# -*- coding:utf-8 -*-
from boto3.session import Session
import boto3

# Test credentials and endpoint of the local RGW instance.
access_key = "user1"
secret_key = "user1"
session = Session(access_key, secret_key)
url = "http://127.0.0.1"
config_dict = {'signature_version': 's3', 'connect_timeout': 30000, 'read_timeout': 30000}
config = boto3.session.Config(**config_dict)
s3_client = session.client('s3', endpoint_url=url, config=config)

bucket = "test1"
key = "异形契约.mp4"
# Initiate a multipart upload and never complete it.
mpu = s3_client.create_multipart_upload(Bucket=bucket, Key=key)

Next, find which bucket index shard holds the entry for the object:

for i in `./bin/rados -p default.rgw.buckets.index ls -c ceph.conf`; do echo $i "\t start";./bin/rados -p default.rgw.buckets.index listomapkeys $i -c ceph.conf; echo "\t end";done

...
.dir.96e298ca-3b00-4aff-8faf-c13d35e14925.14133.1.39
_multipart_异形契约.mp4.2~7Be__tcdECK1K_OpIfYr645lzGoCxAZ.meta
...

The entry for the object is in shard 39.
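
With 128 shards the loop output is long, so a small parser can help locate an entry. The following is a minimal sketch, not part of the original report: it assumes the output has the shape shown above (a `.dir.<bucket instance id>.<shard>` line, then that shard's omap keys, then an `end` line), and the function name `keys_by_shard` is hypothetical.

```python
import re

# An index object name is assumed to look like ".dir.<bucket-instance-id>.<shard>",
# as in the sample output; the trailing integer is taken to be the shard number.
SHARD_OBJ = re.compile(r"^\.dir\..+\.(\d+)(?:\s|$)")


def keys_by_shard(listing):
    """Map shard number -> list of omap keys, parsed from the loop's output."""
    shards = {}
    current = None
    for raw in listing.splitlines():
        # echo may print "\t" literally or as a tab; normalize both cases.
        line = raw.replace("\\t", " ").strip()
        if not line:
            continue
        match = SHARD_OBJ.match(line)
        if match:
            current = int(match.group(1))
            shards.setdefault(current, [])
        elif line == "end":
            current = None
        elif current is not None:
            shards[current].append(line)
    return shards
```

Filtering the result for non-empty lists, for example `{s: k for s, k in keys_by_shard(text).items() if k}`, leaves only the shards that actually hold entries.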

Then set the lifecycle rule:

from boto3.session import Session
import boto3

access_key = "user1"
secret_key = "user1"
session = Session(access_key, secret_key)
url = "http://127.0.0.1"
config_dict = {'signature_version': 's3', 'connect_timeout': 30000, 'read_timeout': 30000}
config = boto3.session.Config(**config_dict)
s3_client = session.client('s3', endpoint_url=url, config=config)
s3_client.put_bucket_lifecycle(Bucket='test1', LifecycleConfiguration={
    'Rules': [
        {
            'ID': 'test',
            'Prefix': '',
            'Status': 'Enabled',
            'AbortIncompleteMultipartUpload': {
                'DaysAfterInitiation': 1
            },
            'Expiration': {
                'ExpiredObjectDeleteMarker': True
            }
        },
    ]
})
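
One way to observe whether the lifecycle pass aborted the upload is to list the multipart uploads still in progress on the bucket. This is a minimal sketch rather than part of the original reproduction: the helper name `pending_multipart_uploads` is hypothetical, and it reads only the first page of results, which is assumed to be enough for a test bucket with a single upload.

```python
def pending_multipart_uploads(s3_client, bucket):
    """Return (key, upload_id) pairs for uploads that are still in progress.

    Only the first page of the listing is read (no pagination handling).
    """
    response = s3_client.list_multipart_uploads(Bucket=bucket)
    # The "Uploads" field is absent when there are no in-progress uploads.
    return [(u["Key"], u["UploadId"]) for u in response.get("Uploads", [])]
```

Calling `pending_multipart_uploads(s3_client, "test1")` after the lifecycle debug interval has elapsed should return an empty list if the abort ran.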

With this setup, the abort never executes. The relevant code is:

  uint32_t current_shard;
  if (shard_id >= 0) {
    current_shard = shard_id;
  } else if (my_start.empty()) {
    current_shard = 0u;
  } else {
    current_shard =
      rgw_bucket_shard_index(my_start.name, num_shards);
      // <== The old code takes this branch and computes current_shard = 101,
      // so objects in shards 101 and above are aborted, but the object in
      // shard 39 is never reached.
  }
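
The effect of that starting shard can be illustrated with a toy model. This is only a sketch under a stated assumption: the listing walks shards in increasing order from `current_shard` to the last shard, which is how the behavior is described above. It does not call or reproduce any RGW code, and `shards_visited` is a hypothetical name.

```python
NUM_SHARDS = 128  # matches rgw_override_bucket_index_max_shards in the repro


def shards_visited(start_shard, num_shards=NUM_SHARDS):
    """Shards a listing covers when it walks from start_shard to the last shard."""
    return range(start_shard, num_shards)


# Starting shard derived from the marker name, as in the report (101).
from_marker = shards_visited(101)
# A listing that starts at the first shard instead.
from_zero = shards_visited(0)

print(39 in from_marker, 39 in from_zero)  # False True
```

Under this model, any entry whose shard number is below the starting shard is never examined, which matches the report that the upload in shard 39 is not aborted.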

#3 Updated by Eric Ivancich 6 months ago

  • Pull request ID set to 26658

#4 Updated by Eric Ivancich 6 months ago

  • Status changed from In Progress to Need Review

#5 Updated by Eric Ivancich 6 months ago

  • Status changed from Need Review to Pending Backport

#6 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #38711: mimic: rgw: abort multipart in lifecycle when enable index shard added

#7 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #38712: luminous: rgw: abort multipart in lifecycle when enable index shard added

#8 Updated by Nathan Cutler about 1 month ago

  • Project changed from Ceph to rgw
