Bug #43756


An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown

Added by Manuel Rios over 4 years ago. Updated about 4 years ago.

Status: Triaged
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi RGW Team,

We have spent the last 7 days trying to solve a metering problem in the buckets.

Right now it looks like lifecycle (LC) processing is not able to purge/delete some objects, possibly due to a parsing problem.

Let me post some information:

radosgw-admin user stats --uid=XXXXX
{
    "stats": {
        "total_entries": 22817077,
        "total_bytes": 164278075532090,
        "total_bytes_rounded": 164325122670592
    },
    "last_stats_sync": "2020-01-21 19:32:02.231796Z",
    "last_stats_update": "2020-01-22 14:48:30.915696Z" 
}

Approximately 164 TB of usage.
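As a sanity check, the `total_bytes` figure above converts to roughly that number (a quick Python sketch using the values from the stats output):

```python
# Convert the total_bytes reported by `radosgw-admin user stats`
# into decimal TB and binary TiB for comparison with external tools.
total_bytes = 164_278_075_532_090  # "total_bytes" from the output above

tb = total_bytes / 10**12   # decimal terabytes (what most dashboards show)
tib = total_bytes / 2**40   # binary tebibytes

print(f"{tb:.2f} TB / {tib:.2f} TiB")  # ~164.28 TB / ~149.41 TiB
```

Note the ~15 TB spread between the decimal and binary readings; tools that mix the two units make the metering discrepancy below harder to reason about.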

The customer has about 57 buckets of different sizes; I'm going to post just one.

radosgw-admin bucket stats --bucket=Evol6
{
    "bucket": "Evol6",
    "tenant": "",
    "zonegroup": "4d8c7c5f-ca40-4ee3-b5bb-b2cad90bd007",
    "placement_rule": "default-placement",
    "explicit_placement": {
        "data_pool": "default.rgw.buckets.data",
        "data_extra_pool": "default.rgw.buckets.non-ec",
        "index_pool": "default.rgw.buckets.index" 
    },
    "id": "48efb8c3-693c-4fe0-bbe4-fdc16f590a82.132873679.2",
    "marker": "48efb8c3-693c-4fe0-bbe4-fdc16f590a82.3886182.52",
    "index_type": "Normal",
    "owner": "xxxxxx",
    "ver": "0#91266,1#60635,2#80715,3#78528",
    "master_ver": "0#0,1#0,2#0,3#0",
    "mtime": "2020-01-21 22:38:31.437616Z",
    "max_marker": "0#,1#,2#,3#",
    "usage": {
        "rgw.main": {
            "size": 9107173119747,
            "size_actual": 9107345551360,
            "size_utilized": 9107173119747,
            "size_kb": 8893723750,
            "size_kb_actual": 8893892140,
            "size_kb_utilized": 8893723750,
            "num_objects": 180808
        },
        "rgw.multimeta": {
            "size": 0,
            "size_actual": 0,
            "size_utilized": 3807,
            "size_kb": 0,
            "size_kb_actual": 0,
            "size_kb_utilized": 4,
            "num_objects": 141
        }
    },
    "bucket_quota": {
        "enabled": false,
        "check_on_raw": false,
        "max_size": -1024,
        "max_size_kb": 0,
        "max_objects": -1
    }
}

Current size is approximately 9 TB, while external tools like S3 Browser and CloudBerry Explorer for Amazon S3 report 7 TB.
That is a considerable difference, and it is not metadata overhead.
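The gap is consistent with unreferenced multipart parts: the bucket index counts uploaded parts, while plain S3 listings only show completed objects. A rough check (assuming the external tools' 7 TB figure is decimal):

```python
# rgw.main "size" from `radosgw-admin bucket stats --bucket=Evol6`
indexed_bytes = 9_107_173_119_747
# Approximate figure reported by S3 Browser / CloudBerry (assumed decimal TB)
reported_bytes = 7 * 10**12

gap_tb = (indexed_bytes - reported_bytes) / 10**12
print(f"unaccounted: ~{gap_tb:.2f} TB")  # ~2.11 TB, plausibly incomplete multipart parts
```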

Checking with the AWS CLI we found incomplete multipart uploads, which is normal because the customer backs up thousands of remote computers and uses Ceph as the backend.

I found a small script to abort all multipart uploads using the AWS CLI.

BUCKETNAME=Evol6
aws --endpoint=http://XXXXXX:7480 --profile=ceph s3api list-multipart-uploads --bucket $BUCKETNAME \
| jq -r '.Uploads[] | "--key \"\(.Key)\" --upload-id \(.UploadId)"' \
| while read -r line; do
    eval "aws --endpoint=http://XXXXXXXX:7480 --profile=ceph s3api abort-multipart-upload --bucket $BUCKETNAME $line";
done
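The jq filter above can also be reproduced in Python, which makes it easier to inspect exactly what gets passed to `abort-multipart-upload` (a sketch; the sample JSON mirrors the shape of the `list-multipart-uploads` response and is hypothetical data, not taken from the real bucket):

```python
import json

# Minimal sample mirroring `aws s3api list-multipart-uploads` output
# (hypothetical key and upload id, same shape as the real response).
listing = json.loads("""
{
  "Uploads": [
    {"Key": "backups/disk1/431.cbrevision",
     "UploadId": "2~T7G76R09Pn-267VMbY8cjvZl_BHqfTx"}
  ]
}
""")

# One (key, upload_id) pair per pending upload -- exactly what the
# jq filter feeds into `abort-multipart-upload`.
pending = [(u["Key"], u["UploadId"]) for u in listing["Uploads"]]

for key, upload_id in pending:
    print(f"abort-multipart-upload --key {key!r} --upload-id {upload_id}")
```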

Every abort attempt produces the same error:

An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown

(repeated for every upload in the list)

Checking the multipart list:

{
    "Initiator": {
        "DisplayName": "xxxxx",
        "ID": "xxxxx"
    },
    "Initiated": "2019-12-03T02:00:50.589Z",
    "UploadId": "2~T7G76R09Pn-267VMbY8cjvZl_BHqfTx",
    "StorageClass": "STANDARD",
    "Key": "MBS-da43656f-2b8c-464f-b341-03fdbdf446ae/CBB_SRV2K12/CBB_VM/192.168.0.197/SRV2K12/Hard disk 1$/20191203010516/431.cbrevision",
    "Owner": {
        "DisplayName": "xxxx",
        "ID": "xxxxx"
    }
}, {
    "Initiator": {
        "DisplayName": "xxxxx",
        "ID": "xxxx"
    },
    "Initiated": "2019-12-03T01:23:06.007Z",
    "UploadId": "2~r0BMPPs8CewVZ6Qheu1s9WzaBn7bBvU",
    "StorageClass": "STANDARD",
    "Key": "MBS-da43656f-2b8c-464f-b341-03fdbdf446ae/CBB_SRV2K12/CBB_VM/192.168.0.197/SRV2K12/Hard disk 1$/20191203010516/431.cbrevision",
    "Owner": {
        "DisplayName": "xxxxx",
        "ID": "xxxxx"
    }
}

Maybe the internal parsing of the "1$" characters is causing a problem in the LC processing that prevents these uploads from being purged.
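Independently of any server-side parsing, keys like this are also fragile on the client side: the `eval` in the loop above re-parses the key through the shell, so characters like `$`, backticks, or quotes can be mangled. Quoting each key defensively avoids that (a Python sketch using `shlex.quote`; this does not claim anything about how radosgw parses keys internally):

```python
import shlex

# The real key from the multipart listing above; note the "$" in "Hard disk 1$".
key = ("MBS-da43656f-2b8c-464f-b341-03fdbdf446ae/CBB_SRV2K12/CBB_VM/"
       "192.168.0.197/SRV2K12/Hard disk 1$/20191203010516/431.cbrevision")

# shlex.quote wraps the key in single quotes, so the shell passes it
# through verbatim -- no $-expansion, no word splitting.
safe = shlex.quote(key)
print(f"--key {safe}")
```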

The main problem with this issue is the huge difference between the completed objects that all the external tools show and the internal storage metering.

Additionally, to help this type of customer, we added an LC policy that for some reason fails but shows as completed.

s3cmd getlifecycle s3://Evol6 --no-ssl
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
        <Rule>
                <ID>Incomplete Multipart Uploads</ID>
                <Prefix/>
                <Status>Enabled</Status>
                <AbortIncompleteMultipartUpload>
                        <DaysAfterInitiation>1</DaysAfterInitiation>
                </AbortIncompleteMultipartUpload>
        </Rule>
</LifecycleConfiguration>
radosgw-admin lc list
[
    {
        "bucket": ":Evol6:48efb8c3-693c-4fe0-bbe4-fdc16f590a82.3886182.52",
        "status": "COMPLETE" 
    }
]

Obviously COMPLETE is not the correct status, because the multipart listing still shows about 157 incomplete uploads.

I would appreciate any help and ideas.

Best Regards


Related issues: 1 (0 open, 1 closed)

Related to rgw - Bug #43583: rgw: unable to abort multipart upload after the bucket got resharded (Resolved, J. Eric Ivancich)
