Bug #43756
An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown
Status: open
Description
Hi RGW Team,
We have spent the last 7 days trying to solve a metering problem in our buckets.
Right now it looks like lifecycle (LC) processing is not able to purge/delete some objects, possibly due to a parsing problem.
Here is some information:
radosgw-admin user stats --uid=XXXXX
{
"stats": {
"total_entries": 22817077,
"total_bytes": 164278075532090,
"total_bytes_rounded": 164325122670592
},
"last_stats_sync": "2020-01-21 19:32:02.231796Z",
"last_stats_update": "2020-01-22 14:48:30.915696Z"
}
Approximately 164 TB of usage.
The customer has about 57 buckets of different sizes; I am going to post just one.
radosgw-admin bucket stats --bucket=Evol6
{
"bucket": "Evol6",
"tenant": "",
"zonegroup": "4d8c7c5f-ca40-4ee3-b5bb-b2cad90bd007",
"placement_rule": "default-placement",
"explicit_placement": {
"data_pool": "default.rgw.buckets.data",
"data_extra_pool": "default.rgw.buckets.non-ec",
"index_pool": "default.rgw.buckets.index"
},
"id": "48efb8c3-693c-4fe0-bbe4-fdc16f590a82.132873679.2",
"marker": "48efb8c3-693c-4fe0-bbe4-fdc16f590a82.3886182.52",
"index_type": "Normal",
"owner": "xxxxxx",
"ver": "0#91266,1#60635,2#80715,3#78528",
"master_ver": "0#0,1#0,2#0,3#0",
"mtime": "2020-01-21 22:38:31.437616Z",
"max_marker": "0#,1#,2#,3#",
"usage": {
"rgw.main": {
"size": 9107173119747,
"size_actual": 9107345551360,
"size_utilized": 9107173119747,
"size_kb": 8893723750,
"size_kb_actual": 8893892140,
"size_kb_utilized": 8893723750,
"num_objects": 180808
},
"rgw.multimeta": {
"size": 0,
"size_actual": 0,
"size_utilized": 3807,
"size_kb": 0,
"size_kb_actual": 0,
"size_kb_utilized": 4,
"num_objects": 141
}
},
"bucket_quota": {
"enabled": false,
"check_on_raw": false,
"max_size": -1024,
"max_size_kb": 0,
"max_objects": -1
}
}
The current size is approximately 9 TB. External tools such as S3 Browser and Cloudberry Amazon S3 Explorer report 7 TB.
That is a considerable difference, and it is not metadata overhead.
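To rule out a unit mismatch (decimal TB versus binary TiB) as the source of the gap, the byte counts above can be converted both ways. This is only a sketch: bytes_to_tb is a hypothetical helper name, and which unit the external tools display is an assumption that was not verified here.

```shell
# Convert a byte count to decimal terabytes (TB) and binary tebibytes (TiB).
# bytes_to_tb is a hypothetical helper, not part of any Ceph or AWS tool.
bytes_to_tb() {
    awk -v b="$1" 'BEGIN { printf "%.2f TB / %.2f TiB\n", b / 1e12, b / 2^40 }'
}

bytes_to_tb 9107173119747      # rgw.main "size" of bucket Evol6
bytes_to_tb 164278075532090    # "total_bytes" from the user stats
```

For the bucket this gives about 9.11 TB or 8.28 TiB, so even in binary units the figure stays well above the 7 TB that the external tools report.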
Checking with the AWS CLI, we found incomplete multipart uploads. This is expected, because the customer backs up thousands of remote computers and uses Ceph as the backend.
I found a small script to abort all multipart uploads using the AWS CLI.
BUCKETNAME=Evol6
aws --endpoint=http://XXXXXX:7480 --profile=ceph s3api list-multipart-uploads --bucket $BUCKETNAME \
| jq -r '.Uploads[] | "--key \"\(.Key)\" --upload-id \(.UploadId)"' \
| while read -r line; do
    eval "aws --endpoint=http://XXXXXXXX:7480 --profile=ceph s3api abort-multipart-upload --bucket $BUCKETNAME $line";
done
Every abort call returns the same error:
An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown
An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown
An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown
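To rule out client-side quoting as a cause, a variant of the script above that avoids eval and jq might look like the following. It is a sketch, not a verified fix: ENDPOINT and PROFILE are placeholders, abort_all_multipart is a hypothetical function name, and it assumes the AWS CLI's --query and --output text options print one tab-separated "Key, UploadId" pair per line. Because the key is passed as a single quoted argument, characters such as "$" and spaces are never re-interpreted by the shell.

```shell
#!/usr/bin/env bash
# Sketch: abort every incomplete multipart upload in a bucket without eval.
# ENDPOINT and PROFILE are placeholders (assumptions), not values from this report.
ENDPOINT="${ENDPOINT:-http://XXXXXX:7480}"
PROFILE="${PROFILE:-ceph}"

abort_all_multipart() {
    local bucket="$1" key upload_id tab
    tab="$(printf '\t')"
    # --output text prints one upload per line as "<Key><TAB><UploadId>".
    aws --endpoint-url "$ENDPOINT" --profile "$PROFILE" s3api list-multipart-uploads \
        --bucket "$bucket" --query 'Uploads[].[Key,UploadId]' --output text |
    while IFS="$tab" read -r key upload_id; do
        # An empty listing is printed as "None"; skip lines without an upload ID.
        [ -n "$upload_id" ] || continue
        aws --endpoint-url "$ENDPOINT" --profile "$PROFILE" s3api abort-multipart-upload \
            --bucket "$bucket" --key "$key" --upload-id "$upload_id" </dev/null \
            || echo "abort failed: key=$key upload-id=$upload_id" >&2
    done
}

# Usage: abort_all_multipart Evol6
```

If this variant returns the same NoSuchUpload error, the problem is more likely on the RGW side than in how the shell passes the key.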
Checking the multipart list:
{
"Initiator": {
"DisplayName": "xxxxx",
"ID": "xxxxx"
},
"Initiated": "2019-12-03T02:00:50.589Z",
"UploadId": "2~T7G76R09Pn-267VMbY8cjvZl_BHqfTx",
"StorageClass": "STANDARD",
"Key": "MBS-da43656f-2b8c-464f-b341-03fdbdf446ae/CBB_SRV2K12/CBB_VM/192.168.0.197/SRV2K12/Hard disk 1$/20191203010516/431.cbrevision",
"Owner": {
"DisplayName": "xxxx",
"ID": "xxxxx"
}
},
{
"Initiator": {
"DisplayName": "xxxxx",
"ID": "xxxx"
},
"Initiated": "2019-12-03T01:23:06.007Z",
"UploadId": "2~r0BMPPs8CewVZ6Qheu1s9WzaBn7bBvU",
"StorageClass": "STANDARD",
"Key": "MBS-da43656f-2b8c-464f-b341-03fdbdf446ae/CBB_SRV2K12/CBB_VM/192.168.0.197/SRV2K12/Hard disk 1$/20191203010516/431.cbrevision",
"Owner": {
"DisplayName": "xxxxx",
"ID": "xxxxx"
}
}
Maybe the internal parsing of the "$" character (as in "Hard disk 1$") is causing a problem in LC processing that prevents these uploads from being purged.
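A first step toward testing this hypothesis is to count how many pending uploads have "$" in the key and how many do not. If uploads without "$" also fail with NoSuchUpload, the character is probably not the cause. The helper below is a sketch with a hypothetical name (count_dollar_keys); it assumes its input is the tab-separated "Key, UploadId" listing that the AWS CLI prints with --output text.

```shell
# Reads "<Key><TAB><UploadId>" lines on stdin and counts keys with and without "$".
# count_dollar_keys is a hypothetical helper name.
count_dollar_keys() {
    awk -F'\t' '
        NF < 2          { next }            # skip "None" or blank lines
        index($1, "$")  { hit++; next }     # key contains a literal dollar sign
                        { miss++ }
        END { printf "with_dollar=%d without_dollar=%d\n", hit, miss }
    '
}

# Usage (ENDPOINT and PROFILE are placeholders):
# aws --endpoint-url "$ENDPOINT" --profile "$PROFILE" s3api list-multipart-uploads \
#     --bucket Evol6 --query 'Uploads[].[Key,UploadId]' --output text | count_dollar_keys
```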
The main problem with this issue is the huge difference between the completed files shown in all external tools and the internal storage metering.
Additionally, to help this type of customer, we added an LC policy that for some reason fails but shows as completed.
s3cmd getlifecycle s3://Evol6 --no-ssl
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Rule>
<ID>Incomplete Multipart Uploads</ID>
<Prefix/>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>1</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
radosgw-admin lc list
[
{
"bucket": ":Evol6:48efb8c3-693c-4fe0-bbe4-fdc16f590a82.3886182.52",
"status": "COMPLETE"
}
]
Obviously COMPLETE is not the correct status, because the multipart listing still shows about 157 incomplete uploads.
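To make the mismatch reproducible, the number of pending uploads can be compared before and after a manual lifecycle run. The following is a sketch under assumptions: ENDPOINT and PROFILE are placeholders, the function names are hypothetical, and it assumes "radosgw-admin lc process" runs lifecycle processing immediately rather than waiting for the scheduled window. With DaysAfterInitiation set to 1, uploads initiated on 2019-12-03 should qualify for removal.

```shell
#!/usr/bin/env bash
# Sketch: compare the pending multipart upload count before and after an LC run.
# ENDPOINT and PROFILE are placeholders (assumptions), not values from this report.
ENDPOINT="${ENDPOINT:-http://XXXXXX:7480}"
PROFILE="${PROFILE:-ceph}"

# Print the number of incomplete multipart uploads in bucket $1.
count_incomplete_uploads() {
    aws --endpoint-url "$ENDPOINT" --profile "$PROFILE" s3api list-multipart-uploads \
        --bucket "$1" --query 'Uploads[].[Key,UploadId]' --output text |
    awk -F'\t' 'NF >= 2 { n++ } END { print n + 0 }'
}

lc_recheck() {
    local bucket="$1"
    echo "before: $(count_incomplete_uploads "$bucket")"
    radosgw-admin lc process    # trigger lifecycle processing manually
    radosgw-admin lc list       # per-bucket LC status after the run
    echo "after: $(count_incomplete_uploads "$bucket")"
}

# Usage: lc_recheck Evol6
```

If the status is COMPLETE and the "after" count is unchanged, that confirms the rule is being reported as processed without aborting the uploads.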
I would appreciate any help and ideas.
Best Regards