Bug #21152
Ceph RGW: how to clean up an unfinished multipart upload
Status: Closed
Description
Hi,
I have Ceph Jewel (10.2.9-1~bpo80+1) running on Debian 8. I have found that RGW doesn't clean up orphaned objects.
Here is my scenario:
1. Upload a large file (through s3cmd v1.6).
2. In the middle of the upload, cancel it (Ctrl+C):
$ s3cmd put /tmp/dummy1 s3://test3/dummy13
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 1 of 69, 15MB]
 15728640 of 15728640   100% in    0s    28.09 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 2 of 69, 15MB]
 15728640 of 15728640   100% in    0s    29.54 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 3 of 69, 15MB]
 15728640 of 15728640   100% in    0s    30.54 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 4 of 69, 15MB]
 15728640 of 15728640   100% in    0s    23.51 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 5 of 69, 15MB]
 65536 of 15728640     0% in    0s   395.55 kB/s^C
ERROR: Upload of '/tmp/dummy1' part 5 failed. Use
  ./s3cmd abortmp s3://test3/dummy13 2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo
to abort the upload, or
  ./s3cmd --upload-id 2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo put ...
to continue the upload.
See ya!
3. Repeat steps (1) and (2):
$ s3cmd put /tmp/dummy1 s3://test3/dummy13
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 1 of 69, 15MB]
 15728640 of 15728640   100% in    0s    29.81 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 2 of 69, 15MB]
 15728640 of 15728640   100% in    0s    30.71 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 3 of 69, 15MB]
 15728640 of 15728640   100% in    0s    23.87 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 4 of 69, 15MB]
 15728640 of 15728640   100% in    0s    30.99 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 5 of 69, 15MB]
 15728640 of 15728640   100% in    0s    30.49 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 6 of 69, 15MB]
 65536 of 15728640     0% in    0s   449.04 kB/s^C
ERROR: Upload of '/tmp/dummy1' part 6 failed. Use
  ./s3cmd abortmp s3://test3/dummy13 2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf
to abort the upload, or
  ./s3cmd --upload-id 2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf put ...
to continue the upload.
See ya!

$ s3cmd put /tmp/dummy1 s3://test3/dummy13
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 1 of 69, 15MB]
 15728640 of 15728640   100% in    0s    29.43 MB/s  done
'/tmp/dummy1' -> 's3://test3/dummy13'  [part 2 of 69, 15MB]
 65536 of 15728640     0% in    0s   451.74 kB/s^C
ERROR: Upload of '/tmp/dummy1' part 2 failed. Use
  ./s3cmd abortmp s3://test3/dummy13 2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs
to abort the upload, or
  ./s3cmd --upload-id 2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs put ...
to continue the upload.
See ya!
4. List the files in the bucket; "dummy13" does not appear:
$ s3cmd ls s3://test3
2017-08-28 07:58   1048576   s3://test3/dummy3
2017-08-28 10:03   1048576   s3://test3/dummy9
5. List the objects in the data pool:
$ rados -p ceph-us-east-2.rgw.buckets.data ls | grep -v shadow | sort | grep dummy13
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo.1
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo.2
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo.3
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo.4
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo.5
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf.1
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf.2
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf.3
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf.4
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf.5
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs.1
a7d1a615-622a-4307-a726-4304bdd56a5a.1354935.3__multipart_dummy13.2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs.2

$ ceph df
GLOBAL:
    SIZE     AVAIL     RAW USED     %RAW USED
    111G     105G      6371M        5.56
POOLS:
    NAME                                 ID     USED      %USED     MAX AVAIL     OBJECTS
    ...
    ceph-us-east-2.rgw.control           67     0         0         33896M        8
    ceph-us-east-2.rgw.data.root         68     1959      0         33896M        6
    ceph-us-east-2.rgw.gc                69     0         0         33896M        1000
    ceph-us-east-2.rgw.log               70     50        0         33896M        134
    ceph-us-east-2.rgw.users.uid         71     328       0         33896M        2
    ceph-us-east-2.rgw.users.keys        72     13        0         33896M        1
    ceph-us-east-2.rgw.buckets.index     73     0         0         33896M        3
    ceph-us-east-2.rgw.buckets.non-ec    74     0         0         33896M        9
    ceph-us-east-2.rgw.buckets.data      75     1832M     5.13      33896M        493
    ...
On step 5 there are "dummy13" objects with three different upload-id suffixes (2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo, 2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf, and 2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs). Those objects still exist even after the gc process has run.
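For reference, the gc queue can be inspected directly; as far as I understand it, these parts are not expected to show up there, since RGW only schedules multipart parts for garbage collection once the upload is aborted or completed. A minimal sketch:

# Show all gc entries, including those whose expiration has not passed yet
radosgw-admin gc list --include-all

# Force a gc pass
radosgw-admin gc process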
Is there a way to clean it up?
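For the record, the pending uploads are also visible from the S3 side via the ListMultipartUploads API; a minimal sketch with the same client, which should also print the upload ids needed for cleanup:

# List in-progress multipart uploads in the bucket
s3cmd multipart s3://test3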
Updated by Casey Bodley over 2 years ago
- Status changed from New to Closed
we support the S3 AbortMultipartUpload API (https://docs.aws.amazon.com/AmazonS3/latest/API/API_AbortMultipartUpload.html) to clean up incomplete multipart uploads. the s3cmd output is even telling you what command to run to abort or retry the upload:
Upload of '/tmp/dummy1' part 2 failed. Use
./s3cmd abortmp s3://test3/dummy13 2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs
to abort the upload, or
./s3cmd --upload-id 2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs put ...
to continue the upload.
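Applying that here, each of the three dangling uploads can be aborted by key and upload id; a minimal sketch using the ids from the report above:

# Abort each incomplete upload (bucket/key/upload-id taken from the error messages above)
s3cmd abortmp s3://test3/dummy13 2~__6VOBRl4oK54ZyUYlbNzxI22mS4YNo
s3cmd abortmp s3://test3/dummy13 2~ErUMnOWeyPd-NAsDm-jz_ikXd1DZEaf
s3cmd abortmp s3://test3/dummy13 2~msRDcNOcefjMQdIuEGSnqjWgSVbvyHs

After the aborts, the leftover __multipart_ (and any __shadow_) objects are scheduled for deletion and should disappear from the data pool once the garbage collector processes them.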