Bug #16767


RadosGW Multipart Cleanup Failure

Added by Brian Felton almost 8 years ago. Updated 4 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version:
% Done: 100%
Source: other
Tags: rgw multipart backport_processed
Backport: quincy pacific reef
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

My current setup is a Ceph Hammer cluster running 0.94.6. The rest of the cluster details are irrelevant to this issue.

I've stumbled upon an issue whereby RGW is not cleaning up properly after a multipart upload is finished (whether completed or aborted). If a client re-uploads a part during a multipart upload, Ceph stores both the original and the new part, but only the latter part is valid when POSTing the CompleteMultipartUpload XML payload. When the multipart upload is completed, only the initial parts are removed from the bucket; when it is aborted, only the re-uploaded parts are. The leftover parts are orphaned and are not (easily) removable.

To reproduce:

First, create four 5MiB files with unique md5 sums:

dd if=/dev/urandom of=/tmp/part1.1 bs=1M count=5
dd if=/dev/urandom of=/tmp/part1.2 bs=1M count=5
dd if=/dev/urandom of=/tmp/part2.1 bs=1M count=5
dd if=/dev/urandom of=/tmp/part2.2 bs=1M count=5
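
Optionally, confirm that all four checksums really are distinct (md5sum assumed to be available):

md5sum /tmp/part1.1 /tmp/part1.2 /tmp/part2.1 /tmp/part2.2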

Next, initiate a multipart upload:

s3curl --id test -- -X POST "http://ceph.cluster/bucket/mptest?uploads"
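
The response is an InitiateMultipartUploadResult document containing the UploadId used in all of the requests below (shown here as 2~whateverid). Piping that response through xmlstarlet, which is also used later in this report, is one way to pull it out; something along these lines should work:

s3curl --id test -- -X POST "http://ceph.cluster/bucket/mptest?uploads" | xmlstarlet sel -t -v "//*[local-name()='UploadId']" -n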

Upload the parts:

s3curl --id test --put /tmp/part1.1 -- "http://ceph.cluster/bucket/mptest?partNumber=1&uploadId=2~whateverid"
s3curl --id test --put /tmp/part1.2 -- "http://ceph.cluster/bucket/mptest?partNumber=2&uploadId=2~whateverid"
s3curl --id test --put /tmp/part2.1 -- "http://ceph.cluster/bucket/mptest?partNumber=1&uploadId=2~whateverid"
s3curl --id test --put /tmp/part2.2 -- "http://ceph.cluster/bucket/mptest?partNumber=2&uploadId=2~whateverid"

Now, let's take a look at what RGW says about the bucket:

radosgw-admin bucket list --bucket=bucket | grep -A7 mptest | grep -v owner | grep -v instance

        "name": "mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.1",
        "namespace": "multipart",
        "size": 5242880,
        "mtime": "2016-07-21 18:43:15.000000Z",
        "etag": "785dec7eeb68366cca5c19cec86c508b",
--
        "name": "mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.2",
        "namespace": "multipart",
        "size": 5242880,
        "mtime": "2016-07-21 18:43:24.000000Z",
        "etag": "b11c15f456f17ba763d0fb900d22376c",
--
        "name": "mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.meta",
        "namespace": "multipart",
        "size": 0,
        "mtime": "2016-07-21 18:43:00.000000Z",
        "etag": "",
--
        "name": "mptest.feXQAxbcmjR1WdN_-b-jj1BKcObJ3Q6.2",
        "namespace": "multipart",
        "size": 5242880,
        "mtime": "2016-07-21 18:43:39.000000Z",
        "etag": "2d26aa403bc759305d0ea61d29f17cd0",
--
        "name": "mptest.i0q6uZ-do4mYoW7z5z8JDAQitcGJ5No.1",
        "namespace": "multipart",
        "size": 5242880,
        "mtime": "2016-07-21 18:43:31.000000Z",
        "etag": "a9fdb9efe0722f6e61d5d4ff3dfe0e81",

So we now have a meta object whose name contains the upload id, the first two uploaded parts (which also carry the upload id in their names), and the two re-uploaded parts, whose names use a different random prefix instead of the upload id.

Now, let's list the available parts associated with the id:

./s3curl --id test -- "http://ceph.cluster/bucket/mptest?uploadId=2~whateverid" | xmlstarlet fo

<?xml version="1.0" encoding="UTF-8"?>
<ListPartsResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Bucket>bucket</Bucket>
  <Key>mptest</Key>
  <UploadId>2~whateverid</UploadId>
...
  <Owner>
    <ID>7e1af43925cbef79334d2da290d602d586d04d7dd9aeb970c95ab93c0641c1f4</ID>
    <DisplayName>t3os_test</DisplayName>
  </Owner>
  <Part>
    <LastModified>2016-07-21T18:43:31.000Z</LastModified>
    <PartNumber>1</PartNumber>
    <ETag>a9fdb9efe0722f6e61d5d4ff3dfe0e81</ETag>
    <Size>5242880</Size>
  </Part>
  <Part>
    <LastModified>2016-07-21T18:43:39.000Z</LastModified>
    <PartNumber>2</PartNumber>
    <ETag>2d26aa403bc759305d0ea61d29f17cd0</ETag>
    <Size>5242880</Size>
  </Part>
</ListPartsResult>

We see here that the available parts are the last two uploaded. So far, so good.

Now, let's go ahead and complete this thing.

{builds valid CompleteMultipartUpload document}
./s3curl --id test --post mp.test -- "http://ceph.cluster/bucket/mptest?uploadId=2~whateverid"
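
For reference, mp.test here is the standard CompleteMultipartUpload document built from the part numbers and ETags that ListParts returned above; roughly:

<CompleteMultipartUpload>
  <Part>
    <PartNumber>1</PartNumber>
    <ETag>a9fdb9efe0722f6e61d5d4ff3dfe0e81</ETag>
  </Part>
  <Part>
    <PartNumber>2</PartNumber>
    <ETag>2d26aa403bc759305d0ea61d29f17cd0</ETag>
  </Part>
</CompleteMultipartUpload>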

Great success! I can now download the object, and it is the valid concatenation of the last two parts I uploaded.

Now, however, let's take a look at our bucket:

radosgw-admin bucket list --bucket=bucket | grep -A7 mptest | grep -v owner | grep -v instance
        "name": "mptest.feXQAxbcmjR1WdN_-b-jj1BKcObJ3Q6.2",
        "namespace": "multipart",
        "size": 5242880,
        "mtime": "2016-07-21 18:43:39.000000Z",
        "etag": "2d26aa403bc759305d0ea61d29f17cd0",
--
        "name": "mptest.i0q6uZ-do4mYoW7z5z8JDAQitcGJ5No.1",
        "namespace": "multipart",
        "size": 5242880,
        "mtime": "2016-07-21 18:43:31.000000Z",
        "etag": "a9fdb9efe0722f6e61d5d4ff3dfe0e81",
--
        "name": "mptest",
        "namespace": "",
        "size": 10485760,
        "mtime": "2016-07-21 18:52:23.000000Z",
        "etag": "39967388ccf40f9570e7f3154549e589-2",

Upon completing the upload, only the two parts tagged with the upload id are removed from the bucket listing; their underlying rados objects are not cleaned up. If I list out the .rgw.buckets pool, I can confirm that the objects for all of the parts are still present:

rados -p .rgw.buckets ls | grep mptest

default.7754.6__shadow_mptest.feXQAxbcmjR1WdN_-b-jj1BKcObJ3Q6.2_1
default.7754.6_mptest
default.7754.6__multipart_mptest.feXQAxbcmjR1WdN_-b-jj1BKcObJ3Q6.2
default.7754.6__multipart_mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.2
default.7754.6__shadow_mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.2_1
default.7754.6__multipart_mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.1
default.7754.6__shadow_mptest.i0q6uZ-do4mYoW7z5z8JDAQitcGJ5No.1_1
default.7754.6__shadow_mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.1_1
default.7754.6__multipart_mptest.i0q6uZ-do4mYoW7z5z8JDAQitcGJ5No.1

Aborting the upload yields similar results, except in reverse. In the abort case, the files that contain the upload id in the name will be retained, but the other files will be properly removed.
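
For completeness, the abort here is the usual S3 AbortMultipartUpload call, i.e. something along the lines of:

./s3curl --id test -- -X DELETE "http://ceph.cluster/bucket/mptest?uploadId=2~whateverid"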

For small multipart uploads like this, the additional space used is trivial. But in our actual cluster we have clients uploading considerably larger files, and they are noticing that their bucket utilization is tens of TB larger than the sum of the objects they can list. The leftover parts are not removed by garbage collection, and are generally only removable through a very slow process of listing the omap contents of the bucket index shards in .rgw.buckets.index and cleaning up, by hand, the entries and rados objects that are no longer referenced.
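
A rough sketch of that manual process, assuming the bucket's marker is default.7754.6 (as in the rados listing above) and an unsharded index object named .dir.default.7754.6; the actual matching of index keys to data-pool objects is done by eye or with ad-hoc scripting:

# list what the bucket index still knows about
rados -p .rgw.buckets.index listomapkeys .dir.default.7754.6
# list what is actually stored for this bucket in the data pool
rados -p .rgw.buckets ls | grep '^default.7754.6_'
# after verifying that an object is no longer referenced anywhere, remove it by hand, e.g.
rados -p .rgw.buckets rm default.7754.6__multipart_mptest.2~o2LrKVtYqA_cwHAypOprHT-ANmTeH4S.1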


Related issues 7 (3 open, 4 closed)

Related to rgw - Bug #44660: Multipart re-uploads cause orphan data (Pending Backport)
Related to rgw - Bug #58369: When uploading parts in multipart upload, use the "AbortMultipartUpload" interface to end the upload, and there will be data that cannot be cleaned (New) - J. Eric Ivancich
Related to rgw-testing - Bug #58780: scan for orphaned rados objects and index entries in rgw suite (Pending Backport) - J. Eric Ivancich
Has duplicate rgw - Bug #57942: rgw leaks rados objects when a part is submitted multiple times in a multipart upload (Duplicate)
Copied to rgw - Backport #59064: reef: RadosGW Multipart Cleanup Failure (Resolved) - Mykola Golub
Copied to rgw - Backport #59065: quincy: RadosGW Multipart Cleanup Failure (Resolved) - Mykola Golub
Copied to rgw - Backport #59066: pacific: RadosGW Multipart Cleanup Failure (Rejected)
