Project

General

Profile

Actions

Bug #17987

open

incorrect entry is removed when a part is upload more than once during multipart-upload

Added by wangsu-os wangsu-os over 7 years ago. Updated 13 days ago.

Status:
Fix Under Review
Priority:
Normal
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Here is the discussion

2016-11-19 2:25 GMT+08:00 Yehuda Sadeh-Weinraub <>:
I understand. Yeah, on the surface it looks right. Can you send a pull request?

Thanks,
Yehuda

On Thu, Nov 17, 2016 at 9:40 PM, Jeegn Chen <> wrote:

Hi Yehuda,

The diff against the master branch is attached.

After bd8e026f88b812cc70caf6232c247844df5d99bf was introduced, if we upload
the same part of an upload-id more than once, some storage may leaked in
both bucket index and the bucket data pool.
For example, I try to upload 20MB.bin by multipart upload with 4MB parts
(Ceph 0.94). Say I upload the first part 4 times
The 1st time, multipart_20MB.bin.<upload-id>.1 will appears in the bucket
data pool and the bucket index
The 2nd time, multipart_20MB.bin.<upload-id>.1 is replaced with
multipart_20MB.bin.<random-1>.1
The 3rd time, multipart_20MB.bin.<upload-id>.1 will appears in the bucket
data pool and the bucket index again
The 4th time, multipart_20MB.bin.<upload-id>.1 is replaced with
multipart_20MB.bin.<random-2>.1
Later, I continue to upload other parts as usual and finally complete.
RGWCompleteMultipart::execute() will clean up the multipart entries in the
bucket index accordingly and replace with the final 20MB.bin entry.
Then multipart_20MB.bin.<random-1>.1 is actually leaked but
multipart_20MB.bin.<random-2>.1 is used by the final 20MB.bin file.
So I think whether we can go through the bucket index to find out the leaked
multipart_20MB.bin.<random-1>.1 and delete it.
But according to my experiment, both multipart_20MB.bin.<random-1>.1 and
multipart_20MB.bin.<random-2>.1 remain in the bucket index and the
information in the bucket index give no affirmative clue to tell which one
is still in use and which one is the leaked one.

After checking the code, my understanding is that
multipart_20MB.bin.<random-2>.1 entry in the bucket index is left behind by
mistake. I think the change in the diff may resolve the problem so that we
can use but bucket index to detect and release the leaked storage.

Thanks,
Jeegn

2016-11-18 0:31 GMT+08:00 Yehuda Sadeh-Weinraub <>:

On Thu, Nov 17, 2016 at 4:41 AM, Jeegn Chen <> wrote:

Hi Yehuda,

I'm tracking down some weird behavior in rgw multipart upload and notice
some logic in RGWCompleteMultipart::execute() that may be related to one
of
your commits.
"rgw: don't allow multiple writers to same multiobject part "

https://github.com/ceph/ceph/commit/bd8e026f88b812cc70caf6232c247844df5d99bf

I think code in RGWCompleteMultipart::execute() should need the
following
adjustment after you change was merged (Search random-alpha-numeric in
the
email to find where it is).

What do you think?

What's the problem that you're seeing, and can you provide the
following in a diff format? I'm having trouble finding out the exact
changes. (I think I know what you point at but just to make sure).

Thanks,
Yehuda

void RGWCompleteMultipart::execute() {
...
/* update manifest for part */
string oid = mp.get_part(obj_iter->second.num);
rgw_obj src_obj;
src_obj.init_ns(s->bucket, oid, mp_ns);

if (obj_part.manifest.empty()) {
ldout(s->cct, 0) << "ERROR: empty manifest for object part:
obj="
<< src_obj << dendl;
op_ret = -ERR_INVALID_PART;
return;
} else {
// mp.get_part() always produces the oid like
<obj-name>.<upload-id>.<num>
// However, it is not always true after
bd8e026f88b812cc70caf6232c247844df5d99bf is merged
//

https://github.com/ceph/ceph/commit/bd8e026f88b812cc70caf6232c247844df5d99bf
// When a part is uploaded more than once, the oid may be like
// <obj-name>.<random-alpha-numeric>.<num>
src_obj = obj_part.manifest.obj_begin().location;
manifest.append(obj_part.manifest);
}
...
rgw_obj_key remove_key;
src_obj.get_index_key(&remove_key);

remove_objs.push_back(remove_key);
}

Thanks,
Jeegn


Files

src_obj.diff (978 Bytes) src_obj.diff The diff mentioned in the thread wangsu-os wangsu-os, 11/22/2016 02:54 AM
Actions #2

Updated by Konstantin Shalygin 13 days ago

  • Status changed from New to Fix Under Review
  • Source set to Community (user)
  • Pull request ID set to 12152
Actions

Also available in: Atom PDF