Bug #17164

Orphan data gets leaked on Bucket deletion

Added by Praveen Kumar G T about 2 years ago. Updated about 1 month ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
Start date: 08/29/2016
Due date:
% Done: 0%
Source: other
Tags:
Backport: jewel
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

If a bucket contains orphan data left behind by a partial (incomplete) multipart upload, the RADOS objects corresponding to that multipart upload are leaked when the bucket is deleted.

Steps to reproduce
    1) Take a snapshot of the "rados df" output
    2) Take a snapshot of the items in "radosgw-admin gc list --include-all"
    3) Create a bucket
    4) Do a multipart upload without completing or cancelling it, thereby creating orphan data
    5) Remove the bucket
    6) Check for orphan data, then check the output of "rados df" and "radosgw-admin gc list --include-all"
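
The multipart step can be scripted. The following is a minimal sketch, assuming a boto3-style S3 client (its create_multipart_upload and upload_part calls); the function name start_orphan_upload is hypothetical and this is not the script used in the logs below.

```python
def start_orphan_upload(client, bucket, key, data, part_size):
    """Upload `data` in parts but never complete or abort the upload.

    `client` is assumed to be a boto3-style S3 client, for example
    boto3.client("s3", endpoint_url=...) pointed at the RGW endpoint.
    Returns the upload id and the number of parts sent.
    """
    upload_id = client.create_multipart_upload(Bucket=bucket, Key=key)["UploadId"]
    part_number = 0
    for offset in range(0, len(data), part_size):
        part_number += 1
        client.upload_part(
            Bucket=bucket,
            Key=key,
            PartNumber=part_number,
            UploadId=upload_id,
            Body=data[offset:offset + part_size],
        )
    # Deliberately no complete_multipart_upload / abort_multipart_upload call:
    # the uploaded parts stay behind as an incomplete (orphan) upload.
    return upload_id, part_number
```

With 21 MB of data and a 6 MB part size this sends four parts, which matches the four parts listed in the logs for this report.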

Logs for the simulation

Step 1 [ Output of rados df ]
    pool name                            KB  objects  clones  degraded  unfound       rd    rd KB      wr  wr KB
    .in-chennai-1.intent-log              0        0       0         0        0        0        0       0      0
    .in-chennai-1.log                     0        0       0         0        0        0        0       0      0
    .in-chennai-1.rgw                     0        0       0         0        0        0        0       0      0
    .in-chennai-1.rgw.buckets             0        0       0         0        0        0        0       0      0
    .in-chennai-1.rgw.buckets.extra       0        0       0         0        0        0        0       0      0
    .in-chennai-1.rgw.buckets.index       0        0       0         0        0        0        0       0      0
    .in-chennai-1.rgw.control             0        8       0         0        0        0        0       0      0
    .in-chennai-1.rgw.gc                  0      320       0         0        0  1382451  1382131  921634      0
    .in-chennai-1.usage                   0        0       0         0        0        0        0       0      0
    .in-chennai-1.users                   0        0       0         0        0        0        0       0      0
    .in-chennai-1.users.email             0        0       0         0        0        0        0       0      0
    .in-chennai-1.users.swift             0        0       0         0        0        0        0       0      0
    .in-chennai-1.users.uid               0        0       0         0        0        0        0       0      0
    .in-chennai.rgw.root                  2        2       0         0        0       15       12       3      4
    .in.rgw.root                          1        2       0         0        0       21       14       2      2
    total used 11018124 332
    total avail 23378484804
    total space 23389502928

There are zero objects in the rgw.buckets pool
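
A check like this can be automated by parsing the "rados df" text. The sketch below assumes the older whitespace-separated layout shown in this report (pool name followed by nine numeric columns); the helper name objects_per_pool is hypothetical, and the sample rows are copied from the output above.

```python
def objects_per_pool(rados_df_text):
    """Map pool name -> object count from plain-text `rados df` output.

    Assumes the layout shown in this report: a pool name followed by nine
    numeric columns, with `objects` as the second numeric column.
    """
    counts = {}
    for line in rados_df_text.splitlines():
        fields = line.split()
        # Pool rows have a name plus nine numbers; header and totals do not.
        if len(fields) == 10 and all(f.isdigit() for f in fields[1:]):
            counts[fields[0]] = int(fields[2])
    return counts


sample = """\
pool name KB objects clones degraded unfound rd rd KB wr wr KB
.in-chennai-1.rgw.buckets 0 0 0 0 0 0 0 0 0
.in-chennai-1.rgw.gc 0 320 0 0 0 1382451 1382131 921634 0
total used 11018124 332
"""
print(objects_per_pool(sample))
```

Running the same helper on the output captured after the bucket is removed and comparing the two dictionaries shows which pools gained objects.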

Step 2 [ Output of gc list ]
    [dev--mon~/] sudo radosgw-admin gc list --include-all
    []

There are zero items in GC list

Step 3 [ Create the bucket ]

[deb~/python] s3cmd ls

[deb~/python] s3cmd mb s3://praveen
Bucket 's3://praveen/' created

[deb~/python] s3cmd ls
2016-08-29 11:03 s3://praveen

[deb~/python] python orphan_find.c.py
main
<Bucket: praveen>
0

No orphans in the bucket

Step 4 [ Multipart upload done without completing it, hence we have orphan data ]

[deb~/python] python orphan_find.c.py
main
<Bucket: praveen>
[<Part 1>, <Part 2>, <Part 3>, <Part 4>]
<MultiPartUpload 10:25:09.084077>
21000000

Step 5 [ Remove the bucket ]

[deb~/python] s3cmd rb s3://praveen
Bucket 's3://praveen/' removed

Step 6 [ Check for orphan data ]

No orphan data

[deb~/python] python orphan_find.c.py
main
[deb~/python]

No items in GC list

[dev--mon~/] sudo radosgw-admin gc list --include-all
[]

Objects still exist in the rgw.buckets pool

[dev--mon~/] ./radosdf
pool name                            KB  objects  clones  degraded  unfound       rd    rd KB      wr  wr KB
.in-chennai-1.intent-log              0        0       0         0        0        0        0       0      0
.in-chennai-1.log                     0        3       0         0        0       18       15      36      0
.in-chennai-1.rgw                     0        0       0         0        0       18       14      14      3
.in-chennai-1.rgw.buckets         20508        7       0         0        0        8        4      57  20510
.in-chennai-1.rgw.buckets.extra       0        1       0         0        0        1        1       6      0
.in-chennai-1.rgw.buckets.index       0       32       0         0        0      148       84     126      0
.in-chennai-1.rgw.control             0        8       0         0        0        0        0       0      0
.in-chennai-1.rgw.gc                  0      320       0         0        0  1387520  1387200  924800      0
.in-chennai-1.usage                   0        1       0         0        0        9        9      18      0
.in-chennai-1.users                   1        1       0         0        0        2        1       1      1
.in-chennai-1.users.email             0        0       0         0        0        0        0       0      0
.in-chennai-1.users.swift             0        0       0         0        0        0        0       0      0
.in-chennai-1.users.uid               1        2       0         0        0       15       13       9      1
.in-chennai.rgw.root                  2        2       0         0        0       27       22       3      4
.in.rgw.root                          1        2       0         0        0       33       22       2      2
total used 11093112 379
total avail 23378409816
total space 23389502928

We have 7 objects in the rgw.buckets pool because I uploaded 21 MB with a chunk size of 6 MB. The RADOS object stripe size is 4 MB, hence we have 2 RADOS objects for each of the three 6 MB chunks and 1 RADOS object for the last 3 MB chunk: 7 objects in total.
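
The object count can be checked with a short calculation. This is a simplified sketch that assumes each part is striped independently into ceil(part length / stripe size) objects; the function name rados_object_count is hypothetical, and the real RGW object layout may differ in detail.

```python
MB = 1024 * 1024


def rados_object_count(total_size, part_size, stripe_size=4 * MB):
    """Estimate RADOS objects in the data pool for one multipart upload.

    Simplified model: every part is split into stripe_size pieces, so a
    part of length n occupies ceil(n / stripe_size) objects.
    """
    def ceil_div(a, b):
        return -(-a // b)

    full_parts, remainder = divmod(total_size, part_size)
    count = full_parts * ceil_div(part_size, stripe_size)
    if remainder:
        count += ceil_div(remainder, stripe_size)
    return count


print(rados_object_count(21 * MB, 6 * MB))  # 3 parts * 2 objects + 1 object = 7
```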


Related issues

Copied to rgw - Backport #21347: jewel: Orphan data gets leaked on Bucket deletion (Rejected)

History

#1 Updated by Praveen Kumar G T about 2 years ago

Fix.

The fix for the above issue is in the delete_bucket function in src/rgw/rgw_rados.cc:

https://github.com/ceph/ceph/blob/c19ecb05e457905225648ab4a6d59bfb284f5137/src/rgw/rgw_rados.cc#L7351

    for (eiter = ent_map.begin(); eiter != ent_map.end(); ++eiter) {
      obj = eiter->second.key;
      if (rgw_obj::translate_raw_obj_to_obj_in_ns(obj.name, instance, ns))
        return -ENOTEMPTY;
    }

We should return ENOTEMPTY if we find any kind of object, not just objects that are visible to users; i.e. if ent_map.size() is not zero, we should return ENOTEMPTY.
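
To illustrate the difference between the two checks, here is a toy model in Python. It is not the rgw_rados.cc code: index entries are represented as hypothetical (namespace, name) tuples, where an empty namespace stands for a user-visible object, and the function and parameter names are made up for this sketch.

```python
import errno


def check_bucket_empty(index_entries, count_hidden_entries):
    """Return 0 if the bucket may be deleted, -ENOTEMPTY otherwise.

    `index_entries` is a list of (namespace, name) tuples standing in for
    bucket index entries; namespace "" means a user-visible object.
    With count_hidden_entries=False only user-visible objects block the
    deletion; with True any entry at all blocks it.
    """
    for namespace, _name in index_entries:
        if namespace == "" or count_hidden_entries:
            return -errno.ENOTEMPTY
    return 0


# A bucket holding only parts of an unfinished multipart upload.
entries = [("multipart", "obj.upload-id.1"), ("multipart", "obj.upload-id.2")]
print(check_bucket_empty(entries, count_hidden_entries=False))  # 0: deletion allowed, parts leak
print(check_bucket_empty(entries, count_hidden_entries=True) == -errno.ENOTEMPTY)  # True
```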

#2 Updated by Josh Durgin about 2 years ago

  • Project changed from Ceph to rgw
  • Category deleted (22)

#3 Updated by Matt Benjamin about 2 years ago

  • Status changed from New to In Progress
  • Assignee set to Orit Wasserman

#4 Updated by Praveen Kumar G T about 2 years ago

I have submitted a fix for this problem for review.

https://github.com/ceph/ceph/pull/10920

#5 Updated by Orit Wasserman about 2 years ago

  • Assignee changed from Orit Wasserman to Matt Benjamin

#6 Updated by Abhishek Varshney over 1 year ago

Reviving this old thread started by a colleague. The PR submitted earlier [1] failed the teuthology rgw run because radosgw-admin consistently failed to remove a user with the --purge-data flag. I root-caused the issue: incomplete multiparts need to be aborted when doing bucket rm with --purge-data. Here is the new PR (https://github.com/ceph/ceph/pull/15630), which handles incomplete multiparts with the behaviour given below:

  • radosgw-admin user/bucket rm with incomplete multiparts returns a "bucket not empty" error.
  • radosgw-admin user/bucket rm --purge-data with incomplete multiparts aborts the pending multiparts and deletes the bucket.
  • The S3 delete bucket API with incomplete multiparts returns a "bucket not empty" error. The user is expected to either complete or cancel all pending multipart uploads before deleting the bucket.
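
The three cases above can be summarised as a small decision function. This is only a sketch of the described behaviour for buckets that have incomplete multiparts; the function name and the returned strings are hypothetical and do not come from the PR.

```python
def outcome_with_incomplete_multiparts(interface, purge_data=False):
    """Describe what deleting a bucket with incomplete multiparts does.

    `interface` is "radosgw-admin" (user/bucket rm) or "s3" (delete
    bucket API). `purge_data` models the --purge-data flag and is only
    meaningful for radosgw-admin.
    """
    if interface not in ("radosgw-admin", "s3"):
        raise ValueError("unknown interface: %r" % (interface,))
    if interface == "radosgw-admin" and purge_data:
        return "abort pending multiparts, then delete bucket"
    return "bucket not empty error"


for iface, purge in (("radosgw-admin", False), ("radosgw-admin", True), ("s3", False)):
    print(iface, purge, "->", outcome_with_incomplete_multiparts(iface, purge))
```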

Requesting review on this PR.

PS: The check for an empty bucket index here [2] in the previous PR [1] has been removed, as we found instances of inconsistent bucket indexes with stale entries and no corresponding objects in the data pool. The check would have prevented the deletion of an empty bucket with such an inconsistent index. I am not sure how to reproduce such a scenario, though.

[1] https://github.com/ceph/ceph/pull/10920
[2] https://github.com/ceph/ceph/pull/10920/files#diff-c30965955342b98393b73be699f4e355R7349

#7 Updated by Pavan Rallabhandi over 1 year ago

@Abhishek Varshney Do you have details on how you ended up with leaked/stale bucket index entries for deleted objects (not necessarily with S3)?

#8 Updated by Matt Benjamin over 1 year ago

  • Status changed from In Progress to Testing

#9 Updated by Matt Benjamin over 1 year ago

  • Assignee changed from Matt Benjamin to Casey Bodley

(ignore this one)

#10 Updated by Abhishek Varshney over 1 year ago

@Pavan : I am not sure why this happens. Once in a while, we see buckets with stale index entries. This mostly happens with multipart uploads. I would update this thread if I am able to reproduce and root-cause it. Do let me know if you get insights on the same.

#11 Updated by Pavan Rallabhandi over 1 year ago

Abhishek Varshney wrote:

@Pavan : I am not sure why this happens. Once in a while, we see buckets with stale index entries. This mostly happens with multipart uploads. I would update this thread if I am able to reproduce and root-cause it. Do let me know if you get insights on the same.

In case you didn't see this http://tracker.ceph.com/issues/20380

#12 Updated by Kefu Chai about 1 year ago

  • Status changed from Testing to Pending Backport
  • Assignee deleted (Casey Bodley)
  • Backport set to jewel

#13 Updated by Nathan Cutler about 1 year ago

  • Copied to Backport #21347: jewel: Orphan data gets leaked on Bucket deletion added

#15 Updated by Nathan Cutler about 1 month ago

  • Status changed from Pending Backport to Resolved
