Bug #40587

closed

rgw: fix drain handles error when deleting bucket with bypass-gc option

Added by dongdong tao almost 5 years ago. Updated over 4 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version:
% Done: 0%
Source:
Tags:
Backport: nautilus,mimic,luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When we use the command "radosgw-admin bucket rm --bucket=<bucket name> --bypass-gc --purge-objects" to delete a very large bucket, it can take a very long time to finish because it bypasses the gc. The deletion process may therefore be terminated before it completes for some reason (e.g. a machine reboot, or an administrator killing it).

That is highly likely to leave a partially deleted rgw object (only some of its shadow objects were deleted, and the head object is still there).

The next time the administrator runs "radosgw-admin bucket rm --bucket=<bucket name> --bypass-gc --purge-objects" to delete the rest of the objects in this bucket, the process will error out as shown below:
2019-06-28 11:55:23.587429 7f9ab8265e40 -1 ERROR: could not drain handles as aio completion returned with -2
2019-06-28 11:55:23.588152 7f9ab8265e40 -1 ERROR: unable to remove bucket(2) No such file or director

This happens because radosgw-admin will try to delete the partially deleted rgw object left over from last time.
When it does that, it will try to delete all of its shadow objects from the beginning again.
Since the leading shadow objects were already deleted last time, it will certainly get -ENOENT when it calls drain_handles, which makes the whole process exit.

The above is the scenario we hit, which makes bypass-gc deletion of a large bucket not work at all.

Even in general, I think we should skip the -ENOENT return value here, because "-ENOENT" just means the object is already deleted.
The function rgw_remove_bucket_bypass_gc is only doing deletion, so there should be no difference in how rgw_remove_bucket_bypass_gc treats the return values "0" and "-ENOENT".
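
The proposal above can be illustrated with a minimal, standalone sketch. This is not the actual Ceph code: the function names drain_results_strict and drain_results_tolerant and the use of a plain vector of return codes are assumptions made for illustration, standing in for the return values of the outstanding aio completions. The -2 in the log above is -ENOENT on Linux.

```cpp
#include <cassert>
#include <cerrno>
#include <vector>

// Hypothetical stand-in for draining aio completions: each element is the
// return code of one asynchronous delete. Not the real rgw implementation.

// Strict variant: any negative return code aborts the drain, so a shadow
// object that was already removed by an earlier, interrupted run
// (-ENOENT) makes the whole bucket removal fail.
int drain_results_strict(const std::vector<int>& results) {
  for (int r : results) {
    if (r < 0) {
      return r;
    }
  }
  return 0;
}

// Tolerant variant: -ENOENT is treated like 0, because for a delete it
// only means the object is already gone. Other errors are still reported.
int drain_results_tolerant(const std::vector<int>& results) {
  for (int r : results) {
    if (r < 0 && r != -ENOENT) {
      return r;
    }
  }
  return 0;
}
```

With return codes {0, -ENOENT, 0}, as a resumed deletion would produce, the strict variant returns -ENOENT while the tolerant one returns 0. A genuine failure such as -EIO is still returned by both.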


Related issues 3 (0 open, 3 closed)

Copied to rgw - Backport #41109: nautilus: rgw: fix drain handles error when deleting bucket with bypass-gc option (Resolved, Nathan Cutler)
Copied to rgw - Backport #41110: mimic: rgw: fix drain handles error when deleting bucket with bypass-gc option (Resolved, Prashant D)
Copied to rgw - Backport #41111: luminous: rgw: fix drain handles error when deleting bucket with bypass-gc option (Resolved)
Actions #1

Updated by dongdong tao almost 5 years ago

I will submit a PR soon

Actions #3

Updated by Casey Bodley almost 5 years ago

  • Status changed from New to 7
Actions #4

Updated by J. Eric Ivancich over 4 years ago

  • Status changed from 7 to Pending Backport
  • Target version set to v15.0.0
  • Backport set to nautilus,mimic,luminous
  • Pull request ID set to 28789
Actions #5

Updated by Patrick Donnelly over 4 years ago

  • Copied to Backport #41109: nautilus: rgw: fix drain handles error when deleting bucket with bypass-gc option added
Actions #6

Updated by Patrick Donnelly over 4 years ago

  • Copied to Backport #41110: mimic: rgw: fix drain handles error when deleting bucket with bypass-gc option added
Actions #7

Updated by Patrick Donnelly over 4 years ago

  • Copied to Backport #41111: luminous: rgw: fix drain handles error when deleting bucket with bypass-gc option added
Actions #8

Updated by Nathan Cutler over 4 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
