Bug #43583
closed
rgw: unable to abort multipart upload after the bucket got resharded
Added by dongdong tao over 4 years ago.
Updated about 4 years ago.
Description
There is a bug during resharding of multipart entries.
For all entries belonging to one multipart upload, the hash source should be the source object's name, so that all of those entries land on the same bucket index shard object.
Right now the code calculates the shard id from each entry's own name, which is wrong.
This can leave the bucket unable to abort the multipart upload, with stale multipart entries left behind.
I will open a pull request soon
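A minimal sketch of the distribution problem described above (hypothetical names; RGW uses its own string hash and entry formats, this only illustrates the idea):

```python
import hashlib

NUM_SHARDS = 16

def shard_for(hash_source: str, num_shards: int = NUM_SHARDS) -> int:
    # Stand-in for bucket index shard selection: hash a string and
    # reduce it modulo the shard count.  (RGW does not use MD5 here;
    # this only models "shard = hash(source) % num_shards".)
    h = int.from_bytes(hashlib.md5(hash_source.encode()).digest()[:4], "little")
    return h % num_shards

# Hypothetical index entries for one multipart upload of "movie.mp4":
entries = [
    "_multipart_movie.mp4.2~abc123.1",
    "_multipart_movie.mp4.2~abc123.2",
    "_multipart_movie.mp4.2~abc123.meta",
]

# Buggy behavior: hashing each entry's full name can scatter the
# entries of one upload across different shards after a reshard.
buggy_shards = {shard_for(e) for e in entries}

# Fixed behavior: hashing the source object name keeps every entry of
# the upload on one shard, so abort can find all of them.
fixed_shards = {shard_for("movie.mp4") for _ in entries}
assert len(fixed_shards) == 1
```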
- Priority changed from Normal to High
- Tags set to reshard multipart
- Backport set to nautilus
@Casey, will this also be backported to luminous?
May I know whether there is any plan for 12.2.13?
- Status changed from New to Fix Under Review
- Pull request ID set to 32617
- Backport changed from nautilus to nautilus,mimic
- Assignee set to J. Eric Ivancich
- Status changed from Fix Under Review to Pending Backport
- Copied to Backport #43846: nautilus: rgw: unable to abort multipart upload after the bucket got resharded added
- Copied to Backport #43847: mimic: rgw: unable to abort multipart upload after the bucket got resharded added
- Related to Bug #43756: An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown added
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
We updated the cluster today to 14.2.8, which includes this backport.
LC now shows more information, but also these new errors, and it is still unable to abort:
2020-03-03 18:13:19.361 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~5nOv_6K_GZVwAJNqmEZ RrmE4lMs_-91.meta
2020-03-03 18:13:19.361 7fb58bcfb6c0 20 obj_has_expired(): mtime=2019-03-16 00:53:58.0.940346s days=1 base_time=2020-03-03 00:00:00.000000 timediff=3.0496e+07 cmp=86400
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 abort_multipart_upload: list_multipart_parts returned -2
2020-03-03 18:13:19.362 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~67RyQVXdhT-g3Jp1V88 cNHCkv6ly_tt.meta
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 obj_has_expired(): mtime=2019-03-16 00:18:28.0.7263s days=1 base_time=2020-03-03 00:00:00.000000 timediff=3.04981e+07 cmp=86400
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 abort_multipart_upload: list_multipart_parts returned -2
2020-03-03 18:13:19.362 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~EJaUoXHzAJikdRspX1H bpopE1ZbdCih.meta
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 obj_has_expired(): mtime=2019-03-16 01:41:18.0.929875s days=1 base_time=2020-03-03 00:00:00.000000 timediff=3.04931e+07 cmp=86400
2020-03-03 18:13:19.363 7fb58bcfb6c0 20 abort_multipart_upload: list_multipart_parts returned -2
2020-03-03 18:13:19.363 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~IZ9dHhPZHmJivZSDqvG kJILjn_tDFZP.meta
Lifecycle applied.
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Rule>
    <ID>Incomplete Multipart Uploads</ID>
    <Prefix/>
    <Status>Enabled</Status>
    <AbortIncompleteMultipartUpload>
      <DaysAfterInitiation>1</DaysAfterInitiation>
    </AbortIncompleteMultipartUpload>
  </Rule>
</LifecycleConfiguration>
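For reference, the XML rule above corresponds to this boto3 lifecycle structure (bucket name and endpoint are hypothetical; the client call needs a reachable RGW, so it is shown commented out):

```python
# boto3 equivalent of the XML lifecycle rule above.
lifecycle = {
    "Rules": [{
        "ID": "Incomplete Multipart Uploads",
        "Filter": {"Prefix": ""},
        "Status": "Enabled",
        "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 1},
    }]
}

# To apply it (hypothetical endpoint/bucket):
# import boto3
# s3 = boto3.client("s3", endpoint_url="http://rgw.example.com:7480")
# s3.put_bucket_lifecycle_configuration(Bucket="my-bucket",
#                                       LifecycleConfiguration=lifecycle)
```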
@Manuel Rios
You have list_multipart_parts returning -2, which means the .meta object in the non-EC pool has already been deleted.
Please note that this fix won't let you abort those multipart uploads that already failed to abort before (because the failed abort already deleted the .meta object).
Those old failed multipart aborts will have to be cleared manually.
This fix ensures that newly started, partially completed multipart uploads can be aborted successfully.
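One way to sweep stale uploads through the plain S3 API is sketched below (a hedged helper, not part of this fix; `abort_stale_uploads` is a hypothetical name, and uploads whose .meta is already gone will still fail server-side and are just reported):

```python
from datetime import datetime, timezone, timedelta

def abort_stale_uploads(s3, bucket, older_than_days=1):
    """Abort multipart uploads initiated more than `older_than_days` ago.

    `s3` is a boto3 S3 client (e.g. boto3.client("s3", endpoint_url=...)).
    Note: list_multipart_uploads is paginated; this sketch handles only
    the first page for brevity.
    """
    cutoff = datetime.now(timezone.utc) - timedelta(days=older_than_days)
    aborted, failed = [], []
    resp = s3.list_multipart_uploads(Bucket=bucket)
    for up in resp.get("Uploads", []):
        if up["Initiated"] < cutoff:
            try:
                s3.abort_multipart_upload(Bucket=bucket, Key=up["Key"],
                                          UploadId=up["UploadId"])
                aborted.append(up["Key"])
            except Exception as e:  # e.g. NoSuchUpload for old stale entries
                failed.append((up["Key"], str(e)))
    return aborted, failed
```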