Bug #43583
rgw: unable to abort multipart upload after the bucket got resharded
Description
Resharding mishandles multipart entries.
For all multipart entries of an upload, the hash source should be the object name, so that they are all
placed in the same bucket index shard object.
Right now the code calculates the shard id from each entry's own name, which is wrong.
As a result, the bucket may be unable to abort the multipart upload, and stale multipart entries are left behind.
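The effect described above can be illustrated with a minimal sketch. It is not RGW code: CRC32 stands in for the real hash function, and the shard count, object name, upload id, and the naming of the part entries are all hypothetical. Only the `_multipart_<object>.<upload id>.meta` shape is taken from the log output later in this issue.

```python
import zlib
from typing import List

NUM_SHARDS = 8  # hypothetical shard count after resharding


def shard_for(hash_source: str, num_shards: int = NUM_SHARDS) -> int:
    """Map a hash source to a shard id (CRC32 is a stand-in, not RGW's hash)."""
    return zlib.crc32(hash_source.encode("utf-8")) % num_shards


def multipart_entries(obj_name: str, upload_id: str, parts: int) -> List[str]:
    """Build illustrative index entry names for one upload (part naming is assumed)."""
    base = f"_multipart_{obj_name}.{upload_id}"
    return [f"{base}.meta"] + [f"{base}.{n}" for n in range(1, parts + 1)]


obj_name = "backup/disk.img"  # hypothetical object name
entries = multipart_entries(obj_name, "2~exampleUploadId", parts=16)

# Behaviour reported in this bug: each entry is hashed by its own name,
# so the entries of one upload can end up on different shards.
by_entry_name = {shard_for(entry) for entry in entries}

# Intended behaviour: every entry is hashed by the object name,
# so all entries of the upload share one shard.
by_object_name = {shard_for(obj_name) for _ in entries}

print("shards when hashing entry names:", sorted(by_entry_name))
print("shards when hashing object name:", sorted(by_object_name))
```

With the object name as the hash source, the set of shards always has exactly one member, which is what an abort needs in order to find every entry of the upload in one place.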
Updated by dongdong tao over 4 years ago
Updated by Casey Bodley over 4 years ago
- Priority changed from Normal to High
- Tags set to reshard multipart
- Backport set to nautilus
Updated by dongdong tao over 4 years ago
@Casey, will this also be backported to luminous?
Is there any plan to include it in 12.2.13?
Updated by J. Eric Ivancich over 4 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 32617
Updated by J. Eric Ivancich over 4 years ago
- Backport changed from nautilus to nautilus,mimic
Updated by Casey Bodley about 4 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler about 4 years ago
- Copied to Backport #43846: nautilus: rgw: unable to abort multipart upload after the bucket got resharded added
Updated by Nathan Cutler about 4 years ago
- Copied to Backport #43847: mimic: rgw: unable to abort multipart upload after the bucket got resharded added
Updated by Casey Bodley about 4 years ago
- Related to Bug #43756: An error occurred (NoSuchUpload) when calling the AbortMultipartUpload operation: Unknown added
Updated by Nathan Cutler about 4 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
Updated by Manuel Rios about 4 years ago
Today we updated the cluster to 14.2.8, which includes this backport.
LC now logs more information, but it also shows these new errors and is still unable to abort the uploads.
2020-03-03 18:13:19.361 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~5nOv_6K_GZVwAJNqmEZ RrmE4lMs_-91.meta
2020-03-03 18:13:19.361 7fb58bcfb6c0 20 obj_has_expired(): mtime=2019-03-16 00:53:58.0.940346s days=1 base_time=2020-03-03 00:00:00.000000 timediff=3.0496e+07 cmp=86400
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 abort_multipart_upload: list_multipart_parts returned -2
2020-03-03 18:13:19.362 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~67RyQVXdhT-g3Jp1V88 cNHCkv6ly_tt.meta
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 obj_has_expired(): mtime=2019-03-16 00:18:28.0.7263s days=1 base_time=2020-03-03 00:00:00.000000 timediff=3.04981e+07 cmp=86400
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 abort_multipart_upload: list_multipart_parts returned -2
2020-03-03 18:13:19.362 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~EJaUoXHzAJikdRspX1H bpopE1ZbdCih.meta
2020-03-03 18:13:19.362 7fb58bcfb6c0 20 obj_has_expired(): mtime=2019-03-16 01:41:18.0.929875s days=1 base_time=2020-03-03 00:00:00.000000 timediff=3.04931e+07 cmp=86400
2020-03-03 18:13:19.363 7fb58bcfb6c0 20 abort_multipart_upload: list_multipart_parts returned -2
2020-03-03 18:13:19.363 7fb58bcfb6c0 5 lifecycle: ERROR: abort_multipart_upload failed, ret=-2009, meta:_multipart_MBS-0fc78b70-efa6-49ef-bdd2-fd3a4b4f2c84/CBB_BIM-AUTOLOG/CBB_DiskImage/Disk_00000000-0000-0000-0000-000000000000/Volume_NTFS_00000000-0000-0000-0000-000000000001$/20190315230135/160.cbrevision.2~IZ9dHhPZHmJivZSDqvG kJILjn_tDFZP.meta
Lifecycle configuration applied:
<?xml version="1.0" ?>
<LifecycleConfiguration xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
<Rule>
<ID>Incomplete Multipart Uploads</ID>
<Prefix/>
<Status>Enabled</Status>
<AbortIncompleteMultipartUpload>
<DaysAfterInitiation>1</DaysAfterInitiation>
</AbortIncompleteMultipartUpload>
</Rule>
</LifecycleConfiguration>
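For reference, the `obj_has_expired()` lines in the log can be reproduced with a small sketch. It assumes that `base_time` is the current time rounded down to midnight, that `timediff` is `base_time - mtime` in seconds, and that an upload counts as expired when `timediff >= cmp`, with `cmp` being `DaysAfterInitiation` in seconds. These assumptions are inferred from the logged fields, not from the RGW source, and the function names are hypothetical.

```python
from datetime import datetime

SECONDS_PER_DAY = 24 * 60 * 60


def age_at_midnight(mtime: datetime, now: datetime) -> float:
    """Seconds between mtime and the most recent midnight before `now`."""
    base_time = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return (base_time - mtime).total_seconds()


def has_expired(mtime: datetime, now: datetime, days: int) -> bool:
    """Assumed rule: expired once the age reaches `days` whole days."""
    cmp = days * SECONDS_PER_DAY
    return age_at_midnight(mtime, now) >= cmp


# Values taken from the first obj_has_expired() log line (fractional seconds dropped).
mtime = datetime(2019, 3, 16, 0, 53, 58)
now = datetime(2020, 3, 3, 18, 13, 19)

timediff = age_at_midnight(mtime, now)
print(f"timediff={timediff:.6g} cmp={1 * SECONDS_PER_DAY}")
print("expired:", has_expired(mtime, now, days=1))
```

Under these assumptions the upload is roughly 353 days old against a one-day threshold, so the rule selects it for abort; the failure happens afterwards, inside the abort itself.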
Updated by dongdong tao about 4 years ago
@Manuel Rios
Your log shows list_multipart_parts returned -2, which means the .meta object in the non-EC pool has most likely already been deleted.
Please note that this fix won't let you abort multipart uploads that already failed to abort earlier, because the failed abort already deleted the .meta object.
You'll have to clean up those old failed aborts manually.
The fix ensures that new, partially completed multipart uploads abort successfully.
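To see which uploads need manual cleanup, the meta names can be collected from the lifecycle log. The sketch below is only a log-parsing helper that assumes the line format shown in the comment above. The sample lines use a shortened, hypothetical meta name, and the function name is made up for illustration. It does not delete anything.

```python
import errno
import re
from typing import Iterable, List, Tuple

# Matches the "abort_multipart_upload failed" lines as they appear in the log above.
ABORT_FAILED = re.compile(
    r"abort_multipart_upload failed, ret=(?P<ret>-?\d+), meta:(?P<meta>.+)$"
)


def failed_aborts(lines: Iterable[str]) -> List[Tuple[int, str]]:
    """Return (ret, meta name) for every failed abort found in the log lines."""
    found = []
    for line in lines:
        match = ABORT_FAILED.search(line.rstrip("\n"))
        if match:
            found.append((int(match.group("ret")), match.group("meta")))
    return found


# Shortened sample lines; the meta name is hypothetical.
sample = [
    "2020-03-03 18:13:19.362 7fb58bcfb6c0 20 abort_multipart_upload: "
    "list_multipart_parts returned -2",
    "2020-03-03 18:13:19.362 7fb58bcfb6c0 5 lifecycle: ERROR: "
    "abort_multipart_upload failed, ret=-2009, "
    "meta:_multipart_example/object.2~exampleUploadId.meta",
]

for ret, meta in failed_aborts(sample):
    print(ret, meta)

# The -2 from list_multipart_parts is a negated errno value.
print(errno.errorcode[2])  # ENOENT: the .meta object was not found
```

The `ret=-2009` value is an RGW-internal code and is left undecoded here; the `-2` is the standard "no such file or directory" errno, which matches the explanation that the .meta object is already gone.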