Bug #16712
Status: Closed
multisite: 400-error with certain complete multipart upload requests
Description
Sometimes a complete multipart upload request returns a 400 error complaining of incomplete parts, even though the client received a 200 status for every part uploaded to a remote site. For example, in a multipart upload with 7 parts, with each of the 7 parts returning a 200 success, the complete_multipart request still returns a 400. The rados objects on disk look like the following when (and after) the complete multipart is issued:
$ ./bin/rados -c ./run/c3/ceph.conf -p us-1.rgw.buckets.non-ec listomapkeys 3d3b1eaa-83d7-4670-ac72-cb3f29aae547.4237.4__multipart_object_4.2~pmDIYYCpr6S1QptPFjKxfwllVFBy3AD.meta
part.00000002
part.00000003
part.00000004
part.00000005
part.00000006
part.00000007
$ ./bin/rados -c ./run/c3/ceph.conf -p us-3.rgw.buckets.non-ec listomapkeys 3d3b1eaa-83d7-4670-ac72-cb3f29aae547.4237.4__multipart_object_4.2~pmDIYYCpr6S1QptPFjKxfwllVFBy3AD.meta
part.00000001
The complete multipart upload only finds 6 parts because the bucket's non-ec pool in the third zone only sees parts 2-7 and doesn't see part 1.
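The failure mode can be sketched as follows. This is a hypothetical Python model of the completeness check, not the actual RGW C++ code: CompleteMultipartUpload compares the part numbers the client sent against the `part.NNNNNNNN` omap keys listed from the upload's .meta object, and any requested part missing from that listing produces the 400. The function name and structure here are illustrative assumptions.

```python
# Hypothetical sketch of the completeness check behind the 400 error.
# CompleteMultipartUpload lists the 'part.NNNNNNNN' omap keys on the
# upload's .meta object and rejects the request if any part the client
# named is absent from that listing.

def check_parts_complete(requested_parts, listed_omap_keys):
    """Return the set of requested part numbers with no matching
    'part.NNNNNNNN' omap key; an empty set means the upload can complete."""
    found = {int(key.split(".")[1]) for key in listed_omap_keys
             if key.startswith("part.")}
    return set(requested_parts) - found

# Keys as seen above in the misplaced us-1.rgw.buckets.non-ec pool:
keys = ["part.%08d" % n for n in range(2, 8)]   # part.00000002 .. part.00000007
missing = check_parts_complete(range(1, 8), keys)
print(missing)  # part 1 lives in the other pool, so it looks missing -> 400
```

Because part 1's omap key sits in a different pool than parts 2-7, no single listing ever sees all 7 parts, so the check can never pass.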
Updated by Abhishek Lekshmanan almost 8 years ago
- Subject changed from mulisite: issue with certain multipart uploads to mulisite: 400-error with certain complete multipart upload requests
Updated by Abhishek Lekshmanan almost 8 years ago
- Subject changed from mulisite: 400-error with certain complete multipart upload requests to multisite: 400-error with certain complete multipart upload requests
Updated by Casey Bodley almost 8 years ago
Abhishek Lekshmanan wrote:
The complete multipart upload only finds 6 parts because the bucket's non-ec pool in the third zone only sees parts 2-7 and doesn't see part 1.
Presumably, us-3.rgw.buckets.non-ec is the correct pool for zone us-3 to be writing its parts to. The us-1.rgw.buckets.non-ec pool should belong to zone us-1 and only exist within cluster c1, right? So it looks like part1 is placed correctly, while the rest are not. Would it be possible to provide a log for the gateway in question?
Updated by Abhishek Lekshmanan almost 8 years ago
uploaded via ceph-post-file at fbb3cff0-2dcb-42c3-ba12-68e18472415c
Updated by Abhishek Lekshmanan almost 8 years ago
Looking at the rgw.meta pool object for the corresponding bucket, we actually store the data_extra_pool as the primary zone's data_extra pool; it's not clear how or by whom this value was changed.
Updated by Abhishek Lekshmanan almost 8 years ago
Also, the bucket instance at this point (in the domain root pool) likewise lists the data extra pool as us-1.rgw.buckets.non-ec (i.e. the primary zone's non-ec pool).
Updated by Abhishek Lekshmanan almost 8 years ago
- Status changed from New to In Progress
- Assignee set to Abhishek Lekshmanan
I'm guessing this happens when we update the bucket instance: we don't copy over the source data_extra_pool, and this causes the issue.
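The suspected bug and its fix can be sketched like this. This is a simplified, hypothetical Python model, not the RGWBucketInfo code path: when the bucket instance metadata is rewritten in the secondary zone, the data_extra_pool field is not carried over from the source instance, so it ends up holding the primary zone's pool name. The field and function names are illustrative assumptions.

```python
# Hypothetical model of the bucket instance update. The buggy path drops
# the source data_extra_pool and substitutes the local zone default; the
# fix is to copy the field from the source bucket instance info.

def update_bucket_instance(source_info, local_zone_defaults, copy_extra_pool):
    """Rebuild bucket instance info during a metadata update."""
    updated = dict(source_info)
    if copy_extra_pool:
        # Fixed behavior: preserve the source zone's data_extra_pool.
        updated["data_extra_pool"] = source_info["data_extra_pool"]
    else:
        # Buggy behavior: fall back to the local (primary) zone's pool.
        updated["data_extra_pool"] = local_zone_defaults["data_extra_pool"]
    return updated

src = {"bucket": "object_4", "data_extra_pool": "us-3.rgw.buckets.non-ec"}
defaults = {"data_extra_pool": "us-1.rgw.buckets.non-ec"}

print(update_bucket_instance(src, defaults, False)["data_extra_pool"])
# buggy path: us-1.rgw.buckets.non-ec (primary zone's pool, as observed)
print(update_bucket_instance(src, defaults, True)["data_extra_pool"])
# fixed path: us-3.rgw.buckets.non-ec (the zone's own pool)
```

This matches the on-disk observation above: the .meta object's part omap keys landed in us-1.rgw.buckets.non-ec inside the us-3 cluster because the bucket instance pointed at the wrong data_extra_pool.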
Updated by Abhishek Lekshmanan almost 8 years ago
master pr: https://github.com/ceph/ceph/pull/10397
Updated by Abhishek Lekshmanan almost 8 years ago
- Copied to Backport #16778: multisite: 400-error with certain complete multipart upload requests added
Updated by Abhishek Lekshmanan over 7 years ago
- Status changed from In Progress to Pending Backport
Updated by Loïc Dachary over 7 years ago
- Status changed from Pending Backport to Resolved