Bug #63153

open

Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch when Checksum is requested

Added by Casey Bodley 7 months ago. Updated 3 months ago.

Status: Pending Backport
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags: sigv4 checksums backport_processed
Backport: reef,quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

reported in https://github.com/ceph/ceph/pull/49986#issuecomment-1752997800:

This is the command I used to fuse mount a bucket:

mount-s3 --foreground --debug --endpoint-url https://s3.example.com --force-path-style somebuckettotest S3MOUNT

XAmzContentSHA256Mismatch (aka ERR_AMZ_CONTENT_SHA256_MISMATCH) indicates a mismatch between the checksum provided in the x-amz-content-sha256 header and the one calculated by AWSv4ComplSingle::complete() or AWSv4ComplMulti::complete()


Related issues 4 (3 open, 1 closed)

Related to rgw - Bug #63951: rgw: implement S3 additional checksums (Fix Under Review, Matt Benjamin)
Has duplicate rgw - Bug #64090: RGW S3 signing regression (Duplicate)
Copied to rgw - Backport #64465: reef: Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch when Checksum is requested (New, Matt Benjamin)
Copied to rgw - Backport #64466: quincy: Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch when Checksum is requested (New, Matt Benjamin)

Actions #1

Updated by Christian Rohmann 6 months ago

An issue has now come up with the Terraform S3 backend not being compatible with "alternative" S3 implementations / services since it switched to AWS Go SDK v2 - https://github.com/hashicorp/terraform/issues/34053

Maybe(tm) the reasons could be related to this issue here?

Actions #2

Updated by Christian Rohmann 6 months ago

Christian Rohmann wrote:

An issue has now come up with the Terraform S3 backend not being compatible with "alternative" S3 implementations / services since it switched to AWS Go SDK v2 - https://github.com/hashicorp/terraform/issues/34053

Maybe(tm) the reasons could be related to this issue here? See comment https://github.com/hashicorp/terraform/issues/34053#issuecomment-1776014523 for some thoughts on why this is...

Actions #3

Updated by Casey Bodley 6 months ago

reproduced against a vstart cluster (required adding rgw dns name = localhost for vhost access):

$ s3cmd mb s3://testbucket
Bucket 's3://testbucket/' created
$ AWS_ACCESS_KEY_ID=0555b35654ad1656d804 AWS_SECRET_ACCESS_KEY=h7GhxuBLTrlhVUyxSPUKUV8r/2EI4ngqJxD7iBdBYLhwluN30JaT3Q== mount-s3 --endpoint-url http://localhost:8000 testbucket mnt
$ cp 20m.iso mnt/
cp: failed to close 'mnt/20m.iso': Input/output error

this initiates a multipart upload. the part uploads have the following headers/environment:

2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 CONTENT_LENGTH=4194351
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_ACCEPT=application/xml
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_AUTHORIZATION=AWS4-HMAC-SHA256 Credential=0555b35654ad1656d804/20231031/us-east-1/s3/aws4_request, SignedHeaders=accept;content-encoding;content-length;host;x-amz-content-sha256;x-amz-date;x-amz-decoded-content-length;x-amz-trailer, Signature=f43986e1fee1401536ba539a2e6fb6bd12fdcce7c04d6e5a491174fc4b9e56f8
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_CONTENT_ENCODING=aws-chunked
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_HOST=testbucket.localhost:8000
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_USER_AGENT=mountpoint-s3/1.1.0 mountpoint-s3-client/0.4.0-15bec26 CRTS3NativeClient/0.1.x
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_VERSION=1.1
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_X_AMZ_CONTENT_SHA256=STREAMING-UNSIGNED-PAYLOAD-TRAILER
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_X_AMZ_DATE=20231031T122243Z
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_X_AMZ_DECODED_CONTENT_LENGTH=4194304
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 HTTP_X_AMZ_TRAILER=x-amz-checksum-crc32c
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 QUERY_STRING=partNumber=3&uploadId=2~Wm6qKs5t_sE-UoOClwP9s6JgCJ7ivmL
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 REMOTE_ADDR=::1
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 REQUEST_METHOD=PUT
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 REQUEST_URI=/20m.iso?partNumber=3&uploadId=2~Wm6qKs5t_sE-UoOClwP9s6JgCJ7ivmL
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 SCRIPT_URI=/20m.iso
2023-10-31T08:22:43.565-0400 7fa482b1b6c0 20 SERVER_PORT=8000
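
The two lengths logged above differ by 47 bytes (CONTENT_LENGTH=4194351 versus HTTP_X_AMZ_DECODED_CONTENT_LENGTH=4194304). That overhead is consistent with an unsigned aws-chunked body that carries the whole part in one chunk, followed by a single x-amz-checksum-crc32c trailer. A minimal Python sketch of the arithmetic is below. The single-chunk framing is an assumption inferred from the numbers, not something the log states, and aws_chunked_length is a hypothetical helper:

```python
import base64


def aws_chunked_length(decoded_len: int, trailer_name: str, checksum_len: int) -> int:
    """Wire length of an unsigned aws-chunked body.

    Assumes exactly one data chunk and one trailing checksum header
    (an assumption made for illustration, inferred from the logged lengths).
    """
    crlf = 2
    chunk_header = len(format(decoded_len, "x")) + crlf   # e.g. "400000\r\n"
    chunk_data = decoded_len + crlf                        # payload + "\r\n"
    final_chunk = len("0") + crlf                          # "0\r\n"
    # The checksum value is base64-encoded, so 4 raw bytes become 8 characters.
    checksum_b64 = len(base64.b64encode(bytes(checksum_len)))
    trailer = len(trailer_name) + len(":") + checksum_b64 + crlf
    terminator = crlf                                      # closing "\r\n"
    return chunk_header + chunk_data + final_chunk + trailer + terminator


total = aws_chunked_length(4194304, "x-amz-checksum-crc32c", 4)
print(total, total - 4194304)
```

With the values from the log this gives 4194351 in total, i.e. 47 bytes of framing, which matches the logged CONTENT_LENGTH.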

the part uploads go on to fail due to unsupported x-amz-content-sha256: STREAMING-UNSIGNED-PAYLOAD-TRAILER:

2023-10-31T08:22:43.576-0400 7fa4bd3906c0  5 req 2887115635985913471 0.037001312s s3:put_obj NOTICE: call to do_aws4_auth_completion
2023-10-31T08:22:43.576-0400 7fa4bd3906c0 10 ERROR: x-amz-content-sha256 does not match
2023-10-31T08:22:43.576-0400 7fa4bd3906c0 10 ERROR:   grab_aws4_sha256_hash()=bba6a17dd494df90686e25a6eb66caf5f4f43891057e13c4734d5624661c1e07
2023-10-31T08:22:43.576-0400 7fa4bd3906c0 10 ERROR:   expected_request_payload_hash=STREAMING-UNSIGNED-PAYLOAD-TRAILER
2023-10-31T08:22:43.576-0400 7fa4bd3906c0 20 req 2887115635985913471 0.037001312s s3:put_obj get_data() returned ret=-2040
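
The log lines above suggest why this fails: the value of x-amz-content-sha256 is treated as the expected payload hash and compared with the digest of the bytes actually received, so a sentinel string can never match. A rough Python model of that comparison follows. It is not rgw's code, and single_payload_check is a hypothetical name:

```python
import hashlib


def single_payload_check(header_value: str, body: bytes) -> bool:
    """Model of a single-payload check: sha256 of the body vs. the header value."""
    calculated = hashlib.sha256(body).hexdigest()
    return calculated == header_value


body = b"example part data"  # stand-in for the received request body

# A client that sends the real digest passes the check.
print(single_payload_check(hashlib.sha256(body).hexdigest(), body))

# A sentinel value is not a hex digest, so the comparison always fails.
print(single_payload_check("STREAMING-UNSIGNED-PAYLOAD-TRAILER", body))
```

In the failing request the received bytes are also still aws-chunked framed, so even the digest itself is taken over something other than the decoded payload.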

Actions #4

Updated by Casey Bodley 6 months ago

Christian Rohmann wrote:

Christian Rohmann wrote:

An issue has now come up with the Terraform S3 backend not being compatible with "alternative" S3 implementations / services since it switched to AWS Go SDK v2 - https://github.com/hashicorp/terraform/issues/34053

Maybe(tm) the reasons could be related to this issue here? See comment https://github.com/hashicorp/terraform/issues/34053#issuecomment-1776014523 for some thoughts on why this is...

thanks, that's helpful. digging further, i found some details about the go sdk's implementation in https://github.com/aws/aws-sdk-go-v2/issues/1689

the pieces missing on rgw's end are:
Actions #5

Updated by Casey Bodley 6 months ago

  • Tags changed from sigv4 to sigv4 checksums
Actions #6

Updated by Christian Rohmann 6 months ago

With the Terraform issue arising from their switch to the current AWS Go SDK v2 (see https://github.com/hashicorp/terraform/issues/34053#issuecomment-1773253020), this will likely hit more and more tools making the same move.
Are there any plans on prioritizing the implementation of the missing check-summing feature(s) for RADOSGW and them being backported to Reef and Quincy at least?

Clouds and storage services providing object storage via RADOSGW will otherwise likely face more and more user support questions. It seems technically possible to work around this by explicitly disabling the check-summing (https://github.com/hashicorp/terraform/pull/34130), but that would have to be implemented in every tool using the AWS Go SDK v2, and users would first need to know about this requirement.

Actions #7

Updated by Christian Rohmann 6 months ago

Maybe it also makes sense to change the title of this issue to

"Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch" ?

Likely AWS will at some point make similar changes to other SDKs (Java, NodeJS, Python, .NET, ...), breaking even more tools talking to Ceph RGWs.

Actions #8

Updated by Casey Bodley 6 months ago

Christian Rohmann wrote:

Are there any plans on prioritizing the implementation of the missing check-summing feature(s) for RADOSGW and them being backported to Reef and Quincy at least?

this feature was discussed during squid release planning in https://pad.ceph.com/p/rgw-cds2023. i've created a trello card for it in https://trello.com/c/U9tXJJgZ/875-rgw-s3-object-integrity

we generally don't backport these kinds of features, but i'm open to discussing that once we have a working implementation

Actions #9

Updated by Casey Bodley 6 months ago

  • Subject changed from mountpoint-s3: uploads fail with XAmzContentSHA256Mismatch to Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch when Checksum is requested
Actions #10

Updated by Casey Bodley 6 months ago

Christian Rohmann wrote:

Maybe it also makes sense to change the title of this issue to

"Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch" ?

updated, with the caveat that the upload must opt-in to a Checksum algorithm

Actions #11

Updated by Matt Benjamin 6 months ago

  • Assignee set to Matt Benjamin
Actions #12

Updated by Casey Bodley 6 months ago

implementation notes:

accept STREAMING-UNSIGNED-PAYLOAD-TRAILER and STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER values for x-amz-content-sha256 header

https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-auth-using-authorization-header.html#w28986aab9c25b9c32

AWSGeneralAbstractor::get_auth_data_v4() in rgw_rest_s3.cc is what checks for these header values in exp_payload_hash to decide which subclass of rgw::auth::Completer to instantiate

we currently create AWSv4ComplMulti for STREAMING-AWS4-HMAC-SHA256-PAYLOAD, nullptr for UNSIGNED-PAYLOAD, and AWSv4ComplSingle when x-amz-content-sha256 contains an actual sha256 hash

we'll need to use AWSv4ComplMulti for these new STREAMING-*-TRAILER variants as well, though the STREAMING-UNSIGNED-PAYLOAD-TRAILER one would modify the signature calculations - that probably just means calling calc_chunk_signature("UNSIGNED-PAYLOAD") instead of calc_chunk_signature(payload_hash)?
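
As an illustration of the selection described above, the mapping from x-amz-content-sha256 to a completer could be modelled like this. This is a Python sketch for discussion only, not rgw's C++. choose_completer is a hypothetical name, the strings it returns merely name the classes mentioned above, and rejecting unrecognized values is an assumption:

```python
import re
from typing import Optional

HEX_SHA256 = re.compile(r"[0-9a-fA-F]{64}")

UNSIGNED = "UNSIGNED-PAYLOAD"
STREAMING_SIGNED = "STREAMING-AWS4-HMAC-SHA256-PAYLOAD"
# The two values that the notes above propose to accept in addition.
STREAMING_SIGNED_TRAILER = "STREAMING-AWS4-HMAC-SHA256-PAYLOAD-TRAILER"
STREAMING_UNSIGNED_TRAILER = "STREAMING-UNSIGNED-PAYLOAD-TRAILER"


def choose_completer(exp_payload_hash: str) -> Optional[str]:
    """Return the name of the completer to use, or None when no check is needed."""
    if exp_payload_hash == UNSIGNED:
        return None
    if exp_payload_hash in (
        STREAMING_SIGNED,
        STREAMING_SIGNED_TRAILER,
        STREAMING_UNSIGNED_TRAILER,
    ):
        return "AWSv4ComplMulti"
    if HEX_SHA256.fullmatch(exp_payload_hash):
        return "AWSv4ComplSingle"
    # Assumption: anything else is rejected rather than treated as a hash.
    raise ValueError(f"unsupported x-amz-content-sha256: {exp_payload_hash}")


def expects_trailer(exp_payload_hash: str) -> bool:
    """True for the STREAMING-*-TRAILER variants, which carry trailing headers."""
    return exp_payload_hash in (STREAMING_SIGNED_TRAILER, STREAMING_UNSIGNED_TRAILER)


print(choose_completer("STREAMING-UNSIGNED-PAYLOAD-TRAILER"))
print(expects_trailer("STREAMING-AWS4-HMAC-SHA256-PAYLOAD"))
```

The point of the sketch is only that the sentinel values need to be recognized before the value is interpreted as a payload hash.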

parse trailing headers for aws-chunked

https://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming-trailers.html

class AWSv4ComplMulti in rgw_auth_s3.* is responsible for this aws-chunked parsing. AWSv4ComplMulti::ChunkMeta::create_next() parses the chunk headers, and a 'chunk size' of 0 signifies the final chunk

when the x-amz-content-sha256 header uses one of the STREAMING-*-TRAILER variants, we should go on to parse the trailing headers and validate the x-amz-trailer-signature

in this mode, the x-amz-trailer request header will tell us at the beginning of the request which checksum algorithm to use (for example, it'll have a value like x-amz-checksum-crc32). then at the end, the trailing header will provide the actual checksum like x-amz-checksum-crc32:... which we can compare against the one we calculated

we could potentially implement the checksum calculations inside of AWSv4ComplMulti as it streams/parses the request body, but it's probably better to do that at a higher level that can be shared between the trailing- and non-trailing checksum modes
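
To make the trailing-checksum flow above concrete, here is a small Python sketch of it for the unsigned variant with x-amz-checksum-crc32. It assumes the framing "hex size, CRLF, data, CRLF", a zero-size final chunk, then trailer lines ended by an empty line. It tolerates a ";chunk-signature=..." extension on chunk headers but does not validate chunk signatures or x-amz-trailer-signature. parse_aws_chunked, crc32_b64 and verify_trailing_checksum are hypothetical names, not rgw functions:

```python
import base64
import zlib
from typing import Dict, Tuple


def parse_aws_chunked(body: bytes) -> Tuple[bytes, Dict[str, str]]:
    """Split an aws-chunked body into the decoded payload and its trailers."""
    payload = bytearray()
    pos = 0
    while True:
        eol = body.index(b"\r\n", pos)
        # Chunk header: "<hex size>", optionally followed by ";extension".
        size = int(body[pos:eol].split(b";", 1)[0], 16)
        pos = eol + 2
        if size == 0:
            break  # a chunk size of 0 marks the final chunk
        payload += body[pos:pos + size]
        pos += size
        if body[pos:pos + 2] != b"\r\n":
            raise ValueError("chunk data not terminated by CRLF")
        pos += 2
    trailers: Dict[str, str] = {}
    while True:
        eol = body.index(b"\r\n", pos)
        line = body[pos:eol]
        pos = eol + 2
        if not line:
            break  # an empty line ends the trailer section
        name, _, value = line.partition(b":")
        trailers[name.decode().strip().lower()] = value.decode().strip()
    return bytes(payload), trailers


def crc32_b64(data: bytes) -> str:
    """Base64 of the big-endian 32-bit CRC, as carried in x-amz-checksum-crc32."""
    return base64.b64encode(zlib.crc32(data).to_bytes(4, "big")).decode()


def verify_trailing_checksum(x_amz_trailer: str, body: bytes) -> bytes:
    """Check the trailer named by the x-amz-trailer request header."""
    if x_amz_trailer != "x-amz-checksum-crc32":
        raise NotImplementedError("only crc32 is sketched here")
    payload, trailers = parse_aws_chunked(body)
    expected = trailers.get(x_amz_trailer)
    if expected is None:
        raise ValueError("announced trailer is missing")
    if expected != crc32_b64(payload):
        raise ValueError("checksum mismatch")
    return payload


data = b"hello world"
body = (
    format(len(data), "x").encode() + b"\r\n" + data + b"\r\n"
    + b"0\r\n"
    + b"x-amz-checksum-crc32:" + crc32_b64(data).encode() + b"\r\n"
    + b"\r\n"
)
print(verify_trailing_checksum("x-amz-checksum-crc32", body))
```

Keeping the checksum comparison in a function separate from the chunk parser mirrors the suggestion above to compute checksums at a level that trailing and non-trailing modes can share.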

implement additional checksum algorithms

https://docs.aws.amazon.com/AmazonS3/latest/userguide/checking-object-integrity.html#using-additional-checksums

https://github.com/ceph/ceph/pull/30606 and https://github.com/ceph/ceph/pull/49986 already provide a good framework for the actual checksumming feature. the trailing checksum mode just means we won't know the expected checksum value until the end of the request body

Actions #13

Updated by Christian Rohmann 6 months ago

I don't know what the status of https://github.com/ceph/s3-tests is or what other test suites are used on RGW.
But likely all of this (new) checksum stuff should lead to more test cases for compatibility as well.

Actions #14

Updated by Matt Benjamin 5 months ago

random data point.

I don't find the string STREAMING-AWS4-HMAC-SHA256-PAYLOAD in aws-sdk-go-v2, but certainly do find STREAMING-UNSIGNED-PAYLOAD-TRAILER .

I also found https://github.com/aws/aws-sdk-go-v2/issues/1667, and confirmed that, at present, we have:

    // Trailing checksums are only supported when TLS is enabled.
    if !req.IsHTTPS() {
        return out, metadata, computeInputTrailingChecksumError{
            Msg: "HTTPS required",
        }
    }

This helps explain why I was unable to generate a trailing checksum against my http rgw endpoint with an aws-sdk-go-v2 streaming upload (concurrent uploads w/ the upload manager) today ;)

Matt

Actions #15

Updated by Matt Benjamin 5 months ago

fwiw:

[mbenjamin@fedora aws-sdk-go-v2]$ rg -tgo STREAMING-AWS4
aws/signer/internal/v4/const.go
39:    StreamingEventsPayload = "STREAMING-AWS4-HMAC-SHA256-EVENTS" 

christian, not sure about s3-tests yet, but as far as I can tell rgw is able to handle all the non-streaming/chunked scenarios for all checksums. I have boto3 and aws-sdk-go-v2 examples of (I think) all the cases.

Actions #16

Updated by Casey Bodley 5 months ago

  • Status changed from New to In Progress
Actions #17

Updated by Kyle Bader 5 months ago

I have an example (non-https) where we see a failure with aws-sdk-java/1.12.367

https://gist.githubusercontent.com/mmgaggle/22db687f9c7ad78cf00037a125eff086/raw/ee49920b8f591aa639d2f5b7761a3c43b0e73a3a/watsonx-failure.log

The request has x-amz-content-sha256:STREAMING-AWS4-HMAC-SHA256-PAYLOAD, and the request fails with 'ERROR: signature of last chunk does not match'

Actions #18

Updated by Matt Benjamin 5 months ago

  • Status changed from In Progress to Fix Under Review
  • Pull request ID set to 54856
Actions #19

Updated by Casey Bodley 3 months ago

  • Has duplicate Bug #64090: RGW S3 signing regression added
Actions #20

Updated by Christian Rohmann 3 months ago

Casey Bodley wrote:

Christian Rohmann wrote:

Are there any plans on prioritizing the implementation of the missing check-summing feature(s) for RADOSGW and them being backported to Reef and Quincy at least?

this feature was discussed during squid release planning in https://pad.ceph.com/p/rgw-cds2023. i've created a trello card for it in https://trello.com/c/U9tXJJgZ/875-rgw-s3-object-integrity

we generally don't backport these kinds of features, but i'm open to discussing that once we have a working implementation

Since there are so many fixes for checksum errors currently occurring across various client libraries and tools, I'd like to renew my question about the potential to backport them to Reef.

To me, having RGW as compatible as possible with real S3 clients is not a "feature", but more of a bugfix. I suppose that is why most/all of the commit messages contain "fix" :-)

https://github.com/ceph/ceph/pull/54856/commits

Actions #21

Updated by Matt Benjamin 3 months ago

I support backport of all of the related fixes to reef

Matt

Actions #22

Updated by Matt Benjamin 3 months ago

  • Backport set to reef,quincy
Actions #23

Updated by Casey Bodley 3 months ago

  • Related to Bug #63951: rgw: implement S3 additional checksums added
Actions #24

Updated by Casey Bodley 3 months ago

  • Status changed from Fix Under Review to Pending Backport
Actions #25

Updated by Backport Bot 3 months ago

  • Copied to Backport #64465: reef: Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch when Checksum is requested added
Actions #26

Updated by Backport Bot 3 months ago

  • Copied to Backport #64466: quincy: Uploads by AWS Go SDK v2 fail with XAmzContentSHA256Mismatch when Checksum is requested added
Actions #27

Updated by Backport Bot 3 months ago

  • Tags changed from sigv4 checksums to sigv4 checksums backport_processed