Bug #38700

closed

silent corruption using SSE-C on multi-part upload to S3 with non-default part size

Added by László van den Hoek about 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: High
Assignee:
Target version: -
% Done: 0%
Source:
Tags:
Backport: nautilus,mimic,luminous
Regression: No
Severity: 2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Multi-part uploads to RGW through the S3 API with SSE-C are silently corrupted when the chunk size is not the default, e.g. 5242881 bytes, which is the default plus one.

I ran into this using Alpakka S3, which can upload streams whose size is not known in advance. This triggers the bug because Alpakka only lets you specify a minimum chunk size, not an exact one.

This problem does not occur with AWS S3.

Steps to reproduce:

  • configure AWS CLI - provide credentials and use region default:
    aws configure
  • Create an SSE-C key. Corruption occurs consistently with any key, but different keys yield distinct corruption patterns.
    dd if=/dev/zero of=/tmp/ssec.key bs=1 count=32
  • Generate test data - 10MB of NUL bytes (0x00):
    dd if=/dev/zero of=/tmp/foo bs=10000 count=1000
  • Upload and download the file. This will be a multi-part upload because the input file is larger than the default multi-part chunk size (5242880 bytes, i.e. 5 * 1024 * 1024 = 80 * 2^16):
    aws s3 cp /tmp/foo s3://<bucket_name>/ --sse-c AES256 --sse-c-key fileb:///tmp/ssec.key --endpoint-url https://<rados gateway host>
    aws s3 cp s3://<bucket_name>/foo /tmp/bar --sse-c AES256 --sse-c-key fileb:///tmp/ssec.key --endpoint-url https://<rados gateway host>
  • Verify file integrity - no surprises here:
    sha256sum /tmp/foo -> f5e02aa71e67f41d79023a128ca35bad86cf7b6656967bfe0884b3a3c4325eaf
    sha256sum /tmp/bar -> f5e02aa71e67f41d79023a128ca35bad86cf7b6656967bfe0884b3a3c4325eaf
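
As a rough check on the sizes involved in the steps above, the part layout can be worked out with shell arithmetic. This is only a sketch: part_layout is a hypothetical helper, not part of the AWS CLI, and it assumes the client cuts the file into fixed-size parts with one shorter final part.

```shell
#!/bin/sh
# Hypothetical helper: print how a file of SIZE bytes would be split
# into parts of CHUNK bytes, assuming fixed-size parts plus a shorter
# final part.
part_layout() {
    size=$1
    chunk=$2
    parts=$(( (size + chunk - 1) / chunk ))   # ceiling division
    last=$(( size - (parts - 1) * chunk ))    # bytes in the final part
    echo "parts=$parts last_part_bytes=$last"
}

part_layout 10000000 5242880   # default chunk size
part_layout 10000000 5242881   # default chunk size plus one
```

Under that assumption the 10 MB test file is split into two parts in both cases; only the position of the boundary between them moves by one byte.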

Now, it gets interesting.

  • Set the chunk size to a non-default value and repeat the process:
    aws configure set default.s3.multipart_chunksize 5242881
    aws s3 cp /tmp/foo s3://<bucket_name>/ --sse-c AES256 --sse-c-key fileb:///tmp/ssec.key --endpoint-url https://<rados gateway host>
    aws s3 cp s3://<bucket_name>/foo /tmp/bar --sse-c AES256 --sse-c-key fileb:///tmp/ssec.key --endpoint-url https://<rados gateway host>
  • Now, the downloaded file is corrupt!
    sha256sum /tmp/foo -> f5e02aa71e67f41d79023a128ca35bad86cf7b6656967bfe0884b3a3c4325eaf
    sha256sum /tmp/bar -> 2351da404fde47c360d201cf77311afd1d5e8cbfb601a1db1cee0b8f82124554
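
When scripting this reproduction, the manual checksum comparison can be wrapped in a small helper. The following is a minimal sketch; same_sha256 is a hypothetical name, and it assumes sha256sum from GNU coreutils, as used above.

```shell
#!/bin/sh
# Hypothetical helper: succeed (exit status 0) only when both files
# exist and have the same SHA-256 digest.
same_sha256() {
    a=$(sha256sum "$1" 2>/dev/null | cut -d ' ' -f 1)
    b=$(sha256sum "$2" 2>/dev/null | cut -d ' ' -f 1)
    # Guard against an empty digest so two missing files never "match".
    [ -n "$a" ] && [ "$a" = "$b" ]
}
```

For example, `same_sha256 /tmp/foo /tmp/bar || echo "download is corrupt"` prints the message only when the digests differ or a file is missing.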

Taking a closer look at the files:

  • There are indeed 10M bytes in the file:
    cat foo | wc -c
  • Filter out all NUL bytes and count the remainder; this outputs 0, confirming that the original foo contains only NUL bytes:
    tr -d '\000' < foo | wc -c

Next, take the first n bytes of the corrupted bar file, strip the NUL bytes, and count how many bytes remain.

  • For the first 5242881 bytes (one chunk), the count is 0, meaning the contents of the first part are not corrupted:
    head -c 5242881 bar | tr -d '\000' | wc -c
  • Take one more byte, and the output changes to 1, meaning corruption starts at the first byte of the second part:
    head -c 5242882 bar | tr -d '\000' | wc -c
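
The same boundary can be located in one step by comparing the two files directly. This is a sketch using the standard cmp utility; first_diff_offset is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: print the 1-based offset of the first byte at
# which two files differ. Prints nothing when no byte differs (for
# example, when the files are identical).
first_diff_offset() {
    # cmp -l lists every differing byte as "offset octal octal";
    # only the first line is needed.
    cmp -l "$1" "$2" 2>/dev/null | head -n 1 | awk '{print $1}'
}
```

Given the head/tr results above, `first_diff_offset foo bar` would be expected to report 5242882, the first byte after the first 5242881-byte part.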

Examining bar in a file editor shows that the remainder of the file is indeed garbage.

Uploading and downloading the file multiple times with a particular SSE-C key yields the exact same result, so this is a deterministic bug.


Related issues 3 (0 open, 3 closed)

Copied to rgw - Backport #39068: nautilus: silent corruption using SSE-C on multi-part upload to S3 with non-default part size (Resolved, Abhishek Lekshmanan)
Copied to rgw - Backport #39069: mimic: silent corruption using SSE-C on multi-part upload to S3 with non-default part size (Resolved, Abhishek Lekshmanan)
Copied to rgw - Backport #39070: luminous: silent corruption using SSE-C on multi-part upload to S3 with non-default part size (Resolved, Abhishek Lekshmanan)