Bug #1287: Setting metadata with unreadable characters is not consistent with amazon S3 - rgw - Ceph

closed

Added by Stephon Striplin almost 13 years ago. Updated over 6 years ago.

Status: Resolved

Priority: Normal

Assignee: Colin McCabe

Severity: 3 - minor

Description

If a metadata value contains an unprintable character, as in '\x04world', Amazon S3 returns it encoded in MIME encoded-word syntax (RFC 2047). Our S3 implementation currently does one of two things instead: it returns a 403 error if the value begins or ends with an unprintable character, or it stores the value as raw input.
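
For illustration, here is a minimal Python sketch of what encoded-word encoding of such a value could look like. The function name `encode_meta_value` is hypothetical, and the choice of the "B" (base64) form with a UTF-8 charset label is an assumption: this ticket does not say whether Amazon uses the "B" or the "Q" form.

```python
import base64


def encode_meta_value(value: bytes, charset: str = "UTF-8") -> str:
    """Return value unchanged if it is printable US-ASCII, otherwise
    wrap it as an RFC 2047 encoded word using the "B" (base64) form.

    Hypothetical helper: the B form and the charset label are assumptions,
    not something confirmed about Amazon's behaviour.
    """
    # Printable US-ASCII is the range 0x20 (space) through 0x7e (~).
    if all(0x20 <= b <= 0x7E for b in value):
        return value.decode("ascii")
    payload = base64.b64encode(value).decode("ascii")
    return "=?%s?B?%s?=" % (charset, payload)


print(encode_meta_value(b"\x04world"))  # encoded word
print(encode_meta_value(b"hello"))      # passed through as-is
```

A real implementation would also have to respect the encoded-word length limit in RFC 2047 and decide how to treat printable values that already look like encoded words; the sketch ignores both.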

Updated by Sage Weil almost 13 years ago

Target version set to v0.32

Updated by Sage Weil almost 13 years ago

Category set to 22

Updated by Colin McCabe almost 13 years ago

Amazon says (in the developers' guide):

> When uploading an object, you can assign metadata to the object. You 
> provide this optional information as a name, value pair when you send a 
> PUT or POST request to create the object. When uploading objects using the 
> REST API the optional user-defined metadata names must begin with 
> “x-amz-meta-” to distinguish them as HTTP headers. When you retrieve the 
> object using the REST API, this prefix is returned. When uploading objects
> using the SOAP API, the prefix is not required and when you retrieve the
> object using the SOAP API, the prefix is removed, regardless of which API
> you used to upload the object.
>
> When metadata is retrieved through the REST API, Amazon S3 combines 
> headers that have the same name (ignoring case) into a comma-delimited 
> list. If some metadata contains unprintable characters, it is not 
> returned. Instead, the "x-amz-missing-meta" header is returned with a
> value of the number of the unprintable metadata entries.
>
> Each name, value pair must conform to US-ASCII when using REST and UTF-8 
> when using SOAP or browser-based uploads via POST.

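Basically: metadata must be US-ASCII over REST (UTF-8 over SOAP or browser-based POST), and any entry with unprintable characters is supposed to be dropped and counted in x-amz-missing-meta. MIME is not mentioned anywhere.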

So maybe this is another case where there is a conflict between the spec and reality. We will have to test it.
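
To make the documented behaviour concrete, here is a rough Python sketch of what the guide describes for REST retrieval: printable entries come back under "x-amz-meta-", unprintable ones are dropped and counted. The function name `documented_meta_headers` and the dict-based input are hypothetical, and "unprintable" is assumed to mean any byte outside 0x20 to 0x7e. Combining same-named headers is left out.

```python
def documented_meta_headers(meta):
    """Build response headers the way the quoted guide describes.

    meta maps a metadata name (str, without the prefix) to its raw value
    (bytes). Hypothetical helper for illustration only.
    """
    headers = {}
    missing = 0
    for name, value in meta.items():
        # Assumption: "printable" means US-ASCII 0x20 through 0x7e.
        if all(0x20 <= b <= 0x7E for b in value):
            headers["x-amz-meta-" + name] = value.decode("ascii")
        else:
            missing += 1
    if missing:
        # Per the guide: a count of the entries that were not returned.
        headers["x-amz-missing-meta"] = str(missing)
    return headers


print(documented_meta_headers({"a": b"hello", "b": b"\x04world"}))
```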

Updated by Colin McCabe almost 13 years ago

Confirmed through s3-tests. Amazon gives the value back to you in MIME-encoded format rather than giving you x-amz-missing-meta.
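
On the client side, a value returned in this form can be turned back into the original bytes with Python's standard `email.header.decode_header`. This is a minimal sketch: the helper name `decode_meta_value` is hypothetical and the sample header values are made up for illustration, not captured from Amazon.

```python
from email.header import decode_header


def decode_meta_value(value: str) -> bytes:
    """Decode an RFC 2047 encoded-word header value back to raw bytes.

    Plain (unencoded) values are returned as their ASCII bytes.
    """
    out = b""
    for chunk, _charset in decode_header(value):
        # decode_header yields str for a wholly unencoded header and
        # bytes for decoded words, so normalise both to bytes.
        if isinstance(chunk, str):
            chunk = chunk.encode("ascii")
        out += chunk
    return out


print(decode_meta_value("=?UTF-8?B?BHdvcmxk?="))
print(decode_meta_value("=?UTF-8?Q?=04world?="))
```

Both the "B" and "Q" forms decode to the same bytes, so a check written this way does not depend on which form the server picks.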

Updated by Sage Weil almost 13 years ago

Assignee set to Colin McCabe

Updated by Colin McCabe almost 13 years ago

I wanted to be sure about this, so I used tcpdump to verify that we really were sending the data over the wire unencoded. We are. So yes, the Amazon docs are wrong again, and RGW needs to learn how to do this encoding.
