Bug #41011

utf8 incompatibility in metadata added by rgw cloud sync module

Added by Ed Fisher 3 months ago. Updated 2 months ago.

Status: Verified
Priority: Normal
Target version: -
Start date: 07/30/2019
Due date:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

Hi there,

The RGW cloud sync module seems to add the source key name as metadata when storing objects at the destination zone, by adding an "x-amz-meta-rgwx-source-key" header. However, object names can have characters in them that are illegal to use in S3 metadata.

Per https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-metadata, all metadata submitted via the REST API must be ASCII. I tested this against S3 and another S3-compatible storage provider, and both return SignatureDoesNotMatch errors. Ceph seems to allow UTF-8 in metadata, so an rgw->rgw test likely wouldn't show this failure.
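
To make the constraint concrete, here is a minimal Python sketch of a check for metadata headers that would break the ASCII rule. It is only an illustration, not code from rgw: the function name `non_ascii_metadata`, the object key, and the extra headers are all made up for the example.

```python
def non_ascii_metadata(headers):
    """Return the x-amz-meta-* headers whose values contain non-ASCII characters."""
    return {
        name: value
        for name, value in headers.items()
        if name.lower().startswith("x-amz-meta-") and not value.isascii()
    }


# Hypothetical request headers; the key name and the owner value are invented.
headers = {
    "x-amz-meta-rgwx-source-key": "reports/résumé.txt",
    "x-amz-meta-owner": "ed",
    "content-type": "text/plain",
}

# Only the source-key header is reported, because of the "é" characters.
print(non_ascii_metadata(headers))
```

A check like this, run before the request is signed, would at least turn the opaque signature error into an explicit message naming the offending header.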

This likely affects every object with non-ASCII (UTF-8) characters in its key name. It may also break syncing for objects that were stored with UTF-8 metadata, since rgw accepts it but S3 does not. I'm not sure of the best solution. Maybe just use url_encode on x-amz-meta-rgwx-source-key and on any attrs kept with keep_attr?
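
As a rough sketch of the percent-encoding idea, the Python below uses the standard library's `urllib.parse.quote` as a stand-in for rgw's url_encode; whether the two escape exactly the same set of characters is an assumption. The helper names and the object key are hypothetical.

```python
from urllib.parse import quote, unquote


def encode_metadata_value(value):
    """Percent-encode a metadata value so the result is pure ASCII."""
    # quote() encodes the string as UTF-8 and escapes every byte outside its
    # safe set; "/" is left unescaped by default, "%" is always escaped.
    return quote(value)


def decode_metadata_value(value):
    """Reverse encode_metadata_value()."""
    return unquote(value)


key = "reports/résumé.txt"  # invented key name with non-ASCII characters
encoded = encode_metadata_value(key)

print(encoded)                          # reports/r%C3%A9sum%C3%A9.txt
print(encoded.isascii())                # True
print(decode_metadata_value(encoded))   # reports/résumé.txt
```

Because "%" itself is escaped, the encoding round-trips, but any consumer of the destination metadata would need to know the value is encoded and decode it.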

Please let me know if you need any more details. The simplest test case would be to create a single bucket containing a single object with a non-ASCII character in its key name and try to sync it. Otherwise, the logging makes the issue hard to track down.

History

#1 Updated by Abhishek Lekshmanan 2 months ago

  • Status changed from New to Verified
  • Assignee set to Abhishek Lekshmanan
