Bug #41011
utf8 incompatibility in metadata added by rgw cloud sync module
Description
Hi there,
The RGW cloud sync module seems to add the source key name as metadata when storing objects at the destination zone, by adding an "x-amz-meta-rgwx-source-key" header. However, object names can have characters in them that are illegal to use in S3 metadata.
Per https://docs.aws.amazon.com/AmazonS3/latest/dev/UsingMetadata.html#object-metadata all metadata submitted via the REST API must be US-ASCII. I've tested this, and both S3 and another S3-compatible storage provider throw SignatureDoesNotMatch errors. Ceph seems to allow UTF-8 in metadata, so an rgw->rgw test likely wouldn't show this failure.
This likely affects every object with UTF-8 characters in its key name. It may also break syncing for objects that were stored with UTF-8 metadata, since rgw accepts it but S3 will not. I'm not sure of the best solution. Maybe just using url_encode on x-amz-meta-rgwx-source-key and any attrs kept with keep_attr?
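To illustrate the encoding idea, here is a minimal Python sketch (not the actual rgw code, which is C++). The function names encode_meta_value, decode_meta_value and build_sync_headers are made up for this example. It only shows that percent-encoding the key name yields an ASCII-only value that can be decoded back to the original.

```python
from urllib.parse import quote, unquote


def encode_meta_value(value: str) -> str:
    """Percent-encode a metadata value so the result contains only ASCII."""
    # safe="" also escapes "/", so the key name becomes a single opaque token.
    return quote(value, safe="", encoding="utf-8")


def decode_meta_value(value: str) -> str:
    """Reverse encode_meta_value and recover the original string."""
    return unquote(value, encoding="utf-8")


def build_sync_headers(source_key: str) -> dict:
    """Hypothetical helper: build the source-key metadata header for a synced object."""
    return {"x-amz-meta-rgwx-source-key": encode_meta_value(source_key)}
```

With this approach, anything reading the header at the destination would have to know the value is percent-encoded and decode it before comparing it with the original key name.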
Please let me know if you need any more details. The simplest test case would be to create a single bucket with a single object with a utf-8 character in the key name and try to sync. The logging makes it hard to track down the issue otherwise.
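Since the logs don't make it obvious which objects are affected, a small check like the following could help narrow things down. This is only a sketch: keys_with_non_ascii is a hypothetical helper, and it assumes you already have the list of key names from a bucket listing.

```python
def keys_with_non_ascii(keys):
    """Return the key names that contain at least one non-ASCII character.

    These are the keys whose names would end up as non-ASCII metadata
    values if copied verbatim into a header.
    """
    # str.isascii() requires Python 3.7 or newer.
    return [key for key in keys if not key.isascii()]
```

For example, passing ["report.txt", "naïve.txt"] would flag only the second key.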
Updated by Abhishek Lekshmanan over 4 years ago
- Status changed from New to 12
- Assignee set to Abhishek Lekshmanan