Feature #53455
[RFE] Ill-formatted JSON response from RGW
Description
Requesting that the JSON output returned by RGW be properly formatted.
Use case: accessing a world-readable bucket using curl (rather than the standard aws/s3cmd tools).
E.g., a user would like to get JSON output:
$ curl -v -H "Accept: application/json" http://s3.local
* Uses proxy env variable no_proxy == '127.0.0.1,169.254.169.254,acc.local,capc.local,abc.local,adns.local,cloud.local,dev.local,bx.local,dnf.local,localhost,localhost,prod.local'
* Trying 192.168.1.4:80...
* TCP_NODELAY set
* Connected to s3.local (192.168.1.4) port 80 (#0)
> GET / HTTP/1.1
> Host: s3.local
> User-Agent: curl/7.68.0
> Accept: application/json
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< transfer-encoding: chunked
< x-amz-request-id: tx000000000000002ba2cb2-0061609980-75a869cf-xxx1-dev
< Content-Type: application/xml
< date: Fri, 08 Oct 2021 19:18:24 GMT
<
{"Name":"go-releases","Prefix":"","Marker":"","MaxKeys":1000,"IsTruncated":"false","Contents":["go1.15.15.linux-amd64.tar.gz","2021-10-04T19:06:18.416Z","\"b75227438c6129b5013da053b3aa3f38\"",121104410,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.15.15.src.tar.gz","2021-10-04T19:06:31.422Z","\"05fedd8289291eb2d91cd0c092b41aaa\"",23042945,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.16.8.linux-amd64.tar.gz","2021-10-04T19:03:13.184Z","\"d8c51d1744c7f1fec07e80434652d5e2\"",129030171,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.16.8.src.tar.gz","2021-10-04T19:03:26.938Z","\"92e69a5e1bb6ea5e7498d12d03160032\"",20922236,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.1.linux-amd64.tar.gz","2021-10-04T18:56:09.720Z","\"ede85550b8c5436f8188338567c6448b\"",134784143,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.1.src.tar.gz","2021-10-04T18:53:06.593Z","\"a78205838c2a7054522cb91c12982f26\"",22181735,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.2.linux-amd64.tar.gz","2021-10-11T16:13:15.383Z","\"b7af894763e397335efe5a9ca70a5d63\"",134803982,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.2.src.tar.gz","2021-10-11T16:13:27.531Z","\"1b3be8eb35ad2fa31ec50c09cd00bcde\"",22182111,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}]}
* Closing connection 0
The returned data is JSON-like but not quite. Note the "Contents" key: it is repeated once per object instead of appearing once with a list of objects.
Moreover, as seen in the verbose output above, RGW reports the "Content-Type" as "application/xml" even though the user requested JSON and the body RGW returned is JSON, not XML.
-----
Additional Info:
The issue described was observed in a Luminous environment. The same use case was tried on a test bucket on Nautilus RGWs, and the results are the same.
Sample bucket with 3 objects:
10GBFile
my500MBFile
a
-----
TEST1: aws cli - all 3 files are listed as expected
$ aws --endpoint-url=http://s3.staging.local/ s3 ls s3://test
2021-10-01 18:33:47 10485760000 10GBFile
2021-10-08 19:59:46 2 a
2021-10-08 20:07:48 536870912 my500MBFile
TEST2: curl with no headers - valid XML output which can be converted to JSON
$ curl http://s3.staging.local | xq | jq '.ListBucketResult.Contents[].Key'
"10GBFile"
"a"
"my500MBFile"
TEST3: curl asking JSON - invalid JSON format
$ curl -H "Accept: application/json" http://s3.staging.local
{"Name":"test","Prefix":"","MaxKeys":1000,"IsTruncated":"false","Contents":["10GBFile","2021-10-01T18:33:47.433Z","\"c008a29949a18ac24ff1e79a0fd8abf3\"",10485760000,"STANDARD",{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],"Contents":["a","2021-10-08T19:59:46.606Z","\"60b725f10c9c85c70d97880dfe8191b3\"",2,"STANDARD",{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],"Contents":["my500MBFile","2021-10-08T20:07:48.771Z","\"6d1954a8c7d6f09434c1ba4745a86869-64\"",536870912,"STANDARD",{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],"Marker":""}
The "Contents" key is repeated once per object, which is not the expected JSON formatting. The verbose output also shows "Content-Type: application/xml" in the response, even though curl requests JSON in the Accept header.
As a result, when parsing the returned output, standard tools keep only 1 of the "Contents" entries (here the last one) instead of all 3:
$ curl -H "Accept: application/json" http://s3.staging.local | jq '.Contents'
[
"my500MBFile",
"2021-10-08T20:07:48.771Z",
"\"6d1954a8c7d6f09434c1ba4745a86869-64\"",
536870912,
"STANDARD",
{
"ID": "3091912",
"DisplayName": "3091912-abc-staging-test"
},
"Normal"
]
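The same effect can be reproduced outside jq. The following is a minimal Python sketch, assuming a shortened, hypothetical payload with the same shape as the TEST3 response (the helper name keep_all_contents is illustrative, not part of RGW or any library). It shows that the standard json module also keeps only the last "Contents" entry by default, and that a client-side workaround is possible with object_pairs_hook:

```python
import json

# Shortened, hypothetical payload shaped like the TEST3 response above.
raw = (
    '{"Name":"test","Contents":["10GBFile",10485760000],'
    '"Contents":["a",2],'
    '"Contents":["my500MBFile",536870912],"Marker":""}'
)

# Default parsing: each repeated key overwrites the previous one,
# so only the last "Contents" entry survives.
default_listing = json.loads(raw)
print(default_listing["Contents"][0])


def keep_all_contents(pairs):
    """Build a dict from key/value pairs, collecting every "Contents" value."""
    result = {}
    for key, value in pairs:
        if key == "Contents":
            result.setdefault("Contents", []).append(value)
        else:
            result[key] = value
    return result


# Workaround: the hook sees every pair, including the duplicates.
listing = json.loads(raw, object_pairs_hook=keep_all_contents)
keys = [entry[0] for entry in listing["Contents"]]
print(keys)
```

This is only a workaround on the consumer side; it does not make the response well-formed, and it relies on the object name being the first element of each "Contents" array as in the output above.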
Updated by Matt Benjamin over 1 year ago
- Status changed from New to Fix Under Review
- Pull request ID set to 47632
Updated by Matt Benjamin over 1 year ago
* Trying 10.17.152.22:8000...
* Connected to lemon (10.17.152.22) port 8000 (#0)
> GET /works3 HTTP/1.1
> Host: lemon:8000
> User-Agent: curl/7.69.1
> Accept: application/json
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Transfer-Encoding: chunked
< x-amz-request-id: tx00000087601a154f225dd-0062fb9d31-104a-default
< Content-Type: application/json
< Date: Tue, 16 Aug 2022 13:35:45 GMT
< Connection: Keep-Alive
<
* Connection #0 to host lemon left intact
{"Name":"works3","Prefix":"","MaxKeys":1000,"IsTruncated":"false","Contents":[{"pass1":{"LastModified":"2022-08-15T17:22:01.609Z","ETag":"\"271ec5227ced4bd1f5de40e6d665b3e1\"","Size":4458,"StorageClass":"STANDARD","Owner":{"ID":"testid","DisplayName":"M. Tester"},"Type":"Normal"}},{"pass2":{"LastModified":"2022-08-16T12:53:10.667Z","ETag":"\"271ec5227ced4bd1f5de40e6d665b3e1\"","Size":4458,"StorageClass":"STANDARD","Owner":{"ID":"testid","DisplayName":"M. Tester"},"Type":"Normal"}},{"pass3":{"LastModified":"2022-08-16T12:53:22.206Z","ETag":"\"271ec5227ced4bd1f5de40e6d665b3e1\"","Size":4458,"StorageClass":"STANDARD","Owner":{"ID":"testid","DisplayName":"M. Tester"},"Type":"Normal"}},""]
Updated by Matt Benjamin over 1 year ago
oops, that should be:
[mbenjamin@lemon build]$ curl -v -H "Accept: application/json" http://lemon:8000/works3
* Trying 10.17.152.22:8000...
* Connected to lemon (10.17.152.22) port 8000 (#0)
> GET /works3 HTTP/1.1
> Host: lemon:8000
> User-Agent: curl/7.69.1
> Accept: application/json
>
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< Transfer-Encoding: chunked
< x-amz-request-id: tx0000044b8a721dae7c144-0062fba926-104f-default
< Content-Type: application/json
< Date: Tue, 16 Aug 2022 14:26:46 GMT
< Connection: Keep-Alive
<
* Connection #0 to host lemon left intact
{"Name":"works3","Prefix":"","MaxKeys":1000,"IsTruncated":"false","Contents":[{"pass1":{"LastModified":"2022-08-15T17:22:01.609Z","ETag":"\"271ec5227ced4bd1f5de40e6d665b3e1\"","Size":4458,"StorageClass":"STANDARD","Owner":{"ID":"testid","DisplayName":"M. Tester"},"Type":"Normal"}},{"pass2":{"LastModified":"2022-08-16T12:53:10.667Z","ETag":"\"271ec5227ced4bd1f5de40e6d665b3e1\"","Size":4458,"StorageClass":"STANDARD","Owner":{"ID":"testid","DisplayName":"M. Tester"},"Type":"Normal"}},{"pass3":{"LastModified":"2022-08-16T12:53:22.206Z","ETag":"\"271ec5227ced4bd1f5de40e6d665b3e1\"","Size":4458,"StorageClass":"STANDARD","Owner":{"ID":"testid","DisplayName":"M. Tester"},"Type":"Normal"}}],"Marker":""}
Updated by Matt Benjamin over 1 year ago
Updated slightly to match output of awscli, so:
{ "Name": "works3", "Prefix": "", "MaxKeys": 1000, "IsTruncated": "false", "Contents": [ { "Key": "pass1", "LastModified": "2022-08-15T17:22:01.609Z", "ETag": "\"271ec5227ced4bd1f5de40e6d665b3e1\"", "Size": 4458, "StorageClass": "STANDARD", "Owner": { "ID": "testid", "DisplayName": "M. Tester" }, "Type": "Normal" }, { "Key": "pass2", "LastModified": "2022-08-16T12:53:10.667Z", "ETag": "\"271ec5227ced4bd1f5de40e6d665b3e1\"", "Size": 4458, "StorageClass": "STANDARD", "Owner": { "ID": "testid", "DisplayName": "M. Tester" }, "Type": "Normal" } ], "Marker": "" }
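For illustration, a response in this updated shape can be consumed with an ordinary JSON parser and no special handling. The sketch below is a minimal Python example, assuming an abbreviated copy of the output above (ETag and Owner omitted for brevity; the variable names are illustrative):

```python
import json

# Abbreviated, illustrative copy of the updated response shape
# (ETag and Owner fields omitted for brevity).
raw = """
{ "Name": "works3", "Prefix": "", "MaxKeys": 1000, "IsTruncated": "false",
  "Contents": [
    { "Key": "pass1", "LastModified": "2022-08-15T17:22:01.609Z",
      "Size": 4458, "StorageClass": "STANDARD", "Type": "Normal" },
    { "Key": "pass2", "LastModified": "2022-08-16T12:53:10.667Z",
      "Size": 4458, "StorageClass": "STANDARD", "Type": "Normal" }
  ],
  "Marker": "" }
"""

listing = json.loads(raw)

# "Contents" is now a single list of objects, each carrying its own "Key".
keys = [obj["Key"] for obj in listing["Contents"]]
total_size = sum(obj["Size"] for obj in listing["Contents"])

# Note: "IsTruncated" is the string "false" in this output, not a JSON boolean.
truncated = listing["IsTruncated"] == "true"

print(keys, total_size, truncated)
```

With this shape, the earlier jq use case should reduce to a filter along the lines of '.Contents[].Key', mirroring the XML-based TEST2.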
Updated by Casey Bodley over 1 year ago
- Status changed from Fix Under Review to Resolved
- Backport deleted (quincy)
@Ken why tag this for quincy and not pacific? this is an RFE so wouldn't normally qualify for backports
@Matt Li the Content-Type bug fix from https://tracker.ceph.com/issues/55680 is now pending backport to pacific+quincy. if you'd like this extra RFE part to be backported, you can update https://tracker.ceph.com/issues/55680 to point at the full PR https://github.com/ceph/ceph/pull/47632 instead of the partial PR https://github.com/ceph/ceph/pull/47577
Updated by Ken Dreyer over 1 year ago
- Backport set to quincy
This is currently targeted for RH Ceph Storage 6 (https://bugzilla.redhat.com/2028220), which is based on quincy. If we can backport it there, it should be acceptable to ship in an upstream quincy release.
Updated by Ken Dreyer over 1 year ago
- Status changed from Resolved to Pending Backport
Updated by Ken Dreyer over 1 year ago
If I'm missing more information here, please let me know.
Updated by Backport Bot over 1 year ago
- Copied to Backport #57349: quincy: [RFE] Ill-formatted JSON response from RGW added