Feature #53455
[RFE] Ill-formatted JSON response from RGW
Status: open
Description
Request: the JSON output returned by RGW should be well-formed.
Use case: accessing a world-readable bucket with curl (not with the standard aws/s3cmd tools).
E.g., a user would like to get the bucket listing as JSON:
$ curl -v -H "Accept: application/json" http://s3.local
* Uses proxy env variable no_proxy == '127.0.0.1,169.254.169.254,acc.local,capc.local,abc.local,adns.local,cloud.local,dev.local,bx.local,dnf.local,localhost,localhost,prod.local'
*   Trying 192.168.1.4:80...
* TCP_NODELAY set
* Connected to s3.local (192.168.1.4) port 80 (#0)
> GET / HTTP/1.1
> Host: s3.local
> User-Agent: curl/7.68.0
> Accept: application/json
* Mark bundle as not supporting multiuse
< HTTP/1.1 200 OK
< transfer-encoding: chunked
< x-amz-request-id: tx000000000000002ba2cb2-0061609980-75a869cf-xxx1-dev
< Content-Type: application/xml
< date: Fri, 08 Oct 2021 19:18:24 GMT
<
{"Name":"go-releases","Prefix":"","Marker":"","MaxKeys":1000,"IsTruncated":"false","Contents":["go1.15.15.linux-amd64.tar.gz","2021-10-04T19:06:18.416Z","\"b75227438c6129b5013da053b3aa3f38\"",121104410,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.15.15.src.tar.gz","2021-10-04T19:06:31.422Z","\"05fedd8289291eb2d91cd0c092b41aaa\"",23042945,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.16.8.linux-amd64.tar.gz","2021-10-04T19:03:13.184Z","\"d8c51d1744c7f1fec07e80434652d5e2\"",129030171,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.16.8.src.tar.gz","2021-10-04T19:03:26.938Z","\"92e69a5e1bb6ea5e7498d12d03160032\"",20922236,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.1.linux-amd64.tar.gz","2021-10-04T18:56:09.720Z","\"ede85550b8c5436f8188338567c6448b\"",134784143,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.1.src.tar.gz","2021-10-04T18:53:06.593Z","\"a78205838c2a7054522cb91c12982f26\"",22181735,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.2.linux-amd64.tar.gz","2021-10-11T16:13:15.383Z","\"b7af894763e397335efe5a9ca70a5d63\"",134803982,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}],"Contents":["go1.17.2.src.tar.gz","2021-10-11T16:13:27.531Z","\"1b3be8eb35ad2fa31ec50c09cd00bcde\"",22182111,"STANDARD",{"ID":"3026937","DisplayName":"3026937-abc-bbgo"}]}
* Closing connection 0
The returned data is JSON-like but not usable as JSON. Note the "Contents" key: it is repeated once per object instead of appearing once with a list of objects, and each value is a positional array rather than an object with named fields.
Moreover, as seen in the verbose output above, RGW sets "Content-Type" to "application/xml" even though the user requested JSON and the body RGW returned is JSON, not XML.
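To illustrate why the repeated key matters, here is a minimal Python sketch using only the standard json module. The payload is a shortened, made-up stand-in shaped like the RGW response (not real data), and the helper name count_keys is hypothetical. A default parse silently keeps only the last "Contents" value, while an object_pairs_hook makes the duplicates visible.

```python
import json
from collections import Counter

# Shortened, illustrative payload shaped like the RGW response (not real data).
payload = '{"Name":"test","Contents":["a",1],"Contents":["b",2],"Contents":["c",3]}'

# Default behaviour of json.loads: a later duplicate key overwrites earlier ones.
default_parse = json.loads(payload)

def count_keys(pairs):
    """Hypothetical hook: count how often each key occurs in one JSON object."""
    return Counter(key for key, _ in pairs)

# object_pairs_hook receives every (key, value) pair, duplicates included.
key_counts = json.loads(payload, object_pairs_hook=count_keys)

print(default_parse["Contents"])  # only one entry survives the default parse
print(key_counts["Contents"])     # how many times the key actually appeared
```

A check like this could be used to confirm whether a given RGW version still emits the repeated key.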
-----
Additional Info:
The issue described above was observed in a Luminous environment. The same use case was tried against a test bucket on Nautilus RGWs, with the same results.
Sample bucket with 3 objects:
10GBFile
my500MBFile
a
-----
TEST1: aws cli - all 3 files are listed as expected
$ aws --endpoint-url=http://s3.staging.local/ s3 ls s3://test
2021-10-01 18:33:47 10485760000 10GBFile
2021-10-08 19:59:46 2 a
2021-10-08 20:07:48 536870912 my500MBFile
TEST2: curl with no headers - valid XML output which can be converted to JSON
$ curl http://s3.staging.local | xq | jq '.ListBucketResult.Contents[].Key'
"10GBFile"
"a"
"my500MBFile"
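For comparison, the XML path in TEST2 can also be consumed without xq/jq. The following is a rough Python sketch using the standard library; the sample document is a hand-written assumption in the usual S3 ListBucketResult shape (the actual XML returned by RGW is not shown in this ticket), and the namespace URI is assumed to be the standard S3 one.

```python
import xml.etree.ElementTree as ET

# Assumed namespace: the standard S3 one; adjust if the real response differs.
NS = {"s3": "http://s3.amazonaws.com/doc/2006-03-01/"}

# Hand-written sample in the ListBucketResult shape (an assumption, not captured output).
sample_xml = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>test</Name>
  <Contents><Key>10GBFile</Key><Size>10485760000</Size></Contents>
  <Contents><Key>a</Key><Size>2</Size></Contents>
  <Contents><Key>my500MBFile</Key><Size>536870912</Size></Contents>
</ListBucketResult>"""

root = ET.fromstring(sample_xml)

# Repeated <Contents> elements are normal in XML, so every object is found.
keys = [c.findtext("s3:Key", namespaces=NS) for c in root.findall("s3:Contents", NS)]
print(keys)
```

Repeated sibling elements are well-defined in XML, which is why this route yields all three objects while the JSON route does not.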
TEST3: curl asking JSON - invalid JSON format
$ curl -H "Accept: application/json" http://s3.staging.local
{"Name":"test","Prefix":"","MaxKeys":1000,"IsTruncated":"false","Contents":["10GBFile","2021-10-01T18:33:47.433Z","\"c008a29949a18ac24ff1e79a0fd8abf3\"",10485760000,"STANDARD",{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],"Contents":["a","2021-10-08T19:59:46.606Z","\"60b725f10c9c85c70d97880dfe8191b3\"",2,"STANDARD",{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],"Contents":["my500MBFile","2021-10-08T20:07:48.771Z","\"6d1954a8c7d6f09434c1ba4745a86869-64\"",536870912,"STANDARD",{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],"Marker":""}
The "Contents" key is repeated once per object, which standard JSON parsers do not handle as intended. In the verbose output, the response again carries "Content-Type: application/xml" although curl requested JSON in the Accept header. As a result, standard tools parsing the returned output see only 1 object (the last "Contents" entry) instead of all 3:
$ curl -H "Accept: application/json" http://s3.staging.local | jq '.Contents'
[
  "my500MBFile",
  "2021-10-08T20:07:48.771Z",
  "\"6d1954a8c7d6f09434c1ba4745a86869-64\"",
  536870912,
  "STANDARD",
  {
    "ID": "3091912",
    "DisplayName": "3091912-abc-staging-test"
  },
  "Normal"
]
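Until the output is fixed, a client could work around the repeated key by collecting every "Contents" value during parsing. This is only a sketch under assumptions: the helper names keep_all_contents and list_objects are hypothetical, and the field names in FIELDS are inferred from the positional order seen in the TEST3 output (the last one, "Type", is a guess for the trailing "Normal" value). The body is the TEST3 response from above.

```python
import json

# Assumed names for the positional values; inferred from the output, not from RGW docs.
FIELDS = ["Key", "LastModified", "ETag", "Size", "StorageClass", "Owner", "Type"]

def keep_all_contents(pairs):
    """Hypothetical hook: gather every "Contents" value instead of keeping the last."""
    obj = {}
    for key, value in pairs:
        if key == "Contents":
            obj.setdefault("Contents", []).append(value)
        else:
            obj[key] = value
    return obj

def list_objects(body):
    """Parse an RGW-style listing and return one dict per object."""
    listing = json.loads(body, object_pairs_hook=keep_all_contents)
    # zip stops at the shorter sequence, so entries without "Type" also work.
    return [dict(zip(FIELDS, entry)) for entry in listing.get("Contents", [])]

# TEST3 response body, split across lines for readability.
body = (
    r'{"Name":"test","Prefix":"","MaxKeys":1000,"IsTruncated":"false",'
    r'"Contents":["10GBFile","2021-10-01T18:33:47.433Z",'
    r'"\"c008a29949a18ac24ff1e79a0fd8abf3\"",10485760000,"STANDARD",'
    r'{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],'
    r'"Contents":["a","2021-10-08T19:59:46.606Z",'
    r'"\"60b725f10c9c85c70d97880dfe8191b3\"",2,"STANDARD",'
    r'{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],'
    r'"Contents":["my500MBFile","2021-10-08T20:07:48.771Z",'
    r'"\"6d1954a8c7d6f09434c1ba4745a86869-64\"",536870912,"STANDARD",'
    r'{"ID":"3091912","DisplayName":"3091912-abc-staging-test"},"Normal"],'
    r'"Marker":""}'
)

objects = list_objects(body)
for obj in objects:
    print(obj["Key"], obj["Size"])
```

This relies on the parser exposing duplicate keys through a hook, which not every JSON library offers, so it is a stopgap rather than a substitute for a single "Contents" list of objects with named fields.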