Bug #24821

rgw doesn't support delimiter longer than one symbol

Added by ryci us almost 3 years ago. Updated 3 months ago.

Status:
Resolved
Priority:
Normal
Assignee:
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
nautilus,mimic,luminous
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

If I list a bucket with a delimiter containing only one symbol, everything is OK:

> GET /50ff0d1e-1e53-4b48-a79e-4682a1be5da4?delimiter=a&prefix=hh HTTP/1.1
> Host: 127.0.0.1:7480
> User-Agent: curl/7.47.0
> Accept: */*
> Date: Sun, 08 Jul 2018 08:56:22 +0000
> Authorization: AWS 3624AHCIMD70GX8WAFXM:M9itfrTqLcvIjVEWga+WaGdDz6o=
> 
< HTTP/1.1 200 OK
< x-amz-request-id: tx000000000000000019553-005b41d1b6-dfaed0-default
< Content-Type: application/xml
< Content-Length: 287
< Date: Sun, 08 Jul 2018 08:56:22 GMT
< 
* Connection #0 to host 127.0.0.1 left intact
<?xml version="1.0" encoding="UTF-8"?><ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/"><Name>50ff0d1e-1e53-4b48-a79e-4682a1be5da4</Name><Prefix>hh</Prefix><Marker></Marker><MaxKeys>1000</MaxKeys><Delimiter>a</Delimiter><IsTruncated>false</IsTruncated></ListBucketResult>

If I list a bucket with a delimiter containing more than one symbol, I get 400 Bad Request:

> GET /50ff0d1e-1e53-4b48-a79e-4682a1be5da4?delimiter=aa&prefix=hh HTTP/1.1
> Host: 127.0.0.1:7480
> User-Agent: curl/7.47.0
> Accept: */*
> Date: Sun, 08 Jul 2018 08:56:30 +0000
> Authorization: AWS 3624AHCIMD70GX8WAFXM:U8dxsH6mgKOp1U5lpMjuSnWJ0d8=
> 
< HTTP/1.1 400 Bad Request
< Content-Length: 253
< x-amz-request-id: tx000000000000000001d58-005b41d1be-dfd6b8-default
< Accept-Ranges: bytes
< Content-Type: application/xml
< Date: Sun, 08 Jul 2018 08:56:30 GMT
< 
* Connection #0 to host 127.0.0.1 left intact
<?xml version="1.0" encoding="UTF-8"?><Error><Code>InvalidArgument</Code><BucketName>50ff0d1e-1e53-4b48-a79e-4682a1be5da4</BucketName><RequestId>tx000000000000000001d58-005b41d1be-dfd6b8-default</RequestId><HostId>dfd6b8-default-default</HostId></Error>


Related issues

Copied to rgw - Backport #38775: luminous: rgw doesn't support delimiter longer than one symbol Rejected
Copied to rgw - Backport #38776: mimic: rgw doesn't support delimiter longer than one symbol Resolved
Copied to rgw - Backport #38777: nautilus: rgw doesn't support delimiter longer than one symbol Resolved

History

#1 Updated by Chang Liu almost 3 years ago

`A delimiter is a character you use to group keys.`

FYI: https://docs.aws.amazon.com/AmazonS3/latest/API/v2-RESTBucketGET.html

#2 Updated by ryci us almost 3 years ago

Yes, I saw it, but look: the parameter is documented as Type: String, and Amazon S3 accepts a string of any length as the delimiter.
Some software uses delimiters longer than one symbol, which makes it incompatible with rgw's S3 API.
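
To make the expected behavior concrete, here is a minimal pure-Python sketch of S3-style delimiter grouping in which the delimiter is an arbitrary string rather than a single character. The function name `group_keys` and its return shape are hypothetical and exist only for illustration; this is not RGW or AWS code, and it ignores pagination.

```python
from typing import Iterable, List, Tuple


def group_keys(keys: Iterable[str], prefix: str = "",
               delimiter: str = "") -> Tuple[List[str], List[str]]:
    """Emulate an S3-style listing; return (contents, common_prefixes)."""
    contents: List[str] = []
    common_prefixes: List[str] = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        if delimiter:
            # Look for the delimiter only in the part after the prefix.
            idx = key.find(delimiter, len(prefix))
            if idx != -1:
                # Roll the key up to the end of the first delimiter match.
                rolled_up = key[: idx + len(delimiter)]
                # Keys sharing a prefix are adjacent in sorted order, so
                # comparing with the last entry is enough to deduplicate.
                if not common_prefixes or common_prefixes[-1] != rolled_up:
                    common_prefixes.append(rolled_up)
                continue
        contents.append(key)
    return contents, common_prefixes
```

With this model a one-symbol delimiter is just the special case `len(delimiter) == 1`; nothing in the grouping rule itself depends on the delimiter's length.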

#3 Updated by Matt Benjamin over 2 years ago

  • Status changed from New to 12
  • Assignee set to Matt Benjamin

I haven't automated a reproducer for the issue itself, but I have one for its proximate cause, which is related to behavior in RGWRados::Bucket::List::list_objects_ordered (as the method is known now that there is a distinction between ordered and unordered listing).

A 2014 change in the ancestor of this method introduced logic that attempts to conditionally advance the marker returned from CLS if the listing uses a delimiter and the marker contains an instance of the prefix. The logic that updates the marker uses decode_utf8 and encode_utf8 to obtain a string that sorts lexically "after" the delimiter, but incrementing the numeric value of the marker as written generally doesn't work as expected. In particular, decode_utf8 doesn't actually take arbitrary UTF-8 sequences in the manner the code expects. After discussion, I'm inclined to avoid changing the marker altogether, assuming it can be proved that the enumeration will terminate. IIUC (not a certainty), this is an optimization only (attempt to prove by construction below).
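
The claim that advancing the marker is only an optimization can be illustrated with a small simulation. The sketch below is not the RGW implementation: `successor` and `list_ordered` are hypothetical names, the "bucket index" is a sorted Python list, and the skip-ahead step operates on Python code points rather than UTF-8 bytes. It walks the keys once with and without the skip and counts how many entries each variant examines.

```python
from bisect import bisect_left
from typing import List, Tuple


def successor(prefix: str) -> str:
    """Smallest string sorting after every string that starts with `prefix`.

    Assumes the last character is not the maximum code point.
    """
    return prefix[:-1] + chr(ord(prefix[-1]) + 1)


def list_ordered(sorted_keys: List[str], delimiter: str,
                 skip_ahead: bool) -> Tuple[List[str], List[str], int]:
    """Return (contents, common_prefixes, entries_examined)."""
    contents: List[str] = []
    common_prefixes: List[str] = []
    examined = 0
    pos = 0
    while pos < len(sorted_keys):
        key = sorted_keys[pos]
        examined += 1
        idx = key.find(delimiter) if delimiter else -1
        if idx == -1:
            contents.append(key)
            pos += 1
            continue
        rolled_up = key[: idx + len(delimiter)]
        if not common_prefixes or common_prefixes[-1] != rolled_up:
            common_prefixes.append(rolled_up)
        if skip_ahead:
            # Optimization: jump past every key sharing this common prefix.
            # The current key sorts before the successor, so pos always grows.
            pos = bisect_left(sorted_keys, successor(rolled_up), pos)
        else:
            # No marker change: step to the next entry; the duplicate
            # common prefix is filtered by the check above.
            pos += 1
    return contents, common_prefixes, examined
```

In this model both variants terminate, because the position strictly increases on every iteration, and both return the same listing; the skip only reduces the number of entries examined.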

[mbenjamin@lemon python]$ ./boto3_delim.py
booger1_o/0
booger1_odelimstr0
booger1_o/1
booger1_odelimstr1
booger1_o/2
booger1_odelimstr2
booger1_o/3
booger1_odelimstr3
booger1_o/4
booger1_odelimstr4
booger1_o/5
booger1_odelimstr5
booger1_o/6
booger1_odelimstr6
booger1_o/7
booger1_odelimstr7
booger1_o/8
booger1_odelimstr8
booger1_o/9
booger1_odelimstr9
--Return--
> /home/mbenjamin/dev/rgw/s3_py/python/boto3_delim.py(48)<module>()->None
-> import pdb; pdb.set_trace()

(Pdb) res1['Contents']
[{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 49, 986000, tzinfo=tzutc()), u'ETag': '"fae5a6cd8860a054e995074ebbdfed96"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr0', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 10000, tzinfo=tzutc()), u'ETag': '"42ebe1ea120446769900301adc024fbd"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr1', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 33000, tzinfo=tzutc()), u'ETag': '"239b00602a834b95b969cff88b2f3a0b"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr2', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 57000, tzinfo=tzutc()), u'ETag': '"674c10103d7897b8cb432bbee25a337b"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr3', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 82000, tzinfo=tzutc()), u'ETag': '"928f354462bac7f8c54165d3ff54abf8"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr4', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 110000, tzinfo=tzutc()), u'ETag': '"f0337c3c6f8748bb22cd9082bcdcf141"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr5', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 139000, tzinfo=tzutc()), u'ETag': '"61887c860aa8b01d66c2b003c22ac8c1"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr6', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 164000, tzinfo=tzutc()), u'ETag': '"5bd2c5172b0fa130afcf462500c1c2a8"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr7', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 190000, tzinfo=tzutc()), u'ETag': '"04c6b5752e9ee1e398b26a77f07aface"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr8', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 214000, tzinfo=tzutc()), u'ETag': '"bba67116d7f4d851ec87b0ff4c78fad8"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr9', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}]
(Pdb) res2['Contents']
[{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 49, 972000, tzinfo=tzutc()), u'ETag': '"3f8b20cbb1f737642aac6bf8c60a7292"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/0', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 49, 998000, tzinfo=tzutc()), u'ETag': '"4ffaac0b975094d602e20d8bef0d5d21"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/1', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 21000, tzinfo=tzutc()), u'ETag': '"92ddaf02b0976ba3bcf468d8c8fa1058"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/2', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 47000, tzinfo=tzutc()), u'ETag': '"02b711bf2813232e913e9ee8e325dc84"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/3', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 69000, tzinfo=tzutc()), u'ETag': '"65ae618e5ebf4230ceb5a1920627bbfe"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/4', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 96000, tzinfo=tzutc()), u'ETag': '"33b93c887b04c3055d43f697b2198551"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/5', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 124000, tzinfo=tzutc()), u'ETag': '"d8513f670a3f9284ee351575c3be8a48"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/6', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 151000, tzinfo=tzutc()), u'ETag': '"02dbf66f6e9f56ac8d3d162cdf6dc97f"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/7', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 177000, tzinfo=tzutc()), u'ETag': '"63fa5214e65428f57591d1a964e6d91c"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/8', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}, {u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 202000, tzinfo=tzutc()), u'ETag': '"9b694e3829d1d99ee83adc61a0ecccb7"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/9', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}]
(Pdb) res2['Contents'][0]
{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 49, 972000, tzinfo=tzutc()), u'ETag': '"3f8b20cbb1f737642aac6bf8c60a7292"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/0', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}
(Pdb) res2['Contents'][1]
{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 49, 998000, tzinfo=tzutc()), u'ETag': '"4ffaac0b975094d602e20d8bef0d5d21"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/1', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}
(Pdb) res2['Contents'][2]
{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 21000, tzinfo=tzutc()), u'ETag': '"92ddaf02b0976ba3bcf468d8c8fa1058"', u'StorageClass': 'STANDARD', u'Key': u'booger1_o/2', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 20}
(Pdb) res1['Contents'][0]
{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 49, 986000, tzinfo=tzutc()), u'ETag': '"fae5a6cd8860a054e995074ebbdfed96"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr0', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}
(Pdb) res1['Contents'][1]
{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 10000, tzinfo=tzutc()), u'ETag': '"42ebe1ea120446769900301adc024fbd"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr1', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}
(Pdb) res1['Contents'][2]
{u'LastModified': datetime.datetime(2019, 1, 22, 14, 18, 50, 33000, tzinfo=tzutc()), u'ETag': '"239b00602a834b95b969cff88b2f3a0b"', u'StorageClass': 'STANDARD', u'Key': u'booger1_odelimstr2', u'Owner': {u'DisplayName': 'M. Tester', u'ID': 'testid'}, u'Size': 27}
(Pdb)
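
The source of boto3_delim.py is not included in this ticket. The following is a hypothetical reconstruction of what a script producing the transcript above might look like, not the author's actual code. It assumes a boto3 S3 client pointed at an RGW endpoint; the function name `make_and_list`, the object bodies, and the pairing of `res1` with the "/" delimiter and `res2` with the "delimstr" delimiter are assumptions, the last one inferred from which keys appear in each `Contents` list.

```python
def make_and_list(client, bucket, base="booger1_o", count=10):
    """Create keys using two separators, then list once per delimiter.

    `client` is expected to behave like a boto3 S3 client, e.g. (assumed
    endpoint and placeholder credentials):
        boto3.client("s3", endpoint_url="http://127.0.0.1:7480",
                     aws_access_key_id="...", aws_secret_access_key="...")
    """
    for i in range(count):
        for sep in ("/", "delimstr"):
            key = "%s%s%d" % (base, sep, i)
            # Object bodies are arbitrary placeholders.
            client.put_object(Bucket=bucket, Key=key,
                              Body=b"body of " + key.encode())
            print(key)
    # Keys containing the delimiter after the prefix are rolled up into
    # CommonPrefixes, so Contents holds only the keys using the other separator.
    res1 = client.list_objects(Bucket=bucket, Prefix=base, Delimiter="/")
    res2 = client.list_objects(Bucket=bucket, Prefix=base, Delimiter="delimstr")
    return res1, res2
```

Under that reading, `res1['Contents']` holding only the `delimstr` keys and `res2['Contents']` holding only the `/` keys is the listing one would expect from a server that handles the multi-character delimiter correctly.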

#4 Updated by Matt Benjamin about 2 years ago

  • Status changed from 12 to Fix Under Review

#5 Updated by Matt Benjamin about 2 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to [nautilus],mimic,luminous

#6 Updated by Nathan Cutler about 2 years ago

  • Backport changed from [nautilus],mimic,luminous to nautilus,mimic,luminous

#7 Updated by Nathan Cutler about 2 years ago

  • Copied to Backport #38775: luminous: rgw doesn't support delimiter longer than one symbol added

#8 Updated by Nathan Cutler about 2 years ago

  • Copied to Backport #38776: mimic: rgw doesn't support delimiter longer than one symbol added

#9 Updated by Nathan Cutler about 2 years ago

  • Copied to Backport #38777: nautilus: rgw doesn't support delimiter longer than one symbol added

#10 Updated by Nathan Cutler about 2 years ago

  • Pull request ID set to 26863

#11 Updated by Nathan Cutler 3 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
