Bug #51767

closed

missing CommonPrefixes with some shard count

Added by JS Landry almost 3 years ago. Updated over 2 years ago.

Status: Closed
Priority: Normal
Target version: -
% Done: 0%

Source:
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
ceph-ansible
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi, I have had this problem for several days and I can't find a solution.

I had a bucket with 11 shards and everything was OK. I resharded it to 23 and the customer quickly replied that "two directories are now empty".
Trying to fix the situation, I resharded this bucket many times, and the highest shard count I can use now is 10; otherwise 2 prefixes return an empty listing.

I tested the bucket with Postman, and what I found is that when adding a delimiter to the URL: https://s3.example.com/bucketname/?prefix=carto/DATA/site/Works/vq/&delimiter=/
I don't receive the same XML with 10 shards as with 23 shards. (XML files attached)

When the bucket has 10 shards, the GET returns XML that includes a "<CommonPrefixes>" tag for every "sub-directory" in /vq/,
but when the bucket has 23 shards, the same GET returns XML without those "<CommonPrefixes>" tags.

That's the reason why the directory looks empty, but how can the xml be different? Nothing has changed except the sharding.
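
For reference, here is a minimal sketch of how an S3-style listing with a prefix and a delimiter is expected to group keys into CommonPrefixes, whatever the shard count of the bucket index. It only illustrates the listing semantics and is not RGW's implementation; the function name and the sample keys are hypothetical.

```python
def list_with_delimiter(keys, prefix="", delimiter=""):
    """Group keys the way an S3 listing with prefix + delimiter is expected to.

    Returns (contents, common_prefixes). Illustration only, not RGW code.
    """
    contents, common_prefixes = [], []
    seen = set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        # Position of the first delimiter after the prefix, if any.
        cut = rest.find(delimiter) if delimiter else -1
        if cut == -1:
            # No delimiter after the prefix: the key is listed directly.
            contents.append(key)
        else:
            # Everything up to and including the delimiter is rolled up.
            rolled_up = prefix + rest[:cut + len(delimiter)]
            if rolled_up not in seen:
                seen.add(rolled_up)
                common_prefixes.append(rolled_up)
    return contents, common_prefixes


# Hypothetical keys, only shaped like the ones in this bucket.
SAMPLE_KEYS = [
    "carto/DATA/site/Works/vq/a/1.tif",
    "carto/DATA/site/Works/vq/a/2.tif",
    "carto/DATA/site/Works/vq/b/1.tif",
    "carto/DATA/site/Works/vq/readme.txt",
    "carto/other.txt",
]

contents, common_prefixes = list_with_delimiter(
    SAMPLE_KEYS, prefix="carto/DATA/site/Works/vq/", delimiter="/"
)
print(contents)
print(common_prefixes)
```

Since the result depends only on the set of keys, the same GET should return the same CommonPrefixes with 10 or 23 shards.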

When using a prefix-only URL: https://s3.example.com/bucketname/?prefix=carto/DATA/site/Works/vq/
with 10 or 23 shards, the returned XML is identical.
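
To compare two saved responses without reading the XML by eye, something like the following sketch could be used. It assumes the responses are standard ListBucketResult documents; the function name and the two inline sample documents are made up for illustration and are not the attached files.

```python
import xml.etree.ElementTree as ET


def _local(tag):
    # Drop the "{namespace}" part that ElementTree puts in front of tag names.
    return tag.rsplit("}", 1)[-1]


def summarize_listing(xml_text):
    """Return the object keys and common prefixes found in a ListBucketResult."""
    root = ET.fromstring(xml_text)
    keys, prefixes = [], []
    for child in root:
        name = _local(child.tag)
        if name == "Contents":
            keys += [s.text for s in child if _local(s.tag) == "Key"]
        elif name == "CommonPrefixes":
            prefixes += [s.text for s in child if _local(s.tag) == "Prefix"]
    return {"keys": keys, "common_prefixes": prefixes}


# Hypothetical responses, shaped like a prefix + delimiter listing.
SAMPLE_OK = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>bucketname</Name>
  <Prefix>carto/DATA/site/Works/vq/</Prefix>
  <Delimiter>/</Delimiter>
  <CommonPrefixes><Prefix>carto/DATA/site/Works/vq/a/</Prefix></CommonPrefixes>
  <CommonPrefixes><Prefix>carto/DATA/site/Works/vq/b/</Prefix></CommonPrefixes>
</ListBucketResult>"""

SAMPLE_EMPTY = """<ListBucketResult xmlns="http://s3.amazonaws.com/doc/2006-03-01/">
  <Name>bucketname</Name>
  <Prefix>carto/DATA/site/Works/vq/</Prefix>
  <Delimiter>/</Delimiter>
</ListBucketResult>"""

ok = summarize_listing(SAMPLE_OK)
empty = summarize_listing(SAMPLE_EMPTY)
# Prefixes present in the first response but missing from the second one.
missing = sorted(set(ok["common_prefixes"]) - set(empty["common_prefixes"]))
print(missing)
```

The two attached Postman outputs could be read from disk and passed to `summarize_listing` in the same way.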

We are running Octopus 15.2.13; the default shard number is 11, and some buckets have 23, 29, 41, or 60 shards.
No problems have been reported by users except in this one case. To my knowledge it's the only bucket with this problem.

Everything looks fine when testing with radosgw-admin.
With bi list, bucket list, bucket radoslist, or even rados listomapkeys, the lists are identical with 10 or 23 shards.

For this bucket:
10 shards (or fewer): everything is OK.
11, 12, or 13 shards: only "carto/DATA/site/Works/vq/" returns an empty listing.
17 shards (or more): both "carto/DATA/site/Works/vq/" AND "carto/DATA/site/Works/" return an empty listing.

The listing is empty, but when you know the object path, you can get the object without any errors.
I set debug_rgw to 5/5, but I can't find anything there. (partial log files attached)
I would greatly appreciate some help with this.
Cheers!


Files

rgw-shard0.log (6.66 KB): rgw/beast log when testing with postman, 0 shard. JS Landry, 07/21/2021 03:04 PM
rgw-shard23.log (9.28 KB): rgw/beast log when testing with postman, 23 shards. JS Landry, 07/21/2021 03:04 PM
postman-get-prefix-delimiter-shard23-pp.xml (651 Bytes): postman xml output for a get using prefix and delimiter on the 23 shards bucket. JS Landry, 07/21/2021 03:04 PM
postman-get-prefix-delimiter-shard10-pp.xml (1.97 KB): postman xml output for a get using prefix and delimiter on the 10 shards bucket. JS Landry, 07/21/2021 03:04 PM
ceph-rgw-ul-stk-pr-ccr01.rgw0.log.level20.13shards.anon.gz (8.71 KB): no CommonPrefixes for carto/DATA/ulaval/Rasters/vq/ only. JS Landry, 08/02/2021 03:45 PM
ceph-rgw-ul-stk-pr-ccr01.rgw0.log.level20.17shards.anon.gz (11.4 KB): no CommonPrefixes for carto/DATA/ulaval/Rasters/ and carto/DATA/ulaval/Rasters/vq/. JS Landry, 08/02/2021 03:45 PM
ceph-rgw-ul-stk-pr-ccr01.rgw0.log.level20.10shards.anon.gz (8.83 KB): everything is ok. JS Landry, 08/02/2021 03:45 PM
radosgw-list_object-and-cls_bucket_list-logfile.log.anon.gz (83.5 KB). JS Landry, 08/24/2021 01:52 PM