Bug #19432

rgw: folders starting with "_" underscore are not in bucket index

Added by Daniel Biazus 7 months ago. Updated about 2 months ago.

Status:
Resolved
Priority:
High
Target version:
-
Start date:
03/30/2017
Due date:
% Done:

0%

Source:
other
Tags:
Backport:
jewel,kraken
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
ceph-deploy
Release:
jewel
Needs Doc:
No

Description

Folders whose names start with an underscore ("_") are not in the bucket index, so the objects in them cannot be listed. I believe this issue is also related to http://tracker.ceph.com/issues/15562


Related issues

Copied to rgw - Backport #19563: jewel: rgw: folders starting with "_" underscore are not in bucket index Resolved
Copied to rgw - Backport #20565: kraken: rgw: folders starting with "_" underscore are not in bucket index Rejected

History

#1 Updated by Robin Johnson 7 months ago

I cannot reproduce this on 10.2.6 with s3cmd. Can you please provide your script or detailed instructions to reproduce it?

#2 Updated by Daniel Biazus 7 months ago

Hey Robin,

1) Sync file.jpg to a new directory "_testing0001b"

/usr/bin/s3cmd -c 0001b.conf sync file.jpg s3://bucket-10001b/_testing0001b/file.jpg
upload: 'file.jpg' -> 's3://bucket-10001b/_testing0001b/file.jpg' [1 of 1]
49619 of 49619 100% in 0s 398.92 kB/s done
Done. Uploaded 49619 bytes in 1.0 seconds, 48.46 kB/s.

2) /usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/
DIR s3://bucket-10001b/tes0001/
DIR s3://bucket-10001b/tes_0001/

As you can see, at this point the directory "_testing0001b" does not appear in the list.

3) /usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/_testing0001b/

2017-03-30 23:58 49619 s3://bucket-10001b/_testing0001b/file.jpg

But if we list the directory "_testing0001b" directly, we can see the file there.

Note: I was not able to reproduce this behaviour by creating a new bucket from scratch. It started to happen after syncing 500k objects to this bucket; after that, even after deleting all of them, we can't use a leading underscore in the folder name.
I thought it might be related to a corrupted index, but running "radosgw-admin bucket check -b bucket-10001b --fix" did not fix it.

We can see the file in the bucket index, so I don't have any other clues.

radosgw-admin bucket check -b bucket-10001b | grep '_testing0001b'
"__testing0001b\/file.jpg",

Version:
ceph -v
ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)
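
The doubled underscore in the index entry shown above ("__testing0001b/file.jpg") is consistent with RGW prepending one extra underscore to the index key of any plain object whose name already starts with one, because a leading underscore is otherwise used for namespaced entries. That reading is an assumption here, not something confirmed in this thread. A minimal sketch of the assumed mapping, where index_key is a hypothetical helper name and not a Ceph command:

```shell
# index_key NAME
# Print the bucket index key that a plain (non-namespaced) object NAME is
# ASSUMED to get: names with a leading "_" gain one extra "_" in front,
# all other names are used unchanged.
index_key() {
    case "$1" in
        _*) printf '_%s\n' "$1" ;;
        *)  printf '%s\n' "$1" ;;
    esac
}
```

Under that assumption, `index_key _testing0001b/file.jpg` prints the key seen in the bucket check output, which is why the grep for '_testing0001b' still matches.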

Thanks!

#3 Updated by Daniel Biazus 7 months ago

Another interesting piece of information: if I use two underscores ("__") in the name, it works as expected:

/usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/
DIR s3://bucket-10001b/__testing0002b/
DIR s3://bucket-10001b/__testing0003b/
DIR s3://bucket-10001b/tes0001/
DIR s3://bucket-10001b/tes_0001/
Regards,

#4 Updated by Nathan Cutler 7 months ago

  • Project changed from Ceph to rgw
  • Category deleted (radosgw)
  • Target version deleted (v10.2.7)
  • Backport set to jewel

#5 Updated by Daniel Biazus 7 months ago

Guys, I was able to systematically reproduce this issue:

1) Create dummy file:

dd if=/dev/urandom of=4k.bin bs=4K count=1

2) Create a new bucket:

s3cmd mb s3://bucket-1000/

3) Upload 1001 files to the following directory "_medias/01"

for i in `seq 1 1001`; do s3cmd put 4k.bin s3://bucket-1000/_medias/01/4k-$i.bin ; done

4) List the objects in that directory. At this point we can see that something is wrong, because only 1000 files are listed:

s3cmd ls s3://bucket-1000/_medias/01/ | wc -l
1000

5) Upload 1001 files to the following directory "_medias/02"

for i in `seq 1 1001`; do s3cmd put 4k.bin s3://bucket-1000/_medias/02/4k-$i.bin ; done

6) At this point, directory "02" no longer shows up when we list "_medias/":

s3cmd ls s3://bucket-1000/_medias/

DIR s3://bucket-1000/_medias/01/

7) Although we uploaded 1001 files to directory "02", we can list only 1000:

s3cmd ls s3://bucket-1000/_medias/02/ | wc -l
1000

Note: This behaviour only occurs in directories whose names start with _ (underscore).
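
The steps above can be wrapped in two small shell functions so the same upload-and-count check can be run against an underscore prefix and a control prefix. This is only a sketch: S3CMD, upload_n and count_listed are illustrative names that do not come from s3cmd, and the functions assume s3cmd is already configured for the target cluster.

```shell
# Command used to talk to the gateway; override S3CMD to add options
# or to point at a wrapper script.
S3CMD="${S3CMD:-s3cmd}"

# upload_n BUCKET PREFIX COUNT FILE
# Put FILE under s3://BUCKET/PREFIX/ COUNT times as 4k-1.bin .. 4k-COUNT.bin.
upload_n() {
    bucket=$1; prefix=$2; count=$3; file=$4
    i=1
    while [ "$i" -le "$count" ]; do
        "$S3CMD" put "$file" "s3://$bucket/$prefix/4k-$i.bin" >/dev/null || return 1
        i=$((i + 1))
    done
}

# count_listed BUCKET PREFIX
# Print how many lines `s3cmd ls` returns for s3://BUCKET/PREFIX/.
count_listed() {
    "$S3CMD" ls "s3://$1/$2/" | wc -l | tr -d ' '
}
```

For example, `upload_n bucket-1000 _medias/01 1001 4k.bin` followed by `count_listed bucket-1000 _medias/01` corresponds to steps 3 and 4. Running the same pair against a prefix without the leading underscore gives a control to compare the count against.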

Please let me know if you need anything else to reproduce this.

Best Regards,
Daniel

#7 Updated by Orit Wasserman 7 months ago

I cannot reproduce it on master; it may be a jewel-only issue (we had a major cleanup in master).
Can you retry on the latest jewel (10.2.6) and create a jewel-specific fix?

#8 Updated by Giovani Rinaldi 7 months ago

No problem, thank you for the report. Here it is, the PR for jewel:
PR: https://github.com/ceph/ceph/pull/14368

#9 Updated by Daniel Biazus 7 months ago

I've just reproduced this also in Kraken 11.2.0.

Regards,

#10 Updated by Nathan Cutler 7 months ago

  • Backport changed from jewel to jewel,kraken

#11 Updated by Nathan Cutler 7 months ago

  • Copied to Backport #19563: jewel: rgw: folders starting with "_" underscore are not in bucket index added

#12 Updated by Yehuda Sadeh 6 months ago

  • Assignee set to Orit Wasserman

#13 Updated by Orit Wasserman 5 months ago

Sorry for the delay.

I have reproduced it on master with Daniel Biazus's instructions.
I could not reproduce it with the proposed fix applied.
Sadly, the fix caused a failure in s3tests that needs to be addressed before it can be merged.
The fix should go into master, and then we need to backport it to Jewel and Kraken.

#15 Updated by Nathan Cutler 4 months ago

  • Status changed from New to Need Review

#16 Updated by Yehuda Sadeh 4 months ago

  • Subject changed from folders starting with "_" underscore are not in bucket index to rgw: folders starting with "_" underscore are not in bucket index

#18 Updated by Nathan Cutler 4 months ago

  • Status changed from Need Review to Pending Backport

#19 Updated by Nathan Cutler 4 months ago

  • Copied to Backport #20565: kraken: rgw: folders starting with "_" underscore are not in bucket index added

#20 Updated by Nathan Cutler about 2 months ago

  • Status changed from Pending Backport to Resolved
