rgw: folders starting with "_" underscore are not in bucket index
Folders whose names start with an underscore (_) are not in the bucket index, so the objects in them can't be listed. I believe this issue is also related to http://tracker.ceph.com/issues/15562
#2 Updated by Daniel Biazus almost 2 years ago
1) Sync file.jpg to a new directory "_testing0001b"
/usr/bin/s3cmd -c 0001b.conf sync file.jpg s3://bucket-10001b/_testing0001b/file.jpg
upload: 'file.jpg' -> 's3://bucket-10001b/_testing0001b/file.jpg' [1 of 1]
49619 of 49619 100% in 0s 398.92 kB/s done
Done. Uploaded 49619 bytes in 1.0 seconds, 48.46 kB/s.
2) /usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/
As you can see, the directory "_testing0001b" does not appear in the listing at this point.
3) /usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/_testing0001b/
2017-03-30 23:58 49619 s3://bucket-10001b/_testing0001b/file.jpg
But if we list the directory "_testing0001b" directly, we can see the file there.
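The comparison made in steps 2) and 3) can be scripted. This is only a minimal sketch and not part of the original report: the function names and the S3CMD variable are hypothetical, it uses s3cmd's default configuration file, and it assumes "s3cmd ls" prints the full s3:// URL on each line, as in the output above.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original report).
# S3CMD is an assumed override for the s3cmd binary name.
S3CMD="${S3CMD:-s3cmd}"

# Succeeds if "PREFIX/" shows up in the top-level listing of BUCKET.
is_prefix_listed() {
    "$S3CMD" ls "s3://$1/" | grep -qF "s3://$1/$2/"
}

# Succeeds if listing BUCKET/PREFIX/ directly returns at least one line.
has_objects() {
    [ -n "$("$S3CMD" ls "s3://$1/$2/")" ]
}

# Prints EMPTY, LISTED or HIDDEN for the given BUCKET and PREFIX.
# HIDDEN means: objects exist under the prefix, but the prefix is
# missing from the top-level listing (the behaviour described above).
report_prefix() {
    if ! has_objects "$1" "$2"; then
        echo "EMPTY $2"
    elif is_prefix_listed "$1" "$2"; then
        echo "LISTED $2"
    else
        echo "HIDDEN $2"
    fi
}
```

For example, "report_prefix bucket-10001b _testing0001b" should print HIDDEN on a bucket that shows the behaviour from the steps above.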
Note: I was not able to reproduce this behaviour with a new bucket created from scratch. It started after syncing 500k objects to this bucket; since then, even after deleting all of them, we can't use an underscore in the folder name.
I thought it might be related to a corrupted index, but running "radosgw-admin bucket check -b bucket-10001b --fix" did not fix it.
We can see the file in the bucket index, so I don't have any other clues.
radosgw-admin bucket check -b bucket-10001b | grep '_testing0001b'
ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)
#3 Updated by Daniel Biazus almost 2 years ago
Another interesting detail: if I use two underscores ("__") in the name, it works as expected:
/usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/
#5 Updated by Daniel Biazus almost 2 years ago
Guys, I was able to systematically reproduce this issue:
1) Create dummy file:
dd if=/dev/urandom of=4k.bin bs=4K count=1
2) Create a new bucket:
s3cmd mb s3://bucket-1000/
3) Upload 1001 files to the following directory "_medias/01"
for i in `seq 1 1001`; do s3cmd put 4k.bin s3://bucket-1000/_medias/01/4k-$i.bin ; done
4) List the objects in that directory. At this point we can see that something is wrong, because only 1000 files are listed:
s3cmd ls s3://bucket-1000/_medias/01/ | wc -l
5) Upload 1001 files to the following directory "_medias/02"
for i in `seq 1 1001`; do s3cmd put 4k.bin s3://bucket-1000/_medias/02/4k-$i.bin ; done
6) At this point, we are no longer able to list directory "02":
s3cmd ls s3://bucket-1000/_medias/
7) Although we uploaded 1001 files to directory "02", we can list only 1000:
s3cmd ls s3://bucket-1000/_medias/02/ | wc -l
Note: This behaviour only happens in directories whose names start with _ (underscore).
Please let me know if you need anything else to reproduce this.
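The upload and count steps above can be wrapped in small helpers so the listed count is compared with the expected one automatically. This is a rough sketch under assumptions, not something taken from the report: upload_many, count_listed, check_prefix and the S3CMD/BUCKET variables are hypothetical names, and it relies only on the "s3cmd put" and "s3cmd ls" invocations already used in the steps.

```shell
#!/bin/sh
# Hypothetical helpers (not from the original report).
# S3CMD and BUCKET are assumed overrides; defaults follow the steps above.
S3CMD="${S3CMD:-s3cmd}"
BUCKET="${BUCKET:-bucket-1000}"

# upload_many PREFIX COUNT FILE: upload FILE COUNT times under PREFIX.
upload_many() {
    i=1
    while [ "$i" -le "$2" ]; do
        "$S3CMD" put "$3" "s3://$BUCKET/$1/4k-$i.bin" || return 1
        i=$((i + 1))
    done
}

# count_listed PREFIX: print how many lines "s3cmd ls" returns for PREFIX.
count_listed() {
    "$S3CMD" ls "s3://$BUCKET/$1/" | wc -l | tr -d ' '
}

# check_prefix PREFIX EXPECTED: report whether the listed count matches.
check_prefix() {
    listed=$(count_listed "$1")
    if [ "$listed" -eq "$2" ]; then
        echo "OK $1: $listed"
    else
        echo "MISMATCH $1: listed $listed, expected $2"
        return 1
    fi
}
```

With these, steps 3) and 4) become "upload_many _medias/01 1001 4k.bin" followed by "check_prefix _medias/01 1001"; on an affected cluster the second command should report a mismatch if it behaves as described above.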
#13 Updated by Orit Wasserman over 1 year ago
Sorry for the delay.
I have reproduced it on master using Daniel Biazus's instructions.
I could not reproduce it with the proposed fix applied.
Sadly the fix caused a failure in s3tests that needs to be addressed before it can be merged.
The fix should go into master, and then we need to backport it to Jewel and Kraken.