Bug #19432
Closed
rgw: folders starting with "_" underscore are not in bucket index
Description
Folders whose names start with an underscore (_) are not in the bucket index, so the objects in them can't be listed. I believe this issue is also related to http://tracker.ceph.com/issues/15562
Updated by Robin Johnson about 7 years ago
I cannot reproduce this on 10.2.6 with s3cmd. Can you please provide your script or detailed instructions to reproduce it?
Updated by Daniel Biazus about 7 years ago
Hey Robin,
1) Sync file.jpg to a new directory "_testing0001b"
/usr/bin/s3cmd -c 0001b.conf sync file.jpg s3://bucket-10001b/_testing0001b/file.jpg
upload: 'file.jpg' -> 's3://bucket-10001b/_testing0001b/file.jpg' [1 of 1]
49619 of 49619 100% in 0s 398.92 kB/s done
Done. Uploaded 49619 bytes in 1.0 seconds, 48.46 kB/s.
2) /usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/
DIR s3://bucket-10001b/tes0001/
DIR s3://bucket-10001b/tes_0001/
As you can see, at this point the directory "_testing0001b" does not appear in the list.
3) /usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/_testing0001b/
2017-03-30 23:58 49619 s3://bucket-10001b/_testing0001b/file.jpg
But if we list the directory "_testing0001b" directly, we can see the file there.
Note: I was not able to reproduce this behaviour when creating a new bucket from scratch. It started to happen after syncing 500k objects to this bucket; after that, even after deleting all of them, we can't use a leading underscore in a folder name.
I thought it might be related to a corrupted index, but even running "radosgw-admin bucket check -b bucket-10001b --fix" did not fix it.
We can see the file in the bucket index, so I don't have any other clues.
radosgw-admin bucket check -b bucket-10001b | grep '_testing0001b'
"__testing0001b\/file.jpg",
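The entry above shows the object under "__testing0001b" rather than "_testing0001b". A minimal sketch of one possible explanation, ASSUMING the bucket index escapes a leading underscore by prefixing one more underscore (the thread does not confirm this; index_key is a hypothetical helper, not part of radosgw-admin):

```shell
# Hypothetical helper: map an object name to the key we would expect to see
# in the bucket index, ASSUMING a leading "_" is escaped as "__".
index_key() {
    case "$1" in
        _*) printf '_%s\n' "$1" ;;   # leading underscore: prefix one more
        *)  printf '%s\n' "$1" ;;    # anything else: unchanged
    esac
}
```

Under that assumption, index_key _testing0001b/file.jpg prints __testing0001b/file.jpg, which would match the entry found by the grep above (the "\/" there is JSON escaping of "/").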
Version:
ceph -v
ceph version 10.2.6 (656b5b63ed7c43bd014bcafd81b001959d5f089f)
Thanks!
Updated by Daniel Biazus about 7 years ago
Another interesting detail: if I use two underscores ("__") in the name, it works as expected:
/usr/bin/s3cmd -c 0001b.conf ls s3://bucket-10001b/
DIR s3://bucket-10001b/__testing0002b/
DIR s3://bucket-10001b/__testing0003b/
DIR s3://bucket-10001b/tes0001/
DIR s3://bucket-10001b/tes_0001/
Regards,
Updated by Nathan Cutler about 7 years ago
- Project changed from Ceph to rgw
- Category deleted (22)
- Target version deleted (v10.2.7)
- Backport set to jewel
Updated by Daniel Biazus about 7 years ago
Guys, I was able to systematically reproduce this issue:
1) Create dummy file:
dd if=/dev/urandom of=4k.bin bs=4K count=1
2) Create a new bucket:
s3cmd mb s3://bucket-1000/
3) Upload 1001 files to the directory "_medias/01":
for i in `seq 1 1001`; do s3cmd put 4k.bin s3://bucket-1000/_medias/01/4k-$i.bin ; done
4) List the objects in that directory. At this point we can see that something is wrong, because only 1000 files are listed:
s3cmd ls s3://bucket-1000/_medias/01/ | wc -l
1000
5) Upload 1001 files to the directory "_medias/02":
for i in `seq 1 1001`; do s3cmd put 4k.bin s3://bucket-1000/_medias/02/4k-$i.bin ; done
6) At this point, directory "02" no longer appears when listing "_medias/":
s3cmd ls s3://bucket-1000/_medias/
DIR s3://bucket-1000/_medias/01/
7) Despite the fact that we uploaded 1001 files to directory "02", only 1000 are listed:
s3cmd ls s3://bucket-1000/_medias/02/ | wc -l
1000
Note: this behaviour only occurs in directories whose names start with _ (underscore).
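The steps above can be condensed into a small script. This is only a sketch: the function names (object_uri, report_count, upload_dir, check_dir, reproduce) and the BUCKET and COUNT variables are illustrative, and it assumes s3cmd is already configured against a test cluster. Nothing runs until reproduce is called.

```shell
#!/bin/sh
# Illustrative defaults taken from the steps above; override via environment.
BUCKET="${BUCKET:-bucket-1000}"
COUNT="${COUNT:-1001}"

# Build the S3 URI of the i-th object under a directory.
object_uri() {
    printf 's3://%s/%s/4k-%s.bin\n' "$BUCKET" "$1" "$2"
}

# Compare expected ($1) and listed ($2) object counts.
report_count() {
    if [ "$2" -eq "$1" ]; then
        echo "OK"
    else
        echo "MISSING $(( $1 - $2 ))"
    fi
}

# Upload COUNT copies of 4k.bin under the given directory.
upload_dir() {
    i=1
    while [ "$i" -le "$COUNT" ]; do
        s3cmd put 4k.bin "$(object_uri "$1" "$i")"
        i=$((i + 1))
    done
}

# List a directory and report whether every uploaded object shows up.
check_dir() {
    listed=$(s3cmd ls "s3://$BUCKET/$1/" | wc -l | tr -d ' ')
    report_count "$COUNT" "$listed"
}

# Full sequence: dummy file, bucket, two directories, listings.
reproduce() {
    dd if=/dev/urandom of=4k.bin bs=4K count=1   # GNU dd syntax, as in step 1
    s3cmd mb "s3://$BUCKET/"
    upload_dir _medias/01
    check_dir _medias/01
    upload_dir _medias/02
    s3cmd ls "s3://$BUCKET/_medias/"             # step 6: is 02/ shown?
    check_dir _medias/02
}
```

Calling reproduce on an affected version should report a shortfall for each directory if the listing stops at 1000 entries as described above.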
Please let me know if you need anything else to reproduce this.
Best Regards,
Daniel
Updated by Giovani Rinaldi about 7 years ago
Updated by Orit Wasserman about 7 years ago
I cannot reproduce it in master; it may be a jewel-only issue (we had a major cleanup in master).
Can you retry on the latest jewel (10.2.6) and create a jewel-specific fix?
Updated by Giovani Rinaldi about 7 years ago
No problem, thank you for the report. Here it is, the PR for jewel:
PR: https://github.com/ceph/ceph/pull/14368
Updated by Daniel Biazus about 7 years ago
I've just reproduced this also in Kraken 11.2.0.
Regards,
Updated by Nathan Cutler about 7 years ago
- Backport changed from jewel to jewel,kraken
Updated by Nathan Cutler about 7 years ago
- Copied to Backport #19563: jewel: rgw: folders starting with "_" underscore are not in bucket index added
Updated by Orit Wasserman almost 7 years ago
Sorry for the delay.
I have reproduced it on master with Daniel Biazus's instructions.
I could not reproduce it with the proposed fix applied.
Sadly, the fix caused a failure in s3tests that needs to be addressed before it can be merged.
The fix should go into master, and then we need to backport it to Jewel and Kraken.
Updated by Nathan Cutler almost 7 years ago
Updated by Nathan Cutler almost 7 years ago
- Status changed from New to Fix Under Review
Updated by Yehuda Sadeh almost 7 years ago
- Subject changed from folders starting with "_" underscore are not in bucket index to rgw: folders starting with "_" underscore are not in bucket index
Updated by Yuri Weinstein almost 7 years ago
https://github.com/ceph/ceph/pull/15916 merged 7/7/17
Updated by Nathan Cutler almost 7 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler almost 7 years ago
- Copied to Backport #20565: kraken: rgw: folders starting with "_" underscore are not in bucket index added
Updated by Nathan Cutler over 6 years ago
- Status changed from Pending Backport to Resolved