Bug #55288

closed

rgw/dbstore: handle prefix/delim in Bucket::list operation

Added by Giuseppe Baccini about 2 years ago. Updated about 2 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version:
% Done: 0%
Source:
Tags: dbstore
Backport:
Regression: No
Severity: 2 - major
Reviewed: 04/12/2022
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

A GET (list) operation on a bucket in the dbstore implementation returns the wrong set of items.
For example, if you load the same data into two different backends, one using rados and the other using dbstore, you get two different outputs for the same GET:

RADOS

s3cmd ls s3://test/inner/

2022-04-08 15:17            0  s3://test/inner/myf3
2022-04-08 15:17            0  s3://test/inner/myf4
2022-04-08 15:17            0  s3://test/inner/myf5
2022-04-08 15:17           25  s3://test/inner/myfile

DBSTORE

s3cmd ls s3://test/inner/

2022-04-08 15:16            0  s3://test/inner/myf3
2022-04-08 15:16            0  s3://test/inner/myf4
2022-04-08 15:16            0  s3://test/inner/myf5
2022-04-08 15:16           25  s3://test/inner/myfile
2022-04-08 15:16            0  s3://test/myf6
2022-04-08 15:16            0  s3://test/myf7

As you can see, dbstore wrongly returns the parent's objects as well, i.e. those directly inside the parent directory.
It seems that GET on rados has a concept of directory and can present objects as a DIR; this does not happen with dbstore:

RADOS

s3cmd ls s3://test

                          DIR  s3://test/inner/
2022-04-08 15:17            0  s3://test/myf6
2022-04-08 15:17            0  s3://test/myf7

DBSTORE

s3cmd ls s3://test

2022-04-08 15:16            0  s3://test/inner/myf3
2022-04-08 15:16            0  s3://test/inner/myf4
2022-04-08 15:16            0  s3://test/inner/myf5
2022-04-08 15:16           25  s3://test/inner/myfile
2022-04-08 15:16            0  s3://test/myf6
2022-04-08 15:16            0  s3://test/myf7

This leads to notable and dangerous behavior on delete operations, which, when recursive, rely on the set computed by a GET.
So, if you issue the following command:

s3cmd rm --recursive --force s3://test/inner

on dbstore, the entire bucket's content is deleted.
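
To illustrate why a faulty listing turns into data loss, here is a minimal sketch of a recursive delete that trusts whatever the listing returns. It is not s3cmd's or RGW's actual code: the names `recursive_delete`, `list_with_prefix` and `list_ignoring_prefix` are hypothetical, and the bucket is modelled as a plain Python set of keys taken from the listings above.

```python
# Keys taken from the listings in this report (bucket "test").
KEYS = ["inner/myf3", "inner/myf4", "inner/myf5", "inner/myfile", "myf6", "myf7"]


def list_with_prefix(bucket, prefix):
    """Hypothetical correct listing: only keys that start with the prefix."""
    return sorted(k for k in bucket if k.startswith(prefix))


def list_ignoring_prefix(bucket, prefix):
    """Hypothetical faulty listing: the prefix is ignored, as reported for dbstore."""
    return sorted(bucket)


def recursive_delete(bucket, prefix, lister):
    """Delete every key the lister returns for the prefix; return the deleted keys."""
    doomed = lister(bucket, prefix)
    for key in doomed:
        bucket.discard(key)
    return doomed


good = set(KEYS)
recursive_delete(good, "inner", list_with_prefix)
print(sorted(good))  # only the keys outside "inner" survive

bad = set(KEYS)
recursive_delete(bad, "inner", list_ignoring_prefix)
print(sorted(bad))   # nothing survives
```

The delete loop is identical in both runs; only the listing differs, which is why the fix belongs in the listing path rather than in the delete path.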

I'm working on a patch that prevents the most dangerous effects but does not cover every aspect.
Solving this entirely will probably involve some design choices.

Actions #1

Updated by Casey Bodley about 2 years ago

  • Assignee set to Soumya Koduri
  • Tags set to dbstore
Actions #2

Updated by Matt Benjamin about 2 years ago

Seems a bit overdone. It sounds like this is an issue with prefix/delimiter handling? That's all S3 knows about "directories."

Matt

Actions #3

Updated by Soumya Koduri about 2 years ago

  • Subject changed from rgw/dbstore: GET operation on bucket is faulty (dbstore be only) to rgw/dbstore: handle prefix/delim in Buucket::list operation

Matt Benjamin wrote:

Seems a bit overdone. It sounds like this is an issue with prefix/delimiter handling? That's all S3 knows about "directories."

Matt

Yes. DBStore is still a WIP project and not yet fully S3-compliant. Filtering of bucket listing output based on prefix/delim is not yet handled.
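
For reference, the prefix/delimiter filtering discussed here can be sketched in a few lines. This is only an illustration of the general S3 listing semantics, not the DBStore or RADOS implementation; the function name `list_objects` is hypothetical, and it assumes the client (as s3cmd does for a non-recursive ls) sends "/" as the delimiter.

```python
def list_objects(keys, prefix="", delimiter=""):
    """Return (contents, common_prefixes) for an S3-style bucket listing.

    Keys that do not start with `prefix` are skipped. If `delimiter` is set,
    any key that contains it after the prefix is rolled up into a single
    common prefix (what s3cmd displays as DIR) instead of being listed.
    """
    contents = []
    common_prefixes = []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        if delimiter:
            rest = key[len(prefix):]
            idx = rest.find(delimiter)
            if idx >= 0:
                rolled = prefix + rest[: idx + len(delimiter)]
                if rolled not in common_prefixes:
                    common_prefixes.append(rolled)
                continue
        contents.append(key)
    return contents, common_prefixes


# Keys taken from the listings in this report (bucket "test").
KEYS = ["inner/myf3", "inner/myf4", "inner/myf5", "inner/myfile", "myf6", "myf7"]

# Roughly "s3cmd ls s3://test": myf6 and myf7 as objects, inner/ as a DIR.
print(list_objects(KEYS, prefix="", delimiter="/"))

# Roughly "s3cmd ls s3://test/inner/": only the four objects under inner/.
print(list_objects(KEYS, prefix="inner/", delimiter="/"))
```

With neither prefix nor delimiter applied, every key in the bucket comes back, which matches the DBSTORE output shown in the description.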

Actions #4

Updated by Soumya Koduri about 2 years ago

  • Pull request ID set to 45909

https://github.com/ceph/ceph/pull/45909 should fix this issue. Please check.

Will handle delim as well and update the PR.

Actions #5

Updated by Soumya Koduri about 2 years ago

  • Subject changed from rgw/dbstore: handle prefix/delim in Buucket::list operation to rgw/dbstore: handle prefix/delim in Bucket::list operation
Actions #6

Updated by Giuseppe Baccini about 2 years ago

Hi Soumya, thank you very much for your patch.
I tried it, and it seems to solve the most annoying issues (it fixes the crash, and delete --recursive now works as intended).
Unfortunately, this patch (like the one I tried to write myself a few days ago) suffers from a (minor?) problem:
If you ls a "directory", you also match all the items in its subdirectories, e.g.:

s3cmd --no-ssl -c s3cmd.cfg ls s3://test

2022-04-15 10:00            0  s3://test/inner/inner_2/subfile
2022-04-15 10:00            0  s3://test/inner/myf3
2022-04-15 10:00            0  s3://test/inner/myf4
2022-04-15 10:00            0  s3://test/inner/myf5
2022-04-15 10:00           25  s3://test/inner/myfile
2022-04-15 10:00            0  s3://test/myf6
2022-04-15 10:00       479544  s3://test/myf7

This does not happen with the rados backend (it shows inner as a DIR).
Do you think we can do something to narrow the gap in behavior between the two backend implementations?

Actions #7

Updated by Soumya Koduri about 2 years ago

Giuseppe Baccini wrote:

Hi Soumya, thank you very much for your patch.
I try it and it seems to me that it solves the most annoying issues (fix crash and delete --recursive is now working as intended).
Unfortunately, this patch (as the one that I try to do by myself some day ago), suffers of a (minor?) problem:
If you ls a "directory" you will match also all its subdirectory items, eg:

[...]

This is not happening with rados be (it will show inner as a DIR).
Do you think can we do something to narrow the behavior for the two be implementations?

Handling 'delim' seems to be taking care of this. I have updated the patch in the PR. Please test it once.

Actions #8

Updated by Giuseppe Baccini about 2 years ago

Changes tested, for me everything works as expected.

Actions #9

Updated by Soumya Koduri about 2 years ago

  • Status changed from New to Resolved

Giuseppe Baccini wrote:

Changes tested, for me everything works as expected.

Thanks for confirming.
