Project

General

Profile

Actions

Bug #51466

closed

rgw: cls_bucket_list_unordered() might return repeating or partial results when enable index shard

Added by 鹏 张 almost 3 years ago. Updated over 1 year ago.

Status:
Resolved
Priority:
Normal
Target version:
-
% Done:

100%

Source:
Tags:
unordered-listing
Backport:
pacific, octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

cls_bucket_list_unordered() might return wrong results.

Steps to Reproduce:
1. Create a bucket and fill it with abort multipart whose quantity is greater than 1100.
2. Issue an unordered listing of the bucket by restful API with allow-unordered=true.

Though we had investigated this problem before (related bug is at https://tracker.ceph.com/issues/38486) but we are still having problems with the function cls_bucket_list_unordered(), cause cls_bucket_list_unordered() might return repeating results when number of abort multipart in index shard exceed num_entries, whose value is usually 1100. This is because when count < num_entries, last_added_entries is not updated. This is really easy to reproduce. Although this is not a big problem but it still turns out that the code is not working as expected.
And cls_bucket_list_unordered might return wrong (partial or repeating) results when object names adorned with upload_id, which will not be seprated by MultipartMetaFilter.filter(), and we might get wrong current_shard, which results in partial or repeating results.Here is a simple case, passing "test.txt1624962343.016714.2~eTgNuuh0R7-LG95cBxKpB-jtDTJ5JUP.1" and "test.txt1624962343.016714" to the rgw_bucket_shard_index function will produce different results, as earlierly discussed at https://github.com/ceph/ceph/pull/26658.To be simple, the filter should be able to return the unadorned object name when passing a marker like [namespace]oid[upload_id][meta_suffix|part_num]


Related issues 2 (0 open2 closed)

Copied to rgw - Backport #53036: pacific: rgw: cls_bucket_list_unordered() might return repeating or partial results when enable index shardResolvedCory SnyderActions
Copied to rgw - Backport #53037: octopus: rgw: cls_bucket_list_unordered() might return repeating or partial results when enable index shardResolvedCory SnyderActions
Actions #1

Updated by 鹏 张 almost 3 years ago

sorry, the condition when last_added_entries is not updated should be count >= num_entries

    if (count < num_entries) {
      marker = last_added_entry = dirent.key; // double assign
      ent_list.emplace_back(std::move(dirent));
      ++count;
    } else {
      *is_truncated = true;
      goto check_updates;
    }
Actions #2

Updated by 鹏 张 almost 3 years ago

when object name adorned with upload_id, we might get wrong current_shard

    // test whether object name is a multipart meta name
    if(! multipart_meta_filter.filter(start.name, key)) { <=== do not stripe upload_id when multipart name doesn't have meta suffix 
      // if multipart_meta_filter fails, must be "regular" (i.e.,
      // unadorned) and the name is the key
      key = start.name;
    }

    // now convert the key (oid) to an rgw_obj_key since that will
    // separate out the namespace, name, and instance
    rgw_obj_key obj_key;
    bool parsed = rgw_obj_key::parse_raw_oid(key, &obj_key); <=== do not stripe upload_id
    if (!parsed) {
      ldout(cct, 0) <<
    "ERROR: RGWRados::cls_bucket_list_unordered received an invalid " 
    "start marker: '" << start << "'" << dendl;
      return -EINVAL;
    } else if (obj_key.name.empty()) {
      // if the name is empty that means the object name came in with
      // a namespace only, and therefore we need to start our scan at
      // the first bucket index shard
      current_shard = 0u;
    } else {
      // so now we have the key used to compute the bucket index shard
      // and can extract the specific shard from it
      current_shard = rgw_bucket_shard_index(obj_key.name, num_shards); <=== here we might get wrong current_shard
    }

Actions #3

Updated by 鹏 张 almost 3 years ago

fixes : http://tracker.ceph.com/issues/49206

when object name adorned with upload_id, we might get wrong current_shard
[...]

Actions #4

Updated by Casey Bodley almost 3 years ago

  • Assignee set to J. Eric Ivancich
  • Tags set to unordered-listing
Actions #6

Updated by Casey Bodley almost 3 years ago

  • Status changed from New to Fix Under Review
Actions #7

Updated by Casey Bodley over 2 years ago

  • Pull request ID set to 42151
Actions #8

Updated by J. Eric Ivancich over 2 years ago

  • Status changed from Fix Under Review to Resolved

鹏 张, Would you determine if this needs a backport to pacific or octopus? I currently have this in the RESOLVED state, but perhaps it needs to have the backports. Thanks.

Actions #9

Updated by J. Eric Ivancich over 2 years ago

  • Status changed from Resolved to Pending Backport
  • Backport set to pacific, octopus
Actions #10

Updated by Backport Bot over 2 years ago

  • Copied to Backport #53036: pacific: rgw: cls_bucket_list_unordered() might return repeating or partial results when enable index shard added
Actions #11

Updated by Backport Bot over 2 years ago

  • Copied to Backport #53037: octopus: rgw: cls_bucket_list_unordered() might return repeating or partial results when enable index shard added
Actions #12

Updated by Backport Bot over 1 year ago

  • Tags changed from unordered-listing to unordered-listing backport_processed
Actions #13

Updated by Konstantin Shalygin over 1 year ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 0 to 100
  • Tags changed from unordered-listing backport_processed to unordered-listing
Actions

Also available in: Atom PDF