Bug #15682

closed

inefficient bucket listing with max-keys URL parameter

Added by Abhishek Varshney almost 8 years ago. Updated almost 8 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Community (dev)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

I came across an issue in rgw where bucket listing is quite slow when
a low value is given for the URL parameter <max-keys>. For example,
the s3a Hadoop connector sets <max-keys> to 1 for a certain
operation [1]. The client expects a result set with a single entry.
Radosgw, in turn, passes this value down to rados and fetches keys one
at a time in the RGWRados::Bucket::List::list_objects function before
checking for the delimiter etc. This is highly inefficient, and the
client runs into timeouts.

It would be better if a config option were provided to avoid such
issues, for example a minimum readahead value for listing objects
from rados.

I have raised a PR with the above-mentioned fix. Please review: https://github.com/ceph/ceph/pull/8756

[1] https://github.com/apache/hadoop/blob/trunk/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3AFileSystem.java#L1012

Actions #1

Updated by Sage Weil almost 8 years ago

  • Project changed from Ceph to rgw
  • Category deleted (22)

Actions #2

Updated by Orit Wasserman almost 8 years ago

  • Status changed from New to Resolved