Bug #64203

open

RGW S3: list bucket results in a 500 Error when object-lock is enabled

Added by Ying Wang 3 months ago. Updated 3 months ago.

Status: New
Priority: Normal
Target version:
% Done: 0%
Source:
Tags: listing
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

We are running Ceph 16.2.14.
One of our users hit an issue with RGW S3:
after a series of operations, the 'list bucket' operation returned a 500 error code.

The specific operations are as follows:

1. Creating a bucket with object-lock enabled
PUT /new-bucket-771bc026 HTTP/1.1
Host: 127.0.0.1:20005
x-amz-bucket-object-lock-enabled: true
Date: Wed, 20 Dec 2023 16:24:11 +0000
Authorization: AWS admin:

2. Performing at least 17 'put object' operations on this new bucket, each with the same object ID, which creates multiple versions of the same object.
PUT /new-bucket-771bc026/h HTTP/1.1
Host: 127.0.0.1:20005
Content-Length: 0
Date: Wed, 20 Dec 2023 16:24:11 +0000
Authorization: AWS admin:

After these operations, the content of the bucket is as shown in the attached file.

3. Executing the 'list bucket' operation with allow-unordered=true and max-keys=1 results in a 500 error:
GET /new-bucket-771bc026?allow-unordered=true&max-keys=1 HTTP/1.1
Host: 127.0.0.1:20005
Date: Wed, 20 Dec 2023 16:24:11 +0000
Authorization: AWS admin:

Return results:
@HTTP/1.1 500 Internal Server Error
Content-Length: 252
x-amz-request-id: tx00000314b4680c4d71a26-0065ae6027-13723a-default
Accept-Ranges: bytes
Content-Type: application/xml
Date: Mon, 22 Jan 2024 12:31:35 GMT

<Error><Code>UnknownError</Code><Message></Message><BucketName>new-bucket-771bc026</BucketName><RequestId>tx00000314b4680c4d71a26-0065ae6027-13723a-default</RequestId><HostId>13723a-default-default</HostId></Error>

2024-01-22T20:31:35.288+0800 7fbe2fec3700 1 ====== req done req=0x7fbe2febafe0 op status=-27 http_status=500 latency=0.024000945s ======@
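
For anyone scripting the reproduction, the three steps above can be sketched as a plain request sequence. This is only a sketch: the function name `build_repro_requests` and its parameters are hypothetical, the requests are unsigned (the Authorization value is truncated in this report, so it is left out), and nothing is actually sent to the gateway.

```python
from urllib.parse import urlencode


def build_repro_requests(bucket="new-bucket-771bc026", key="h", puts=17, max_keys=1):
    """Return (method, target, headers) tuples for steps 1-3 (hypothetical helper).

    The requests are unsigned; signing and sending are left to the caller.
    """
    host = {"Host": "127.0.0.1:20005"}
    requests = [
        # Step 1: create the bucket with object lock enabled.
        ("PUT", f"/{bucket}", {**host, "x-amz-bucket-object-lock-enabled": "true"}),
    ]
    # Step 2: write the same object ID repeatedly; each PUT adds a version.
    for _ in range(puts):
        requests.append(("PUT", f"/{bucket}/{key}", {**host, "Content-Length": "0"}))
    # Step 3: unordered listing with max-keys=1.
    query = urlencode({"allow-unordered": "true", "max-keys": max_keys})
    requests.append(("GET", f"/{bucket}?{query}", dict(host)))
    return requests


if __name__ == "__main__":
    for method, target, headers in build_repro_requests():
        print(method, target, headers)
```

Each tuple can be passed to whatever signing HTTP client is in use; lowering `puts` to 16 gives a comparison run below the reported minimum of 17.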

We reviewed the relevant Ceph code and found that, when 'allow-unordered' is set, RGW uses a prefetch strategy (list_objects_unordered) that reads twice as many entries from the bucket as max-keys. In the example above, we put the same object multiple times, leaving one visible object and 16 invisible objects in the bucket. The OSD makes up to 8 read attempts (src/cls/rgw/cls_rgw.cc::rgw_bucket_list(), max_attempts = 8), reading 2 entries per attempt (max-keys * 2). After exhausting all attempts, the OSD has found only one visible object, short of the expected prefetch of two, which results in RGWBIAdvanceAndRetryError. RGW ultimately returns an error with status code 500.
In fact, the request should return the one visible object as expected, since the max-keys specified in the request is 1.
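
To make the arithmetic concrete, here is a simplified model of the behaviour as we understand it. It is not the actual cls_rgw code: the function name `simulate_unordered_list`, the single visible entry, its position in the index, and the loop structure are all our assumptions. It only illustrates why 17 entries would be the threshold if each of the 8 attempts examines 2 entries.

```python
def simulate_unordered_list(total_entries, visible_index=0, max_keys=1, max_attempts=8):
    """Simplified model of the described listing loop (assumption, not Ceph code).

    Exactly one index entry (at visible_index) is treated as visible.
    Returns ("ok", n_returned) or ("retry_error", n_found).
    """
    wanted = 2 * max_keys    # prefetch target described above
    per_read = 2 * max_keys  # entries examined per attempt, as described above
    found = 0
    position = 0
    for _ in range(max_attempts):
        chunk_end = min(position + per_read, total_entries)
        for i in range(position, chunk_end):
            if i == visible_index:
                found += 1
        position = chunk_end
        # Stop once the prefetch is satisfied or the index is exhausted.
        if found >= wanted or position >= total_entries:
            return ("ok", min(found, max_keys))
    # Attempts exhausted with entries still unread and the prefetch unmet.
    return ("retry_error", found)


if __name__ == "__main__":
    print(simulate_unordered_list(16))  # index exhausted within 8 attempts
    print(simulate_unordered_list(17))  # one entry remains unread after 8 attempts
```

In this model, 16 entries are fully scanned within 8 attempts and the single visible object is returned, while 17 entries leave one entry unread with only one of the two wanted objects found, which matches the minimum of 17 puts in step 2.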

We are not certain whether the issue we encountered is the same as issue 62256; however, here we have removed some other interfering content, making the problem consistently reproducible.


Files

微信图片_20240129113418-imageonline.co-merged.png (675 KB): The figure shows the content of the bucket after the "Put object" operation (step 2). Ying Wang, 01/29/2024 03:39 AM

Related issues 1 (0 open, 1 closed)

Related to rgw - Bug #62256: rgw: Bucket listing hangs against object with 11111 versions (Status: Can't reproduce; Assignee: J. Eric Ivancich)

Actions #1

Updated by Casey Bodley 3 months ago

  • Assignee set to J. Eric Ivancich
  • Tags set to listing
Actions #2

Updated by Casey Bodley 3 months ago

  • Related to Bug #62256: rgw: Bucket listing hangs against object with 11111 versions added