Bug #62411

open

RGWLC::RGWDeleteLC() failed to set attrs on bucket=test-client.0-jja3ud4g2xqpxdd-130 returned err=-125

Added by Casey Bodley 9 months ago. Updated 9 months ago.

Status:
Triaged
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Tags:
lifecycle cache
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

from http://qa-proxy.ceph.com/teuthology/prsrivas-2023-08-10_03:42:21-rgw-wip-rgw-61916-test-failures-distro-default-smithi/7364436/teuthology.log:

2023-08-10T04:43:10.746 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_bucket_delete_nonempty PASSED  [ 36%]
2023-08-10T04:43:12.033 INFO:tasks.rgw.client.0.smithi134.stdout:2023-08-10T04:43:12.029+0000 4fc27640 -1 req 11380411328366689377 0.097997732s s3:delete_bucket ERROR: could not remove bucket test-client.0-jja3ud4g2xqpxdd-130
2023-08-10T04:43:12.119 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_bucket_concurrent_set_canned_acl PASSED [ 36%]
2023-08-10T04:43:12.274 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_object_write_to_nonexist_bucket PASSED [ 36%]
2023-08-10T04:43:12.275 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_object_write_to_nonexist_bucket ERROR [ 36%]
2023-08-10T04:43:12.755 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_object_write_with_chunked_transfer_encoding PASSED [ 36%]
2023-08-10T04:43:12.755 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_object_write_with_chunked_transfer_encoding ERROR [ 36%]
2023-08-10T04:43:13.191 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_bucket_create_delete PASSED    [ 36%]
2023-08-10T04:43:13.191 INFO:teuthology.orchestra.run.smithi134.stdout:s3tests_boto3/functional/test_s3.py::test_bucket_create_delete ERROR     [ 36%]

After that "could not remove bucket test-client.0-jja3ud4g2xqpxdd-130" error, every subsequent test fails during cleanup with NoSuchBucket when calling nuke_bucket(test-client.0-jja3ud4g2xqpxdd-130).

The rgw log adds some context: http://qa-proxy.ceph.com/teuthology/prsrivas-2023-08-10_03:42:21-rgw-wip-rgw-61916-test-failures-distro-default-smithi/7364436/remote/smithi134/log/rgw.ceph.client.0.log.gz

2023-08-10T04:43:11.993+0000 b4d45640  0 lifecycle: RGWLC::RGWDeleteLC() failed to set attrs on bucket=test-client.0-jja3ud4g2xqpxdd-130 returned err=-125
2023-08-10T04:43:12.018+0000 df59a640 10 req 11380411328366689377 0.086997993s s3:delete_bucket cache get: name=default.rgw.meta+root+test-client.0-jja3ud4g2xqpxdd-130 : hit (requested=0x11, cached=0x17)
2023-08-10T04:43:12.018+0000 df59a640 10 req 11380411328366689377 0.086997993s s3:delete_bucket removing default.rgw.meta+root+test-client.0-jja3ud4g2xqpxdd-130 from cache
2023-08-10T04:43:12.018+0000 df59a640 10 req 11380411328366689377 0.086997993s s3:delete_bucket distributing notification oid=default.rgw.control:notify.7 cni=[op: 1, obj: default.rgw.meta:root:test-client.0-jja3ud4g2xqpxdd-130, ofs0, ns]
2023-08-10T04:43:12.020+0000 16281640 10 rgw watcher librados: RGWWatcher::handle_notify()  notify_id 1073741824311 cookie 693002112 notifier 4614 bl.length()=177
2023-08-10T04:43:12.024+0000 12862c640 10 req 11380411328366689377 0.092997849s s3:delete_bucket distributing notification oid=default.rgw.control:notify.7 cni=[op: 1, obj: default.rgw.meta:root:.bucket.meta.test-client.0-jja3ud4g2xqpxdd-130:c4205a4e-e47c-412a-a59c-96762e734842.4620.272, ofs0, ns]
2023-08-10T04:43:12.025+0000 16a82640 10 rgw watcher librados: RGWWatcher::handle_notify()  notify_id 1073741824312 cookie 693002112 notifier 4614 bl.length()=236
2023-08-10T04:43:12.029+0000 4fc27640 -1 req 11380411328366689377 0.097997732s s3:delete_bucket ERROR: could not remove bucket test-client.0-jja3ud4g2xqpxdd-130

When we hit that RGWLC::RGWDeleteLC() error, I think we've already deleted the bucket entrypoint but never remove the bucket from the user's user.buckets list. That causes the s3tests cleanup code to see the bucket in ListBuckets but be unable to call ListObjectVersions on it to clean it up.
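To make the suspected end state concrete, here is a toy Python model of it. This is only a sketch of the hypothesis above, not RGW code: ToyStore, its methods, and the fail_before_unlink flag are all hypothetical names, and the exact point at which the real delete path bails out is an assumption.

```python
# Toy model only: every name here is hypothetical, none of this is an RGW API.
class NoSuchBucket(Exception):
    """Stand-in for the S3 NoSuchBucket error seen by the test cleanup."""


class ToyStore:
    def __init__(self):
        self.entrypoints = {}       # bucket name -> bucket metadata
        self.user_buckets = set()   # stand-in for the user's user.buckets list

    def create_bucket(self, name):
        self.entrypoints[name] = {"attrs": {}}
        self.user_buckets.add(name)

    def delete_bucket(self, name, fail_before_unlink=False):
        # Remove the bucket entrypoint first.
        del self.entrypoints[name]
        if fail_before_unlink:
            # Assumed failure mode: an error is returned before the bucket
            # is unlinked from the user, so the listing entry is left behind.
            return -125
        self.user_buckets.discard(name)
        return 0

    def list_buckets(self):
        return sorted(self.user_buckets)

    def list_object_versions(self, name):
        if name not in self.entrypoints:
            raise NoSuchBucket(name)
        return []


store = ToyStore()
store.create_bucket("test-bucket")
ret = store.delete_bucket("test-bucket", fail_before_unlink=True)
still_listed = store.list_buckets()
```

In this model the failed delete leaves "test-bucket" in list_buckets() while list_object_versions("test-bucket") raises NoSuchBucket, which is the same shape as the cleanup failures in the teuthology log.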

The -125 (ECANCELED) error would result from a race, which seems to be exercised by test_bucket_concurrent_set_canned_acl. During bucket deletion, we shouldn't need to update the bucket metadata; we should only need to remove the bucket's entry from the lc list.
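A minimal sketch of how such a race could produce -125, assuming the attrs write is guarded by a version check that rejects stale writers. VersionedAttrs and demo_race are hypothetical names for illustration, and 125 is the Linux errno value for ECANCELED, matching err=-125 in the log.

```python
ECANCELED = 125  # Linux errno value; assumed to match err=-125 in the log


class VersionedAttrs:
    """Hypothetical version-checked attribute store, not an RGW class."""

    def __init__(self):
        self.version = 0
        self.attrs = {}

    def read(self):
        # Return the current version together with a copy of the attrs.
        return self.version, dict(self.attrs)

    def write_if_version(self, expected_version, new_attrs):
        # Reject the write if someone else wrote since our read.
        if expected_version != self.version:
            return -ECANCELED
        self.attrs = dict(new_attrs)
        self.version += 1
        return 0


def demo_race():
    bucket = VersionedAttrs()
    bucket.write_if_version(0, {"acl": "private", "lc": "rule-1"})

    # The deleter reads the attrs, intending to strip the lc attr.
    del_ver, del_attrs = bucket.read()

    # A concurrent ACL update reads and writes in between.
    acl_ver, acl_attrs = bucket.read()
    acl_attrs["acl"] = "public-read"
    acl_ret = bucket.write_if_version(acl_ver, acl_attrs)

    # The deleter now writes with a stale version and is cancelled.
    del_attrs.pop("lc")
    del_ret = bucket.write_if_version(del_ver, del_attrs)
    return acl_ret, del_ret
```

Under these assumptions the ACL write succeeds and the deleter's attrs write is the one that gets cancelled. If deletion only removed the bucket's lc list entry and skipped the attrs write, it would not take part in this race at all.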
