Bug #48709

open

[RGW] [boto] PUT on versioned bucket fails with NoSuchKey

Added by Mark Kogan over 3 years ago. Updated over 1 year ago.

Status: Pending Backport
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source:
Tags: backport_processed
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Description of problem:
A 'NoSuchKey' error is observed on a PUT operation to a versioned bucket while running a boto script.

In a loop, the boto script creates a versioned bucket, writes 5 versions of each object, lists the objects, and then deletes the objects and the bucket.

Version-Release number of selected component (if applicable):
ceph version 14.2.11-89.el8cp

How reproducible:
3/3

Steps to Reproduce:
1. Ceph cluster on 4.2 with 3 RGW nodes.
2. The script gc_testing_ver_bkt.py is run from RGW node [extensa018].
3. The script gc_testing_ver_bkt.py does the following:
    a. creates a bucket [kvm_gc_ver_bkt_rhcs_1 (num_of_iteration)]
    b. enables versioning on the bucket
    c. creates 20 objects [object size: 5M]
    d. creates 5 versions for each object
    e. lists the objects
    f. deletes the versioned objects
    g. deletes the bucket
    h. creates another bucket [kvm_gc_ver_bkt_rhcs_2] and repeats steps a-g
    Note: the script repeats the above 100000 times.
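
For readers without access to gc_testing_ver_bkt.py, steps a-g can be sketched roughly as below. This is not the original script: the function names, object names, environment variable names, and the connection settings in main() are hypothetical, and the sketch assumes that "5 versions" means 5 PUTs to the same key. Only the boto 2 calls (create_bucket, configure_versioning, new_key, set_contents_from_filename, list_versions, delete_key, delete_bucket) are taken from the boto library used in the traceback.

```python
import os
import tempfile


def run_iteration(conn, bucket_name, src_path, num_objects=20, num_versions=5):
    """Run steps a-g once; return the number of versions that were listed."""
    bucket = conn.create_bucket(bucket_name)              # step a
    bucket.configure_versioning(True)                     # step b
    for i in range(num_objects):                          # steps c and d
        key = bucket.new_key("obj%d" % i)                 # hypothetical object name
        for _ in range(num_versions):
            # The PUT that returned 404 NoSuchKey in this report.
            key.set_contents_from_filename(src_path)
    listed = [(v.name, v.version_id) for v in bucket.list_versions()]  # step e
    for name, version_id in listed:                       # step f
        bucket.delete_key(name, version_id=version_id)
    conn.delete_bucket(bucket_name)                       # step g
    return len(listed)


def main(iterations=100000):
    """Hypothetical driver; endpoint and credentials come from the environment."""
    import boto
    import boto.s3.connection

    conn = boto.connect_s3(
        aws_access_key_id=os.environ["S3_ACCESS_KEY"],
        aws_secret_access_key=os.environ["S3_SECRET_KEY"],
        host=os.environ.get("RGW_HOST", "localhost"),
        port=int(os.environ.get("RGW_PORT", "8080")),
        is_secure=False,
        calling_format=boto.s3.connection.OrdinaryCallingFormat(),
    )
    # 5 MiB payload, matching the object size in step c.
    with tempfile.NamedTemporaryFile(delete=False) as f:
        f.write(os.urandom(5 * 1024 * 1024))
    for n in range(1, iterations + 1):                    # step h
        run_iteration(conn, "kvm_gc_ver_bkt_rhcs_%d" % n, f.name)
```

Calling main() starts the loop. run_iteration takes the connection as a parameter so the same steps can be pointed at any RGW endpoint, or at a stub when checking the logic offline.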

4. The script failed with NoSuchKey on the 23rd iteration, for bucket kvm_gc_ver_bkt_rhcs_23.
Snippet:
Traceback (most recent call last):
  File "gc_testing_ver_bkt.py", line 29, in <module>
    key.set_contents_from_filename('classV')
  File "/usr/local/lib/python3.6/site-packages/boto/s3/key.py", line 1378, in set_contents_from_filename
    encrypt_key=encrypt_key)
  File "/usr/local/lib/python3.6/site-packages/boto/s3/key.py", line 1309, in set_contents_from_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python3.6/site-packages/boto/s3/key.py", line 762, in send_file
    chunked_transfer=chunked_transfer, size=size)
  File "/usr/local/lib/python3.6/site-packages/boto/s3/key.py", line 963, in _send_file_internal
    query_args=query_args
  File "/usr/local/lib/python3.6/site-packages/boto/s3/connection.py", line 671, in make_request
    retry_handler=retry_handler
  File "/usr/local/lib/python3.6/site-packages/boto/connection.py", line 1071, in make_request
    retry_handler=retry_handler)
  File "/usr/local/lib/python3.6/site-packages/boto/connection.py", line 940, in _mexe
    request.body, request.headers)
  File "/usr/local/lib/python3.6/site-packages/boto/s3/key.py", line 896, in sender
    response.status, response.reason, body)
boto.exception.S3ResponseError: S3ResponseError: 404 Not Found
<?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><BucketName>kvm_gc_ver_bkt_rhcs_23</BucketName><RequestId>tx00000000000000000132a-005fcf69f3-b20f-default</RequestId><HostId>b20f-default-default</HostId></Error>
[root@extensa018 kvm]# 

5. The following was observed in the logs [RGW node magna055]:

2020-12-08 11:56:41.391 7f2a99e6d700  5 bs.init() returned ret=-2
2020-12-08 11:56:41.391 7f2a99e6d700 20 update_olh() target_obj=kvm_gc_ver_bkt_rhcs_23:_:LVPsHPu6-2rx2.De6.qU5.3xFf47Kh4_dairy10 returned -2
2020-12-08 11:56:41.391 7f2a99e6d700 20 get_system_obj_state: rctx=0x7f2b2fef3958 obj=default.rgw.log:pubsub.user.kvm-gc.bucket.kvm_gc_ver_bkt_rhcs_23/73425f4f-9160-4820-8908-1119bafce85e.45589.24 state=0x55f115ba59a0 s->prefetch_data=0
2020-12-08 11:56:41.391 7f2a99e6d700 10 cache get: name=default.rgw.log++pubsub.user.kvm-gc.bucket.kvm_gc_ver_bkt_rhcs_23/73425f4f-9160-4820-8908-1119bafce85e.45589.24 : hit (negative entry)
2020-12-08 11:56:41.457 7f2a99e6d700  2 req 4906 5.489s s3:put_obj completing
2020-12-08 11:56:41.457 7f2a99e6d700  2 req 4906 5.489s s3:put_obj op status=-2
2020-12-08 11:56:41.457 7f2a99e6d700  2 req 4906 5.489s s3:put_obj http status=404
2020-12-08 11:56:41.457 7f2a99e6d700  1 ====== req done req=0x7f2b2fef7680 op status=-2 http_status=404 latency=5.48898s ======
2020-12-08 11:56:41.457 7f2a99e6d700  1 beast: 0x7f2b2fef7680: 10.8.130.218 - - [2020-12-08 11:56:41.0.457772s] "PUT /kvm_gc_ver_bkt_rhcs_23/dairy10 HTTP/1.1" 404 10485989 - "Boto/2.49.0 Python/3.6.8 Linux/4.18.0-240.1.1.el8_3.x86_64" -
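
The "op status=-2" in these lines is a negated errno value; on Linux, errno 2 is ENOENT, which is what RGW surfaces to the client as 404 NoSuchKey here. A small sketch for pulling failed ops out of an RGW log follows. The regular expression is written against the line format shown above only, and the function name is hypothetical.

```python
import errno
import re

# Matches lines such as: "... req 4906 5.489s s3:put_obj op status=-2"
OP_STATUS_RE = re.compile(r"req (\d+) \S+ s3:(\w+) op status=(-?\d+)")


def failed_ops(lines):
    """Yield (req_number, op_name, status, errno_name) for ops with a negative status."""
    for line in lines:
        match = OP_STATUS_RE.search(line)
        if not match:
            continue
        status = int(match.group(3))
        if status < 0:
            name = errno.errorcode.get(-status, "unknown")
            yield int(match.group(1)), match.group(2), status, name
```

Fed the log lines above, this picks out request 4906 (s3:put_obj) with status -2 / ENOENT.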

6. Ceph configuration parameters on the setup:
rgw_lc_debug_interval = 600
rgw_gc_obj_min_wait = 10
rgw_lc_max_worker = 10
rgw_max_objs_per_shard = 5

Actual results:
The boto script fails with:
boto.exception.S3ResponseError: S3ResponseError: 404 Not Found
<?xml version="1.0" encoding="UTF-8"?><Error><Code>NoSuchKey</Code><BucketName>kvm_gc_ver_bkt_rhcs_23</BucketName><RequestId>tx00000000000000000132a-005fcf69f3-b20f-default</RequestId><HostId>b20f-default-default</HostId></Error>
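
To correlate a client-side failure like this with the RGW log, the error body can be parsed and the RequestId decoded. The sketch below assumes, based only on the values in this report, that the first field of the RequestId is the request number in hex and the second is a Unix timestamp in hex; the function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def parse_s3_error(body):
    """Return the child elements of an S3 <Error> document as a dict."""
    root = ET.fromstring(body)
    return {child.tag: child.text for child in root}


def decode_request_id(request_id):
    """Split 'tx<hex req>-<hex time>-...' into (request number, Unix timestamp).

    The field meanings are an assumption inferred from this report.
    """
    tx_field, time_field, _rest = request_id.split("-", 2)
    return int(tx_field[2:], 16), int(time_field, 16)
```

For the RequestId above, the first field decodes to 4906, which matches "req 4906" in the RGW log lines from step 5.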

Expected results:
The script should not fail: the PUT on the versioned bucket should succeed.
Actions #1

Updated by Mark Kogan over 3 years ago

  • Pull request ID set to 38705
Actions #2

Updated by Mark Kogan over 3 years ago

Additional data collection will follow as discussed...

Actions #3

Updated by Mark Kogan about 2 years ago

With the following upstream fixes, this BZ no longer reproduces on master:

1. https://github.com/ceph/ceph/pull/45345 -- cls/rgw: rgw_dir_suggest_changes detects race with completion
2. https://github.com/ceph/ceph/pull/45300 -- rgw: Update "CEPH_RGW_DIR_SUGGEST_LOG_OP" for remove entries

Actions #4

Updated by Mark Kogan about 2 years ago

  • Status changed from In Progress to Pending Backport
Actions #5

Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed