Bug #22790

Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled

Added by Vikhyat Umrao 12 months ago. Updated 5 months ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
Start date: 01/24/2018
Due date:
% Done: 0%
Source: Support
Tags:
Backport: luminous mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled
RHT Bug - https://bugzilla.redhat.com/show_bug.cgi?id=1537737

resend_part.py - Python script to resend a part during a multipart upload (905 Bytes) Casey Bodley, 01/24/2018 09:36 PM


Related issues

Copied to rgw - Backport #25078: mimic: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled Resolved
Copied to rgw - Backport #25079: luminous: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled Resolved

History

#1 Updated by Casey Bodley 12 months ago

  • Status changed from New to Verified
  • Backport set to luminous

The relevant block of code in RGWPutObj::execute():

    /* do we need this operation to be synchronous? if we're dealing with an object with immutable
     * head, e.g., multipart object we need to make sure we're the first one writing to this object
     */
    bool need_to_wait = (ofs == 0) && multipart;

    bufferlist orig_data;

    if (need_to_wait) {
      orig_data = data;
    }

    op_ret = put_data_and_throttle(filter, data, ofs, need_to_wait);
    if (op_ret < 0) {
      if (!need_to_wait || op_ret != -EEXIST) {
        ldout(s->cct, 20) << "processor->thottle_data() returned ret=" 
                          << op_ret << dendl;
        goto done;
      }
      /* need_to_wait == true and op_ret == -EEXIST */
      ldout(s->cct, 5) << "NOTICE: processor->throttle_data() returned -EEXIST, need to restart write" << dendl;

For uncompressed objects, the first rados write always starts at ofs=0. When compression is enabled, more than one chunk fits into the first rados write, so that write sees ofs > 0. This makes need_to_wait false, so the EEXIST error is treated as fatal instead of restarting the write to a different object.
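The branch quoted above can be modelled in a few lines of Python to make the failure mode concrete. This is only a sketch of the decision logic, not RGW code: `handle_first_write` and its return labels are hypothetical names, and the non-zero offset is an arbitrary illustrative value.

```python
import errno


def handle_first_write(ofs, multipart, op_ret):
    """Hypothetical model of the quoted branch in RGWPutObj::execute().

    Returns "ok", "restart", or "fatal" depending on how the write
    result would be handled.
    """
    # Same condition as the C++ snippet: only a multipart write that
    # starts at offset 0 is treated as "must be the first writer".
    need_to_wait = (ofs == 0) and multipart

    if op_ret >= 0:
        return "ok"
    if not need_to_wait or op_ret != -errno.EEXIST:
        return "fatal"      # corresponds to "goto done"
    return "restart"        # -EEXIST with need_to_wait: retry on a different object


# Uncompressed: the first rados write starts at ofs=0, so a re-sent part restarts.
uncompressed = handle_first_write(ofs=0, multipart=True, op_ret=-errno.EEXIST)

# Compressed: several chunks were buffered before the first rados write,
# so ofs > 0 (4 MiB here is an illustrative value, not taken from the ticket).
compressed = handle_first_write(ofs=4 * 1024 * 1024, multipart=True, op_ret=-errno.EEXIST)

print(uncompressed, compressed)
```

With the same -EEXIST result, only the offset differs between the two calls, which is why the failure shows up only with compression enabled.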

#2 Updated by Casey Bodley 12 months ago

Attached a Python script to reproduce this failure:

~/ceph/build $ ./resend_part.py cosbench-0.4.2.c4.tar                                                                                                            
Traceback (most recent call last):          
  File "./resend_part.py", line 34, in <module>                                         
    p = part.upload(Body=body)              
  File "/usr/lib/python2.7/site-packages/boto3/resources/factory.py", line 520, in do_action                                                                                     
    response = action(self, *args, **kwargs)                                            
  File "/usr/lib/python2.7/site-packages/boto3/resources/action.py", line 83, in __call__                                                                                        
    response = getattr(parent.meta.client, operation_name)(**params)                    
  File "/usr/lib/python2.7/site-packages/botocore/client.py", line 317, in _api_call    
    return self._make_api_call(operation_name, kwargs)                                  
  File "/usr/lib/python2.7/site-packages/botocore/client.py", line 615, in _make_api_call                                                                                        
    raise error_class(parsed_response, operation_name)                                  
botocore.errorfactory.BucketAlreadyExists: An error occurred (BucketAlreadyExists) when calling the UploadPart operation: Unknown
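The BucketAlreadyExists error in the traceback appears to be the client-side view of the failure named in the ticket title. As a quick check, the two numbers from the title can be decoded with the Python standard library. The variable names below are illustrative, and the snippet assumes a platform where errno 17 is EEXIST, as on Linux.

```python
import errno
from http import HTTPStatus

op_status = -17      # "op status" from the ticket title
http_status = 409    # "http_status" from the ticket title

# Translate the negative errno into its symbolic name (EEXIST on Linux).
errno_name = errno.errorcode[-op_status]

# Look up the standard reason phrase for the HTTP status code.
http_phrase = HTTPStatus(http_status).phrase

print(errno_name, http_phrase)
```

This matches the analysis in comment #1: the write fails with -EEXIST, and the gateway surfaces it to the client as an HTTP 409 conflict.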

#3 Updated by Matt Benjamin 6 months ago

  • Status changed from Verified to Need Review
  • Assignee set to Matt Benjamin
  • Backport changed from luminous to luminous mimic

#4 Updated by Matt Benjamin 6 months ago

  • Copied to Backport #25078: mimic: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled added

#5 Updated by Matt Benjamin 6 months ago

  • Copied to Backport #25079: luminous: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled added

#6 Updated by Nathan Cutler 6 months ago

  • Status changed from Need Review to Pending Backport

#7 Updated by Nathan Cutler 5 months ago

  • Status changed from Pending Backport to Resolved
