Bug #22790

closed

Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled

Added by Vikhyat Umrao about 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: Support
Tags:
Backport: luminous mimic
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled
RHT Bug - https://bugzilla.redhat.com/show_bug.cgi?id=1537737


Files

resend_part.py (905 Bytes): Python script to resend a part during a multipart upload. Casey Bodley, 01/24/2018 09:36 PM

Related issues 2 (0 open, 2 closed)

Copied to rgw - Backport #25078: mimic: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled (Resolved, Matt Benjamin)
Copied to rgw - Backport #25079: luminous: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled (Resolved, Matt Benjamin)
Actions #1

Updated by Casey Bodley about 6 years ago

  • Status changed from New to 12
  • Backport set to luminous

The relevant block of code in RGWPutObj::execute():

    /* do we need this operation to be synchronous? if we're dealing with an object with immutable
     * head, e.g., multipart object we need to make sure we're the first one writing to this object
     */
    bool need_to_wait = (ofs == 0) && multipart;

    bufferlist orig_data;

    if (need_to_wait) {
      orig_data = data;
    }

    op_ret = put_data_and_throttle(filter, data, ofs, need_to_wait);
    if (op_ret < 0) {
      if (!need_to_wait || op_ret != -EEXIST) {
        ldout(s->cct, 20) << "processor->thottle_data() returned ret=" 
                          << op_ret << dendl;
        goto done;
      }
      /* need_to_wait == true and op_ret == -EEXIST */
      ldout(s->cct, 5) << "NOTICE: processor->throttle_data() returned -EEXIST, need to restart write" << dendl;

For uncompressed objects, the first rados write always starts at ofs=0. When compression is enabled, we fit more than one chunk into the first rados write, so that write is seen with ofs > 0. This leads to need_to_wait=false, and the EEXIST error is then treated as fatal instead of restarting the write to a different object.
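
The effect of the need_to_wait condition can be modeled in a few lines. This is a toy Python sketch of the branch quoted above, not RGW code: classify_put_error and its return labels are hypothetical names, and the 4 MiB offset is only an example of a nonzero ofs. The op status of -17 from the bug title corresponds to -EEXIST.

```python
import errno


def classify_put_error(ofs, multipart, op_ret):
    """Toy model of the quoted branch (hypothetical helper, not RGW code).

    Returns "ok", "restart", or "fatal" for the result of the first write.
    """
    # Same condition as in the quoted C++: only a multipart write that
    # starts at offset 0 is treated as "must be the first writer".
    need_to_wait = (ofs == 0) and multipart

    if op_ret >= 0:
        return "ok"
    if not need_to_wait or op_ret != -errno.EEXIST:
        # Any error on this path aborts the request.
        return "fatal"
    # need_to_wait and -EEXIST: the write is restarted on a different object.
    return "restart"


# Uncompressed multipart part: first write is at ofs=0, EEXIST is recoverable.
print(classify_put_error(0, True, -errno.EEXIST))
# Compressed multipart part: first write seen with ofs > 0 (example value),
# so the same EEXIST falls into the fatal branch.
print(classify_put_error(4 * 1024 * 1024, True, -errno.EEXIST))
```

Under this model the only difference between the two calls is the offset of the first write, which matches the intermittent failure described when compression is enabled.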

Actions #2

Updated by Casey Bodley about 6 years ago

Attached a Python script to reproduce this failure:

    ~/ceph/build $ ./resend_part.py cosbench-0.4.2.c4.tar
    Traceback (most recent call last):
      File "./resend_part.py", line 34, in <module>
        p = part.upload(Body=body)
      File "/usr/lib/python2.7/site-packages/boto3/resources/factory.py", line 520, in do_action
        response = action(self, *args, **kwargs)
      File "/usr/lib/python2.7/site-packages/boto3/resources/action.py", line 83, in __call__
        response = getattr(parent.meta.client, operation_name)(**params)
      File "/usr/lib/python2.7/site-packages/botocore/client.py", line 317, in _api_call
        return self._make_api_call(operation_name, kwargs)
      File "/usr/lib/python2.7/site-packages/botocore/client.py", line 615, in _make_api_call
        raise error_class(parsed_response, operation_name)
    botocore.errorfactory.BucketAlreadyExists: An error occurred (BucketAlreadyExists) when calling the UploadPart operation: Unknown
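
For readers without access to the attachment, the following is a rough sketch of what resending a part with boto3 might look like. It is not the attached resend_part.py: the function name resend_first_part and all of its parameters are hypothetical, and it assumes the bucket already exists on an rgw endpoint with compression enabled. Only the part.upload(Body=body) call is taken from the traceback above.

```python
def resend_first_part(endpoint_url, access_key, secret_key, bucket_name, key, body):
    """Upload part 1 of a multipart upload twice (hypothetical sketch).

    Returns the responses of both uploads; raises if either upload fails.
    """
    import boto3  # imported here so the module loads even without boto3

    s3 = boto3.resource(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )
    obj = s3.Object(bucket_name, key)
    upload = obj.initiate_multipart_upload()
    part = upload.Part(1)
    try:
        first = part.upload(Body=body)
        # Resend the same part number; this is the call that failed in the
        # traceback above when compression was enabled.
        second = part.upload(Body=body)
    finally:
        # Clean up the incomplete upload whether or not the resend succeeded.
        upload.abort()
    return first, second
```

On an affected gateway the second upload would be expected to raise a botocore client error like the one in the traceback; on a fixed gateway both calls should return normally.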
Actions #3

Updated by Matt Benjamin almost 6 years ago

  • Status changed from 12 to Fix Under Review
  • Assignee set to Matt Benjamin
  • Backport changed from luminous to luminous mimic
Actions #4

Updated by Matt Benjamin over 5 years ago

  • Copied to Backport #25078: mimic: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled added
Actions #5

Updated by Matt Benjamin over 5 years ago

  • Copied to Backport #25079: luminous: Intermittent http_status=409 with op status=-17 on ceph rgw with compression enabled added
Actions #6

Updated by Nathan Cutler over 5 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #7

Updated by Nathan Cutler over 5 years ago

  • Status changed from Pending Backport to Resolved