Bug #50756

open

Haproxy with keepalive crashes RGW when object name length > 1024 bytes

Added by Aleksandr Rudenko almost 3 years ago. Updated almost 3 years ago.

Status:
Need More Info
Priority:
Normal
Assignee:
Or Friedmann
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

The AWS S3 specification requires object names to be at most 1024 bytes long, and RGW follows this spec.

But we have a problem when RGW is behind haproxy (http mode with keepalive). In this case a request with an object name longer than 1024 bytes makes RGW unresponsive for ~30 seconds. During these 30 seconds RGW can't respond to other requests.

Example:

>>> import boto3
>>> import botocore.config
>>> config = botocore.config.Config(s3={"addressing_style": "path"})
>>> session = boto3.Session(aws_access_key_id="access", aws_secret_access_key="secret")
>>> client = session.client("s3", use_ssl=True, verify=False, endpoint_url="https://s3.domain.com", config=config)
>>> client.put_object(Bucket="test1", Key="a" * 2000, Body=b"test")

The response is the expected error:

botocore.exceptions.ClientError: An error occurred (InvalidObjectName) when calling the PutObject operation: Unknown

But the next request fails:

>>> client.list_buckets()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/Users/ruden/envs/boto3/.venv/lib/python2.7/site-packages/botocore/client.py", line 357, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/ruden/envs/boto3/.venv/lib/python2.7/site-packages/botocore/client.py", line 676, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred () when calling the ListBuckets operation:

In the RGW logs (debug_rgw 20) we can't see any errors, but we can see that RGW completes the first request only after some time (10-30 seconds).

Haproxy default options (for all RGW frontends):

defaults
    mode http
    log global

    option httplog
    option http-ignore-probes
    option redispatch

    retries 1
    timeout queue 30s
    timeout connect 10s
    timeout client 30s
    timeout server 360s
    timeout http-request 5s
    timeout http-keep-alive 10s

The problem does not occur with:

option forceclose
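
Independently of the proxy settings, a client could avoid sending the offending request at all by checking the key length before calling put_object. Below is a minimal sketch of such a check. It assumes the 1024-byte limit applies to the UTF-8 encoded key; the names MAX_KEY_BYTES, key_length_bytes and is_valid_key_length are hypothetical helpers, not part of boto3 or RGW.

```python
# Hypothetical client-side guard; not part of boto3 or RGW.
MAX_KEY_BYTES = 1024  # limit quoted in this report


def key_length_bytes(key):
    """Return the length of the key in bytes (UTF-8 encoding assumed)."""
    return len(key.encode("utf-8"))


def is_valid_key_length(key, limit=MAX_KEY_BYTES):
    """True if the key is non-empty and fits within the byte limit."""
    return 0 < key_length_bytes(key) <= limit


if __name__ == "__main__":
    for key in ("a" * 1024, "a" * 2000):
        print(len(key), is_valid_key_length(key))
```

Note that the limit is counted in bytes, not characters, so a key made of multi-byte characters can exceed it with fewer than 1024 characters. This only sidesteps the trigger on well-behaved clients; it does not address the RGW stall itself.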
