Bug #5245

closed

Frequent 500s from radosgw

Added by Jiri Brunclik almost 11 years ago. Updated over 10 years ago.

Status: Can't reproduce
Priority: Normal
Assignee:
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

I have roughly 30 clients talking simultaneously to radosgw over a 1 Gbps link. I use the boto library on the client side.

I frequently get a 500 error when I try to fetch files from radosgw. For now I have implemented retry logic in my code, but obviously that's not the right solution :).
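For illustration, a client-side retry of the kind mentioned above might look roughly like the sketch below. It is not the code used in this report: `fetch_with_retry`, its parameters, and the backoff values are all hypothetical, and `fetch` stands in for whatever call downloads the object (for example a wrapper around a boto key download). It is written for Python 3, where `IncompleteRead` lives in `http.client`; in Python 2 the same exception is in `httplib`.

```python
import time
from http.client import IncompleteRead


def fetch_with_retry(fetch, max_attempts=5, base_delay=0.5,
                     retry_on=(IncompleteRead,), sleep=time.sleep):
    """Call fetch() until it succeeds, retrying on truncated responses.

    fetch        -- hypothetical zero-argument callable returning the body
    max_attempts -- total number of tries before giving up (assumed value)
    base_delay   -- first backoff delay in seconds (assumed value)
    retry_on     -- exception types treated as retryable
    sleep        -- injectable sleep function, so tests need not wait
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts:
                # Out of attempts: let the caller see the last error.
                raise
            # Exponential backoff: base_delay, 2*base_delay, 4*base_delay, ...
            sleep(base_delay * 2 ** (attempt - 1))
```

With boto, the HTTP 500 itself would probably surface as `boto.exception.S3ResponseError`, which could be added to `retry_on`. As noted, this only hides the symptom on the client; it does nothing about the aborted FastCGI connection on the server side.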

The servers are running Apache, but I also tried nginx and it showed the same behavior.

This is what I see on the client:

IncompleteRead: IncompleteRead(100655630 bytes read, 4201971 more expected)

This is what shows up in Apache logs:

1.2.3.4 - - [04/Jun/2013:11:26:52 +0200] "GET /foo/bar HTTP/1.1" 500 100655630 "-" "Boto/2.2.2 (linux2)" 
[Tue Jun 04 11:26:52 2013] [error] [client 1.2.3.4] (4)Interrupted system call: FastCGI: comm with server "/var/www/radosgw" aborted: select() failed
[Tue Jun 04 11:26:52 2013] [error] [client 1.2.3.4] Handler for fastcgi-script returned invalid result code 1

And finally this is an excerpt from radosgw debug log:

7f517fff7700  0 NOTICE: failed to send response to client
7f517fff7700  0 ERROR: s->cio->print() returned err=-1
7f517fff7700  0 ERROR: s->cio->print() returned err=-1
7f517fff7700  0 ERROR: s->cio->print() returned err=-1
7f517fff7700  0 ERROR: s->cio->print() returned err=-1
7f517fff7700  2 req 6:9.736544:s3:GET /foo/bar:get_obj:http status=403
7f5182ffd700 20 rados->read r=0 bl.length=524288
7f517fff7700  1 ====== req done req=0x1998af0 http_status=403 ======

Installed packages:

ii  librados2                           0.56.6-1~bpo60+1             RADOS distributed object store client library
ii  radosgw                             0.56.6-1~bpo60+1             REST gateway for RADOS distributed object store
ii  ceph                                0.56.6-1~bpo60+1             distributed storage and file system

Please advise how to handle this.

Thanks,

Jiri
