Bug #48115

rgw stops responding correctly, error log full of messages: ceph s3:get_obj Scheduling request failed with -2218

Added by Matthew Darwin over 3 years ago. Updated over 3 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

rgw stops responding correctly, and its error log fills with messages of the form: ceph s3:get_obj Scheduling request failed with -2218.

After restarting rgw, things go back to normal.

Using the Debian package: 15.2.5-1~bpo10+1

Example log entry:

2020-11-03T10:32:33.799+0000 7f129a267700 1 ====== starting new request req=0x7f11d90c6680 =====
2020-11-03T10:32:33.799+0000 7f129a267700 0 req 5594040 0s s3:get_obj Scheduling request failed with -2218
2020-11-03T10:32:33.799+0000 7f129a267700 1 op->ERRORHANDLER: err_no=-2218 new_err_no=-2218
2020-11-03T10:32:33.799+0000 7f129a267700 1 ====== req done req=0x7f11d90c6680 op status=0 http_status=503 latency=0s ======
2020-11-03T10:32:33.799+0000 7f129a267700 1 beast: 0x7f11d90c6680: XXX.XXX.XXX.XXX - - [2020-11-03T10:32:33.799466+0000] "HEAD /xxxxxxxxxxx HTTP/1.1" 503 0 - "aws-sdk-go/1.25.43 (go1.14.2; linux; amd64)" -

The problem affected all URLs and all connected hosts on this rgw. The other rgw instances in the cluster were operating normally.

I'm not sure what else is needed to debug this issue. Please advise.

Some of the OSDs were unstable (briefly offline) around the time of this issue. I am not sure whether that is related.
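
To help narrow down when the failures started (for example, to compare against the OSD instability), the "Scheduling request failed" lines can be tallied per minute. The following is only a rough sketch: the regular expression is an assumption based on the single sample log entry above, not on a documented rgw log format, and the function name count_scheduling_failures is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the sample log entry in this report (an assumption,
# not a documented format): ISO timestamp, then free text, then the message.
SCHED_FAIL = re.compile(
    r"^(?P<minute>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}):\d{2}\.\d+[+-]\d{4} .*"
    r"Scheduling request failed with (?P<err>-?\d+)"
)


def count_scheduling_failures(lines):
    """Count failed-scheduling log lines, keyed by (minute, error code)."""
    counts = Counter()
    for line in lines:
        m = SCHED_FAIL.match(line)
        if m:
            counts[(m.group("minute"), int(m.group("err")))] += 1
    return counts


# Two lines taken from the example log entry above.
sample = [
    "2020-11-03T10:32:33.799+0000 7f129a267700 1 ====== starting new request req=0x7f11d90c6680 =====",
    "2020-11-03T10:32:33.799+0000 7f129a267700 0 req 5594040 0s s3:get_obj Scheduling request failed with -2218",
]

for (minute, err), n in sorted(count_scheduling_failures(sample).items()):
    print(minute, err, n)
```

To run this against a real log, pass an open file object for the rgw log instead of the sample list; the function only iterates over lines, so it does not load the whole file into memory.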
