Bug #45432

open

fastfail of client requests for homeless session scenario

Added by Or Friedmann almost 4 years ago. Updated about 2 years ago.

Status:
Fix Under Review
Priority:
Normal
Assignee:
Or Friedmann
Target version:
% Done:
0%

Source:
Tags:
Backport:
octopus pacific
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

[Problem] In the current radosgw behaviour, every client request makes a blocking call to the OSD to fetch the object. In the homeless session case, i.e. when no OSD for that object is available to serve the data, the rgw thread hangs indefinitely waiting for an OSD to become active. If multiple such requests arrive, all radosgw threads are exhausted, each waiting indefinitely for its OSD to come back. The result is complete service unavailability: even though many other OSDs are active and could serve valid client requests, no rgw thread is free to take an incoming request.

[Solution] There is no point in waiting indefinitely when all the OSDs for an object are down. In that case the op should be cancelled so that the radosgw thread is free to take further valid incoming requests. This behaviour should also be controlled by a tunable in ceph.conf so that the feature can be enabled or disabled.
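
To illustrate the difference, here is a minimal simulation. It is not radosgw code. It assumes a fixed-size worker pool standing in for the rgw threads and a boolean fail_fast flag standing in for the proposed ceph.conf tunable. All names (HomelessOpError, read_object, run_requests, fail_fast) are hypothetical and do not come from Ceph.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, TimeoutError as FutureTimeout


class HomelessOpError(Exception):
    """Hypothetical error for an op cancelled because no OSD can serve it."""


def read_object(up_osds, fail_fast, osd_back):
    # up_osds: ids of OSDs currently able to serve the object (empty = homeless).
    if not up_osds:
        if fail_fast:
            # Proposed behaviour: cancel the op and free the worker thread.
            raise HomelessOpError("no OSD available for object")
        # Current behaviour: block until an OSD comes back.
        osd_back.wait()
    return "served"


def run_requests(fail_fast, pool_size=4, timeout=1.0):
    # osd_back is only set during cleanup, so homeless ops never complete
    # on their own while the healthy request is being timed.
    osd_back = threading.Event()
    pool = ThreadPoolExecutor(max_workers=pool_size)
    try:
        # Queue one homeless request per worker thread first...
        for _ in range(pool_size):
            pool.submit(read_object, [], fail_fast, osd_back)
        # ...then a request whose OSDs are up.
        healthy = pool.submit(read_object, [0, 1], fail_fast, osd_back)
        try:
            return healthy.result(timeout=timeout)
        except FutureTimeout:
            return "starved"
    finally:
        # Release any blocked workers so the pool can shut down cleanly.
        osd_back.set()
        pool.shutdown(wait=True)
```

With fail_fast disabled, run_requests(False, timeout=0.2) returns "starved": every worker is stuck on a homeless op and the healthy request never gets a thread. With fail_fast enabled, run_requests(True) returns "served", because the homeless ops fail immediately and release their workers.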
