Bug #59563

open

busy looping caused by remote request ETIMEDOUT handling logic

Added by Ilya Dryomov about 1 year ago.

Status: In Progress
Priority: Normal
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This originated with a proxied "rbd feature disable" operation: DisableFeaturesRequest returned ETIMEDOUT and the "rbd feature disable" command got stuck in a busy loop, see https://tracker.ceph.com/issues/58740. DisableFeaturesRequest is being changed in https://github.com/ceph/ceph/pull/50593 to avoid generating ETIMEDOUT in that particular case, but the general issue remains: an operation handler is allowed to return an arbitrary error code. If it happens to return ETIMEDOUT (not because of a notification-related issue but just on its own) while being executed on behalf of some other client (i.e. a "remote" request), that client gets into a busy loop.

If the operation handler is executed locally and happens to return ETIMEDOUT, the error is propagated to the API immediately. If the same operation handler is executed remotely and returns ETIMEDOUT in exactly the same way, the requestor gets into a busy loop and the API appears to hang. This applies to all operations that can be proxied.

"Remote" requests should behave exactly the same as "local" requests as far as error propagation goes; operation proxying must be completely transparent to the user.

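To make the failure mode concrete, here is a toy model of the retry logic described above. It is only a sketch and not librbd code: `RemoteReply`, `send_remote`, `naive_requestor` and `transparent_requestor` are hypothetical names, and the `max_attempts` cap exists only so the example terminates (the behaviour reported here has no such cap). The assumption is that the requestor retries whenever it sees ETIMEDOUT and cannot tell a notification timeout from an ETIMEDOUT returned by the handler itself.

```python
import errno
from dataclasses import dataclass
from typing import Callable, Optional, Tuple


@dataclass
class RemoteReply:
    # Hypothetical reply shape, not a librbd structure.
    delivered: bool  # False if the notification itself timed out
    result: int      # handler's return code, meaningful only if delivered


def send_remote(handler: Callable[[], int]) -> RemoteReply:
    """Pretend to proxy the operation to another client, which runs handler."""
    return RemoteReply(delivered=True, result=handler())


def naive_requestor(handler: Callable[[], int],
                    max_attempts: int = 5) -> Tuple[Optional[int], int]:
    """Folds transport and handler errors into one code, retries on ETIMEDOUT."""
    attempts = 0
    while attempts < max_attempts:  # cap added for the demo only
        attempts += 1
        reply = send_remote(handler)
        r = reply.result if reply.delivered else -errno.ETIMEDOUT
        if r != -errno.ETIMEDOUT:
            return r, attempts
        # ETIMEDOUT is assumed to mean "notification lost", so try again.
    return None, attempts  # stands in for the API call appearing to hang


def transparent_requestor(handler: Callable[[], int],
                          max_attempts: int = 5) -> Tuple[Optional[int], int]:
    """Retries only when the notification was not delivered."""
    attempts = 0
    while attempts < max_attempts:
        attempts += 1
        reply = send_remote(handler)
        if reply.delivered:
            # Whatever the handler returned is propagated, as in the local case.
            return reply.result, attempts
    return -errno.ETIMEDOUT, attempts


def timed_out_handler() -> int:
    """An operation handler that returns ETIMEDOUT on its own."""
    return -errno.ETIMEDOUT
```

With `timed_out_handler`, `naive_requestor` exhausts all of its attempts without ever returning an error code, while `transparent_requestor` returns `-errno.ETIMEDOUT` after a single attempt, matching what a local call to the handler would report. Keeping the delivery status separate from the handler's result is one possible way to get that transparency; it is an illustration, not a description of the eventual fix.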

Related issues 1 (0 open, 1 closed)

Related to rbd - Bug #58740: "rbd feature disable" remote request hangs when proxied to rbd-nbd (Resolved, Prasanna Kumar Kalever)

#1

Updated by Ilya Dryomov about 1 year ago

  • Related to Bug #58740: "rbd feature disable" remote request hangs when proxied to rbd-nbd added