Fix #62712
pybind/mgr/volumes: implement EAGAIN logic for clearing request queue when under load
Description
Even with the recent changes to ceph-mgr (#51177) to give each module a separate finisher thread, the requests queued in the volumes plugin's finisher thread may still occupy all the slots for the ceph-mgr throttles. This can eventually lead to a similar DoS, where slowness or bugs in the volumes plugin cause larger-scale cluster issues.
A possible way to resolve this durably is for the DaemonServer to skip queueing the command if the mod_finisher queue is past some threshold and instead return `EAGAIN` as an error from the module.
Beyond this being a fairly simple structural change, it is important to check that ceph-csi is equipped to handle these kinds of retryable errors from the storage backend.
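As a hypothetical sketch of the idea (the `ModuleFinisher` class, `max_queue_depth` name, and the default of 128 below are illustrative assumptions, not the actual ceph-mgr API, which is implemented in C++):

```python
import errno


class ModuleFinisher:
    """Illustrative stand-in for a per-module finisher queue in ceph-mgr."""

    def __init__(self, max_queue_depth=128):
        # Hypothetical threshold: beyond this backlog, refuse new requests.
        self.max_queue_depth = max_queue_depth
        self.queue = []

    def try_enqueue(self, request):
        # Refuse new work once the backlog passes the threshold, instead of
        # letting a slow or stuck module exhaust the mgr-wide throttle slots.
        if len(self.queue) >= self.max_queue_depth:
            return -errno.EAGAIN
        self.queue.append(request)
        return 0
```

The key point is that the check happens before queueing, so a wedged finisher thread cannot accumulate an unbounded backlog of mgr commands.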
Updated by Venky Shankar 8 months ago
Patrick Donnelly wrote:
> Even with the recent changes to the ceph-mgr (#51177) to have a separate finisher thread for each module, the requests queued in the volumes plugin finisher thread may still occupy all the slots for the ceph-mgr throttles. This will eventually result in a similar DoS where the volumes plugin's slowness or bugs may result in larger-scale cluster issues.
> A possible way to resolve this in a durable way is for the DaemonServer to skip queueing the command if the mod_finisher queue is past some threshold. Instead, return `EAGAIN` as an error from the module.
> Beyond this being a fairly simple structural change, it'd be important to check ceph-csi is equipped to handle these types of retry errors from the storage backend.
It likely requires ceph-csi catching EAGAIN and retrying at a later point. A similar change (in mgr/volumes, however) is https://tracker.ceph.com/issues/59714: if all the cloner threads are busy cloning, the snapshot clone request is denied with errno -EAGAIN. ceph-csi handles this special errno and retries at a later point.
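On the client side, the retry-on-EAGAIN behaviour referenced above could look roughly like the following sketch (the `call_with_retry` helper and its parameters are hypothetical; ceph-csi itself is written in Go, this is just the shape of the logic):

```python
import errno
import time


def call_with_retry(op, attempts=5, delay=0.01):
    """Retry an operation that may return -EAGAIN while the backend is busy.

    `op` returns 0 on success or a negative errno. -EAGAIN is treated as
    "try again later", mirroring how a retry-aware client such as ceph-csi
    could react to a loaded mgr request queue.
    """
    for attempt in range(attempts):
        rc = op()
        if rc != -errno.EAGAIN:
            return rc  # success, or a non-retryable error
        time.sleep(delay * (2 ** attempt))  # exponential backoff between tries
    return -errno.EAGAIN  # still busy after all attempts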
Updated by Patrick Donnelly 8 months ago
- Related to Feature #59714: mgr/volumes: Support to reject CephFS clones if cloner threads are not available added
Updated by Venky Shankar 8 months ago
- Category set to Administration/Usability
- Assignee set to Neeraj Pratap Singh
Updated by Venky Shankar 8 months ago
Patrick,
Going by the description here, I assume this change is only for the volumes plugin. In case the changes are general to ceph-mgr and could possibly affect other plugins, we would want to make this behaviour optional (but enabled for the volumes plugin if ceph-csi is fine with it).
Updated by Patrick Donnelly 8 months ago
Venky Shankar wrote:
> Going by the description here, I assume this change is only for the volumes plugin. In case the changes are general to ceph-mgr and could possibly affect other plugins, we would want to make this behaviour optional (but enabled for the volumes plugin if ceph-csi is fine with it).
I think it should be common to all module finishers, implemented at a higher level. Because the volumes plugin finisher thread may be stuck, it's not possible to implement this reliably in the volumes plugin itself.
Updated by Venky Shankar 8 months ago
Patrick Donnelly wrote:
> I think it should be common to all module finishers, implemented at a higher level. Because the volumes plugin finisher thread may be stuck, it's not possible to implement this reliably in the volumes plugin itself.
Right, I didn't mean to implement this at the module level, but given that mgr/volumes is where most of the slowness comes from, should we keep the behaviour (returning -EAGAIN when the request queue is loaded) for mgr/volumes rather than for all plugins?
Updated by Neeraj Pratap Singh 8 months ago
- Category deleted (Administration/Usability)
- Assignee deleted (Neeraj Pratap Singh)
Venky Shankar wrote:
> Right, I didn't mean to implement this at the module level, but given that mgr/volumes is where most of the slowness comes from, should we keep the behaviour (returning -EAGAIN when the request queue is loaded) for mgr/volumes rather than for all plugins?
Are we still waiting on Patrick here, or have we reached a conclusion?
Updated by Neeraj Pratap Singh 8 months ago
- Category set to Administration/Usability
- Assignee set to Neeraj Pratap Singh
Updated by Venky Shankar 8 months ago
Neeraj Pratap Singh wrote:
> Are we still waiting on Patrick here, or have we reached a conclusion?
I guess we are waiting here on Patrick, or we have reached to any conclusion?
IMO the way forward is clear -- the change needs to be generic for all plugin finishers (we have one per plugin now). We need to decide on a sensible default threshold for the queue length.
Updated by Patrick Donnelly 8 months ago
Venky Shankar wrote:
> IMO the way forward is clear -- the change needs to be generic for all plugin finishers (we have one per plugin now). We need to decide on a sensible default threshold for the queue length.
It can be an opt-in behavior for modules. Have the module call some method to set a field in its ActivePyModule to maintain a limit on its request queue (as described in this ticket).
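A hypothetical sketch of that opt-in shape (the simplified `ActivePyModule` class and the `set_command_queue_limit` method name below are illustrative; the real `ActivePyModule` is a C++ class in ceph-mgr):

```python
import errno


class ActivePyModule:
    """Illustrative sketch: request-queue limiting is opt-in per module."""

    def __init__(self, name):
        self.name = name
        self.queue = []
        self.queue_limit = None  # None == unlimited; modules opt in explicitly

    def set_command_queue_limit(self, limit):
        # A module whose clients tolerate -EAGAIN (e.g. mgr/volumes with a
        # retry-aware ceph-csi) opts in by setting a limit on its queue.
        self.queue_limit = limit

    def enqueue_command(self, cmd):
        # Without an opt-in limit, behaviour is unchanged; with one, the
        # command is rejected once the backlog reaches the limit.
        if self.queue_limit is not None and len(self.queue) >= self.queue_limit:
            return -errno.EAGAIN
        self.queue.append(cmd)
        return 0
```

Keeping the default as "no limit" preserves current behaviour for modules that have not been audited for EAGAIN-safe clients.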
Updated by Neeraj Pratap Singh 8 months ago
Patrick Donnelly wrote:
> It can be an opt-in behavior for modules. Have the module call some method to set a field in its ActivePyModule to maintain a limit on its request queue (as described in this ticket).
Just for clarity, do you mean we should have a separate method for setting the limit on a module's request queue?
Updated by Patrick Donnelly 8 months ago
Neeraj Pratap Singh wrote:
> Just for clarity, do you mean we should have a separate method for setting the limit on a module's request queue?
That would make sense to me.