Fix #62712
pybind/mgr/volumes: implement EAGAIN logic for clearing request queue when under load
Description
Even with the recent changes to ceph-mgr (#51177) to give each module a separate finisher thread, the requests queued in the volumes plugin's finisher thread may still occupy all the slots for the ceph-mgr throttles. This can eventually lead to a similar DoS, where slowness or bugs in the volumes plugin cause larger-scale cluster issues.
A possible way to resolve this durably is for the DaemonServer to skip queueing the command if the mod_finisher queue is past some threshold and instead return `EAGAIN` as an error from the module.
Beyond this being a fairly simple structural change, it is important to check that ceph-csi is equipped to handle these kinds of retryable errors from the storage backend.
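As a hypothetical sketch of the idea (the `ModuleFinisher` class, `max_queue_depth` name, and the default of 128 below are illustrative assumptions, not the actual ceph-mgr API, which is implemented in C++):

```python
import errno


class ModuleFinisher:
    """Illustrative stand-in for a per-module finisher queue in ceph-mgr."""

    def __init__(self, max_queue_depth=128):
        # Hypothetical threshold: beyond this backlog, refuse new requests.
        self.max_queue_depth = max_queue_depth
        self.queue = []

    def try_enqueue(self, request):
        # Refuse new work once the backlog passes the threshold, instead of
        # letting a slow or stuck module exhaust the mgr-wide throttle slots.
        if len(self.queue) >= self.max_queue_depth:
            return -errno.EAGAIN
        self.queue.append(request)
        return 0
```

The key point is that the check happens before queueing, so a wedged finisher thread cannot accumulate an unbounded backlog of mgr commands.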
Updated by Venky Shankar 8 months ago
Patrick Donnelly wrote:
> Even with the recent changes to the ceph-mgr (#51177) to have a separate finisher thread for each module, the requests queued in the volumes plugin finisher thread may still occupy all the slots for the ceph-mgr throttles. This will eventually result in a similar DoS where the volumes plugin's slowness or bugs may result in larger-scale cluster issues.
> A possible way to resolve this in a durable way is for the DaemonServer to skip queueing the command if the mod_finisher queue is past some threshold. Instead, return `EAGAIN` as an error from the module.
> Beyond this being a fairly simple structural change, it'd be important to check ceph-csi is equipped to handle these types of retry errors from the storage backend.
It likely requires ceph-csi catching EAGAIN and retrying at a later point. A similar change (in mgr/volumes, however) is https://tracker.ceph.com/issues/59714: if all the cloner threads are busy cloning, the snapshot clone request is denied with errno -EAGAIN. ceph-csi handles this special errno and retries at a later point.
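On the client side, the retry-on-EAGAIN behaviour referenced above could look roughly like the following sketch (the `call_with_retry` helper and its parameters are hypothetical; ceph-csi itself is written in Go, this is just the shape of the logic):

```python
import errno
import time


def call_with_retry(op, attempts=5, delay=0.01):
    """Retry an operation that may return -EAGAIN while the backend is busy.

    `op` returns 0 on success or a negative errno. -EAGAIN is treated as
    "try again later", mirroring how a retry-aware client such as ceph-csi
    could react to a loaded mgr request queue.
    """
    for attempt in range(attempts):
        rc = op()
        if rc != -errno.EAGAIN:
            return rc  # success, or a non-retryable error
        time.sleep(delay * (2 ** attempt))  # exponential backoff between tries
    return -errno.EAGAIN  # still busy after all attempts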
Updated by Patrick Donnelly 8 months ago
- Related to Feature #59714: mgr/volumes: Support to reject CephFS clones if cloner threads are not available added
Updated by Venky Shankar 8 months ago
- Category set to Administration/Usability
- Assignee set to Neeraj Pratap Singh
Updated by Venky Shankar 8 months ago
Patrick,
Going by the description here, I assume this change is only for the volumes plugin. In case the changes are general to ceph-mgr and could possibly affect other plugins, we would want to make this behaviour optional (but enabled for the volumes plugin if ceph-csi is fine with it).
Updated by Patrick Donnelly 8 months ago
Venky Shankar wrote:
> Going by the description here, I assume this change is only for the volumes plugin. In case the changes are general to ceph-mgr and could possibly affect other plugins, we would want to make this behaviour optional (but enabled for the volumes plugin if ceph-csi is fine with it).
I think it should be common to all module finishers, implemented at a higher level. Because the volumes plugin finisher thread may be stuck, it's not possible to implement this reliably in the volumes plugin itself.
Updated by Venky Shankar 8 months ago
Patrick Donnelly wrote:
> I think it should be common to all module finishers, implemented at a higher level. Because the volumes plugin finisher thread may be stuck, it's not possible to implement this reliably in the volumes plugin itself.
Right, I didn't mean to implement this at the module level, but given that mgr/volumes is where most of the slowness comes from, should we keep the behaviour (returning -EAGAIN when the request queue is loaded) for mgr/volumes rather than for all plugins?
Updated by Neeraj Pratap Singh 8 months ago
- Category deleted (Administration/Usability)
- Assignee deleted (Neeraj Pratap Singh)
Venky Shankar wrote:
> Right, I didn't mean to implement this at the module level, but given that mgr/volumes is where most of the slowness comes from, should we keep the behaviour (returning -EAGAIN when the request queue is loaded) for mgr/volumes rather than for all plugins?
Are we still waiting on Patrick here, or have we reached a conclusion?
Updated by Neeraj Pratap Singh 8 months ago
- Category set to Administration/Usability
- Assignee set to Neeraj Pratap Singh
Updated by Venky Shankar 8 months ago
Neeraj Pratap Singh wrote:
> Are we still waiting on Patrick here, or have we reached a conclusion?
I guess we are waiting here on Patrick, or we have reached to any conclusion?
IMO the way forward is clear -- the change needs to be generic for all plugin finishers (we have one per plugin now). We need to decide on a sensible default threshold for the queue length.
Updated by Patrick Donnelly 8 months ago
Venky Shankar wrote:
> IMO the way forward is clear -- the change needs to be generic for all plugin finishers (we have one per plugin now). We need to decide on a sensible default threshold for the queue length.
It can be an opt-in behavior for modules. Have the module call some method to set a field in its ActivePyModule to maintain a limit on its request queue (as described in this ticket).
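A hypothetical sketch of that opt-in shape (the simplified `ActivePyModule` class and the `set_command_queue_limit` method name below are illustrative; the real `ActivePyModule` is a C++ class in ceph-mgr):

```python
import errno


class ActivePyModule:
    """Illustrative sketch: request-queue limiting is opt-in per module."""

    def __init__(self, name):
        self.name = name
        self.queue = []
        self.queue_limit = None  # None == unlimited; modules opt in explicitly

    def set_command_queue_limit(self, limit):
        # A module whose clients tolerate -EAGAIN (e.g. mgr/volumes with a
        # retry-aware ceph-csi) opts in by setting a limit on its queue.
        self.queue_limit = limit

    def enqueue_command(self, cmd):
        # Without an opt-in limit, behaviour is unchanged; with one, the
        # command is rejected once the backlog reaches the limit.
        if self.queue_limit is not None and len(self.queue) >= self.queue_limit:
            return -errno.EAGAIN
        self.queue.append(cmd)
        return 0
```

Keeping the default as "no limit" preserves current behaviour for modules that have not been audited for EAGAIN-safe clients.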
Updated by Neeraj Pratap Singh 8 months ago
Patrick Donnelly wrote:
> It can be an opt-in behavior for modules. Have the module call some method to set a field in its ActivePyModule to maintain a limit on its request queue (as described in this ticket).
Just for clarity, do you mean we should have a separate method for setting the limit on a module's request queue?
Updated by Patrick Donnelly 8 months ago
Neeraj Pratap Singh wrote:
> Just for clarity, do you mean we should have a separate method for setting the limit on a module's request queue?
That would make sense to me.