Fix #51177 (closed)
pybind/mgr/volumes: investigate moving calls which may block on libcephfs into another thread
Description
The goal is to avoid blocking the ceph-mgr finisher thread on any calls out to CephFS. A blocked finisher thread can have disastrous consequences, as the mgr will then stop functioning.
(I think it may need changes to ceph-mgr to support this. Namely, the core ceph-mgr code handles the return from the command and creates a reply message. We need that reply message to be constructed within the module and sent out.)
Updated by Vikhyat Umrao over 1 year ago
Downstream BZ - https://bugzilla.redhat.com/show_bug.cgi?id=2114615
Updated by Venky Shankar over 1 year ago
- Assignee set to Kotresh Hiremath Ravishankar
- Target version set to v18.0.0
- Backport changed from pacific to pacific,quincy
Kotresh, please take a look at this.
Updated by Venky Shankar over 1 year ago
Spoke to Kotresh today - we may want to introduce an async command execution interface in plugins. The finisher thread would call it and hand over the request, and the plugin itself would then send the reply. Plugins can choose to implement this execution "mode" or keep the default blocking call made by the finisher thread.
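For illustration only, here is a minimal sketch of the handover idea in plain Python. It does not use the real ceph-mgr interfaces: `BlockingModule`, `AsyncModule`, `supports_async`, `handle_command_async`, `dispatch` and the `reply` callback are all hypothetical names, and the reply tuple merely mimics the usual (retval, stdout, stderr) shape.

```python
import threading
from typing import Callable, Optional, Tuple

Reply = Tuple[int, str, str]  # (retval, stdout, stderr); shape assumed for this sketch


class BlockingModule:
    """Default mode: the calling (finisher) thread waits for the result."""
    supports_async = False  # hypothetical opt-in flag

    def handle_command(self, cmd: dict) -> Reply:
        return 0, f"handled {cmd['prefix']}", ""


class AsyncModule(BlockingModule):
    """Opt-in mode: the module takes over the request and replies itself."""
    supports_async = True

    def handle_command_async(
        self, cmd: dict, reply: Callable[[Reply], None]
    ) -> threading.Thread:
        # Run the (possibly blocking) work on the module's own thread and
        # send the reply from there, so the caller returns immediately.
        def work() -> None:
            reply(self.handle_command(cmd))

        t = threading.Thread(target=work, daemon=True)
        t.start()
        return t


def dispatch(
    module: BlockingModule, cmd: dict, reply: Callable[[Reply], None]
) -> Optional[threading.Thread]:
    """Stand-in for the finisher thread: hand over if supported, else block."""
    if module.supports_async:
        return module.handle_command_async(cmd, reply)
    reply(module.handle_command(cmd))
    return None
```

In the async case `dispatch` returns as soon as the worker thread is started, which is the property we want for the finisher thread. The cost is that the reply has to be constructed and sent from the module side.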
Updated by Kotresh Hiremath Ravishankar over 1 year ago
- Pull request ID set to 47893
Discussion Summary with Patrick
1. Have a thread for each module to execute module commands. Since the finisher thread infrastructure is already in place, it's better to use one finisher thread per module.
Currently, with the draft PR, there is one finisher thread per module and one generic finisher thread through which everything else, such as config and notify, is done. This is different from the approach in comment 4. Both have their pros and cons. With this approach, if any command is stuck in a python module, only that module is affected (its subsequent commands wait) and other modules' commands go through. This is comparatively easy to implement. Comment 4's approach needs changes in every python module, and the effect of the asynchronous behaviour would need to be tested, as all module commands would become asynchronous.
2. Add a warning if the finisher thread's queue is growing or if a single command takes more than 15 secs.
3. Add extensive performance counters for osdmap and mdsmap changes and for commands processed (op time, command count).
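To make the three points above concrete, here is a rough Python sketch of a per-module worker with a queue-depth warning, a slow-command warning and two simple counters. It is not the code from the PR and does not use the mgr Finisher class; `ModuleFinisher`, `slow_secs`, `max_queue`, `warnings` and `counters` are hypothetical names chosen for this example.

```python
import queue
import threading
import time


class ModuleFinisher:
    """One worker thread and one queue per module (hypothetical sketch)."""

    def __init__(self, name, slow_secs=15.0, max_queue=100):
        self.name = name
        self.slow_secs = slow_secs    # threshold for the slow-command warning
        self.max_queue = max_queue    # threshold for the queue-depth warning
        self.warnings = []
        self.counters = {"command_count": 0, "op_time": 0.0}
        self._q = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, fn):
        """Queue a command; warn if the backlog exceeds max_queue."""
        self._q.put(fn)
        depth = self._q.qsize()  # approximate, the worker may already have dequeued
        if depth > self.max_queue:
            self.warnings.append(f"{self.name}: queue depth {depth}")

    def _run(self):
        while True:
            fn = self._q.get()
            if fn is None:  # sentinel queued by stop()
                break
            start = time.monotonic()
            try:
                fn()
            except Exception as e:
                self.warnings.append(f"{self.name}: command failed: {e}")
            elapsed = time.monotonic() - start
            self.counters["command_count"] += 1
            self.counters["op_time"] += elapsed
            if elapsed > self.slow_secs:
                self.warnings.append(
                    f"{self.name}: slow command took {elapsed:.1f}s")

    def stop(self):
        """Drain the queue, then stop the worker."""
        self._q.put(None)
        self._thread.join()
```

With one `ModuleFinisher` per module, a command stuck in one module only delays that module's queue, while the other modules' workers keep running. Note that this sketch records a slow command only after it returns; a command that never returns would need a separate watchdog to be reported.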
Updated by Kotresh Hiremath Ravishankar over 1 year ago
- Status changed from New to In Progress
Updated by Venky Shankar over 1 year ago
- Status changed from In Progress to Fix Under Review
Updated by Venky Shankar about 1 year ago
- Status changed from Fix Under Review to Pending Backport
- Backport changed from pacific,quincy to reef,quincy,pacific
Updated by Backport Bot about 1 year ago
- Copied to Backport #59415: reef: pybind/mgr/volumes: investigate moving calls which may block on libcephfs into another thread added
Updated by Backport Bot about 1 year ago
- Copied to Backport #59416: quincy: pybind/mgr/volumes: investigate moving calls which may block on libcephfs into another thread added
Updated by Backport Bot about 1 year ago
- Copied to Backport #59417: pacific: pybind/mgr/volumes: investigate moving calls which may block on libcephfs into another thread added
Updated by Patrick Donnelly 11 months ago
- Target version changed from v18.0.0 to v19.0.0
Updated by Patrick Donnelly 8 months ago
- Status changed from Pending Backport to Resolved