Bug #43680

parallelize osd provisioning

Added by Sebastian Wagner 6 months ago. Updated 4 months ago.

Status: Resolved
Priority: High
Assignee: -
Category: cephadm
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

Parallelism in the ssh orchestrator is not trivial:

https://github.com/ceph/ceph/blob/713db2994dd262a687612783c85f7929e9041d9c/src/pybind/mgr/ssh/module.py#L272

sets the thread pool size to 1, which is really suboptimal. Changing this requires something like a Lock to prevent conflicts between concurrently running tasks.
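To illustrate the idea of a larger pool combined with a lock, here is a minimal sketch. It is not the ssh module's actual code: `provision_osd`, `provision_all`, and the `inventory` dict are hypothetical names, and `concurrent.futures.ThreadPoolExecutor` is used here only as a convenient stand-in for whatever pool the module uses.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


def provision_osd(host, inventory, lock):
    """Hypothetical per-host task; a real one would run a remote command."""
    result = f"osd created on {host}"  # placeholder for the slow remote call
    # Only the update of shared state is serialized by the lock;
    # the (slow) per-host work above can run concurrently.
    with lock:
        inventory[host] = result
    return host


def provision_all(hosts, max_workers=10):
    """Run provision_osd for every host on a pool of max_workers threads."""
    inventory = {}            # shared state, assumed to need protection
    lock = threading.Lock()
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces all tasks to finish and re-raises any exception.
        list(pool.map(lambda h: provision_osd(h, inventory, lock), hosts))
    return inventory


if __name__ == "__main__":
    print(provision_all([f"host{i}" for i in range(25)]))
```

With a pool size of 10, at most ten hosts are handled at a time, so the gain is bounded by the pool size.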

But this is still not really sufficient for a big cluster: we cannot increase this limit indefinitely, because we're talking about kernel threads here. We can improve on this by using something like https://pypi.org/project/parallel-ssh/ instead of remoto.

Regarding concurrency, there is an open PR, ceph/ceph: Pull Request 26565, to allow executing multiple commands concurrently.

So, these are the options we have right now:

| What                                   | Effort | Result |
|----------------------------------------|--------|---------------------|
| increase thread pool size to 10        | small  | limited improvement |
| https://pypi.org/project/parallel-ssh/ | big    | big improvement     |
| pyzmq instead of SSH                   | huge   | everything parallel |

History

#1 Updated by Sebastian Wagner 6 months ago

  • Target version set to v15.0.0

#2 Updated by Sebastian Wagner 5 months ago

  • Priority changed from Normal to High

#3 Updated by Sebastian Wagner 4 months ago

  • Status changed from New to Resolved
  • Pull request ID set to 33463
