Tasks #47369
closed
Ceph scales to 100s of hosts and 1000s of OSDs... can the orchestrator?
Added by Paul Cuzner over 3 years ago.
Updated over 2 years ago.
Description
This tracker is intended to help us identify areas in mgr/orchestrator, mgr/cephadm, cephadm, mgr/prometheus and mgr/rook that represent potential scale bottlenecks.
For example:
- list_daemons
- ceph_volume execution time
- orchestrator task parallelization
- lack of dashboard pagination
- dashboard prefetch strategy for 100s or 1000s of OSDs, RBD images, buckets
- mgr/prometheus reporting on 1000s of OSDs or 100s of hosts
- installation bandwidth (the demand image pull places on the network)
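A simple first step toward quantifying several of the items above is to time the relevant orchestrator commands on a test cluster. The helper below and the specific commands in the comments are illustrative only, not part of this tracker:

```python
import subprocess
import sys
import time

def time_cmd(cmd):
    """Run a command and return its wall-clock duration in seconds."""
    start = time.monotonic()
    subprocess.run(cmd, check=True, capture_output=True)
    return time.monotonic() - start

if __name__ == "__main__":
    # On a real cluster one might probe the areas listed above, e.g.:
    #   time_cmd(["ceph", "orch", "ps"])            # list_daemons path
    #   time_cmd(["ceph", "orch", "device", "ls"])  # ceph-volume inventory
    # Here the helper is demonstrated on a trivial command instead.
    print(f"{time_cmd([sys.executable, '-c', 'pass']):.3f}s")
```

Running such timings against clusters of increasing size would give concrete numbers for where latency grows superlinearly.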
There have been trackers in the past that focus on specific areas (https://tracker.ceph.com/issues/36451), but it would be great if we could look at scale issues holistically.
Please feel free to add information to this tracker documenting scale issues with the management layer that need to be considered.
A single Prometheus instance can, on a properly sized host, handle 1000 nodes. As an OSD usually shares a host with other OSDs, this is not an issue. Customers with 1000-OSD clusters haven't had any issues with Prometheus itself, though the Prometheus manager module caused some difficulties. In the meantime, however, the Prometheus manager module's cache has been overhauled, and patches have been contributed on Ceph's side to improve performance. Since then there haven't been any issues that I'm aware of.
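For reference, the scrape load a large cluster places on the Prometheus manager module is largely governed by the scrape interval and timeout. An illustrative prometheus.yml fragment for scraping the mgr/prometheus endpoint follows; the hostname and interval values are assumptions for the sketch, not recommendations from this tracker:

```yaml
scrape_configs:
  - job_name: ceph-mgr
    # Longer intervals reduce pressure on the mgr module's metrics cache
    # at large OSD counts.
    scrape_interval: 30s
    scrape_timeout: 25s
    static_configs:
      - targets: ['mgr-host.example.com:9283']  # 9283 is the default mgr/prometheus port
```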
- Related to Feature #47368: Provide a daemon mode for cephadm to handle host/daemon state requests added
- Related to Tasks #36451: mgr/dashboard: Scalability testing added
- Tracker changed from Bug to Tasks
Yes, we have users with > 1000 OSDs; that already works. :-)
I was talking with Yaarit about getting real figures from Telemetry, and she mentioned the following:
I asked Yaarit whether they can publish/gather more of these kinds of scale figures. I think this could be very useful for driving any effort in this area.
There was an issue with bucket aggregation in the heatmap panel: instead of 7 clusters with 4,096-5,792 OSDs each, there is actually 1 cluster (which was counted once per day over 7 days).
Sorry about that; I fixed it for now by forcing a 1-day interval size, but the better solution would be an 'average' option in Grafana for a given time interval (instead of the implicit 'sum' operation).
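The sum-versus-average distinction can be sketched in PromQL terms; the metric name below is hypothetical, and the actual telemetry dashboards may aggregate differently:

```promql
# Implicit behaviour: aggregating over a 7d bucket effectively sums the
# daily samples, so one cluster reporting daily is counted seven times.
sum_over_time(telemetry_cluster_osd_count[7d])

# Desired behaviour: average the samples over the interval, so a cluster
# reporting daily contributes its typical value once.
avg_over_time(telemetry_cluster_osd_count[7d])
```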
- Status changed from New to Resolved