Bug #58555
open
External Prometheus support
0%
Description
As a monitoring user, I want to configure my external Prometheus server only once, so that I can monitor several different Ceph clusters without any issues when one of the clusters is updated or its active manager changes.
What is the problem now:
As a monitoring user, I have configured my Prometheus server with a Prometheus configuration file that defines the scrape endpoints through a service discovery URL. I have found that when the active manager changes, the service discovery URL is no longer valid, so the metrics from the cluster are no longer available.
Example:
Excerpt of the Prometheus configuration file:
scrape_configs:
  ...
  - job_name: 'cephcluster01'
    honor_labels: true
    http_sd_configs:
      - url: https://cephcluster01-node01:8765/sd/prometheus/sd-config?service=mgr-prometheus
        tls_config:
          ca_file: root_cert.pem
  ...
cephcluster01-node01 is the node where the active manager is running, but this is only true at this moment. If the active manager moves to another node, the scrape configuration is no longer valid and I lose the metrics.
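To illustrate the failure mode, here is a minimal Python sketch that probes a list of candidate manager nodes and returns the first service discovery URL that answers. It is not part of Ceph or Prometheus: the helper names (sd_url, http_probe, find_sd_url) and the second host name cephcluster01-node02 are hypothetical, and it assumes that only the node running the active manager serves the endpoint on port 8765, as described above.

```python
import ssl
import urllib.request
from typing import Callable, Iterable, Optional

# Path and query string taken from the configuration excerpt in this report.
SD_PATH = "/sd/prometheus/sd-config?service=mgr-prometheus"


def sd_url(host: str, port: int = 8765) -> str:
    """Build the service discovery URL for one candidate manager node."""
    return f"https://{host}:{port}{SD_PATH}"


def http_probe(url: str, ca_file: str = "root_cert.pem", timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200 (hypothetical helper)."""
    try:
        ctx = ssl.create_default_context(cafile=ca_file)
        with urllib.request.urlopen(url, timeout=timeout, context=ctx) as resp:
            return resp.status == 200
    except OSError:
        # Covers connection errors, TLS errors, HTTP errors and a missing CA file.
        return False


def find_sd_url(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first service discovery URL that responds, or None."""
    for host in hosts:
        url = sd_url(host)
        if probe(url):
            return url
    return None
```

Calling find_sd_url(["cephcluster01-node01", "cephcluster01-node02"]) after a failover would return the URL on whichever node currently answers. Having to run such a check and rewrite the Prometheus configuration after every manager change is exactly the manual step this request wants to avoid.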
What is the value for the user:
Configure the external Prometheus server used to collect the Ceph cluster metrics only once, with no need to change it afterwards.
Updated by Redouane Kachach Elhichou 10 months ago
- Assignee deleted (Juan Miguel Olmo Martínez)