Bug #58555

open

External Prometheus support

Added by Juan Miguel Olmo Martínez over 1 year ago. Updated 10 months ago.

Status: New
Priority: Normal
Assignee: -
Category: cephadm/monitoring
Target version: -
% Done: 0%
Source: Development
Tags: monitoring
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

As a monitoring user, I want to configure my external Prometheus server only once, so that I can monitor several different Ceph clusters without any issues when one of the clusters is updated or its active manager changes.

What is the problem now:
As a monitoring user, I have configured my Prometheus server with a Prometheus configuration file that defines the scrape endpoints through a service discovery URL. I have found that when the active manager changes, the service discovery URL is no longer valid, and therefore the metrics from the cluster are no longer available.

Example:

Excerpt of the Prometheus configuration file:
scrape_configs:
  ...
  - job_name: 'cephcluster01'
    honor_labels: true
    http_sd_configs:
    - url: https://cephcluster01-node01:8765/sd/prometheus/sd-config?service=mgr-prometheus
      tls_config:
        ca_file: root_cert.pem
  ...

cephcluster01-node01 is the node where the active manager is running, but that is only true at this moment. If the active manager changes, the scrape config is no longer valid and I lose the metrics.
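The failure mode above can be illustrated with a small sketch that probes the service discovery URL on each candidate manager node and returns the first one that answers. This is only an illustration of the problem, not a feature of cephadm or Prometheus: the helper names (`sd_url`, `http_probe`, `find_active_sd_url`) and any host other than cephcluster01-node01 are hypothetical, and it assumes that only the node running the active manager serves the endpoint on port 8765.

```python
import ssl
import urllib.request
from typing import Callable, Iterable, Optional

# Path and port taken from the configuration excerpt in this report.
SD_PATH = "/sd/prometheus/sd-config?service=mgr-prometheus"
SD_PORT = 8765


def sd_url(host: str, port: int = SD_PORT) -> str:
    """Build the service discovery URL for one manager node."""
    return f"https://{host}:{port}{SD_PATH}"


def http_probe(url: str, ca_file: Optional[str] = None, timeout: float = 3.0) -> bool:
    """Return True if the URL answers with HTTP 200.

    ca_file plays the role of root_cert.pem from the excerpt (assumption:
    the endpoint certificate is signed by that CA).
    """
    try:
        context = ssl.create_default_context(cafile=ca_file)
        with urllib.request.urlopen(url, timeout=timeout, context=context) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, timeout, TLS failure, HTTP error, missing CA file.
        return False


def find_active_sd_url(
    hosts: Iterable[str],
    probe: Callable[[str], bool] = http_probe,
) -> Optional[str]:
    """Return the first service discovery URL that responds, or None."""
    for host in hosts:
        url = sd_url(host)
        if probe(url):
            return url
    return None
```

With a hard-coded host in `http_sd_configs`, Prometheus effectively performs this lookup for a single node only, which is why a manager failover leaves it without a valid endpoint. Passing a stub as `probe` lets the lookup logic be exercised without a running cluster.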

What is the value for the user:
The external Prometheus server used to collect the Ceph cluster metrics is configured only once and keeps working after manager changes.

Actions #1

Updated by Redouane Kachach Elhichou 10 months ago

  • Assignee deleted (Juan Miguel Olmo Martínez)