Feature #52638

closed

mgr/prometheus: Add all healthchecks to prometheus output and provide a way of viewing history

Added by Paul Cuzner over 2 years ago. Updated about 2 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
prometheus module
Target version:
% Done:

0%

Source:
Tags:
Backport:
pacific
Reviewed:
Affected Versions:
Pull request ID:

Description

The mgr/prometheus module does not provide a granular view of healthchecks to Prometheus, which means some alerts rely on the generic ceph_health_status > 0 expression. This is not very informative and is a common source of frustration.

This feature provides a metric per encountered healthcheck, so alert rules can be customised to specific events. In addition, since the healthchecks need to be emitted on each scrape, the module needs to persist healthcheck state, which opens the door to providing a healthcheck history. The history would be exposed by a new command, allowing the admin to see which healthchecks have been encountered within the cluster, how often they occurred, and when each was first and last seen.
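To make the idea concrete, here is a rough sketch of how persisted healthcheck state could drive both the per-check metric and the history view. It is only an illustration under assumptions: the class names (`HealthCheckRecord`, `HealthHistory`), the metric name `healthcheck_active`, and the choice to count occurrences rather than scrapes are all hypothetical and are not taken from the mgr/prometheus module.

```python
import time
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass
class HealthCheckRecord:
    """History entry for one healthcheck name (hypothetical structure)."""
    name: str
    first_seen: float
    last_seen: float
    count: int = 1       # number of separate occurrences, not scrapes
    active: bool = True


class HealthHistory:
    """Remembers every healthcheck ever seen so it can be emitted on each scrape."""

    def __init__(self) -> None:
        self.records: Dict[str, HealthCheckRecord] = {}

    def update(self, active_checks: Iterable[str],
               now: Optional[float] = None) -> None:
        """Fold the currently active healthcheck names into the history."""
        now = time.time() if now is None else now
        active = set(active_checks)
        for name in active:
            rec = self.records.get(name)
            if rec is None:
                self.records[name] = HealthCheckRecord(name, now, now)
                continue
            if not rec.active:
                rec.count += 1   # the check cleared earlier and has come back
            rec.active = True
            rec.last_seen = now
        # Checks that are no longer reported stay in the history as inactive.
        for name, rec in self.records.items():
            if name not in active:
                rec.active = False

    def metrics(self) -> List[str]:
        """One gauge line per known healthcheck: 1 = active, 0 = cleared.

        The metric name is an assumption made for this sketch.
        """
        return [
            f'healthcheck_active{{name="{rec.name}"}} {int(rec.active)}'
            for rec in sorted(self.records.values(), key=lambda r: r.name)
        ]
```

With something along these lines, an alert rule could match a specific healthcheck name instead of the generic status expression, and a history command could simply print `records` with the count and the first and last seen timestamps.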

The feature deliverables should include:
- additional metrics
- updated prometheus rules
- updated docs for the prometheus module


Related issues 1 (0 open, 1 closed)

Copied to mgr - Backport #53616: pacific: mgr/prometheus: Add all healthchecks to prometheus output and provide a way of viewing history (Resolved, Avan Thakkar)
Actions #1

Updated by Ernesto Puerta over 2 years ago

  • Status changed from New to Pending Backport
  • Pull request ID set to 43293
Actions #2

Updated by Backport Bot over 2 years ago

  • Copied to Backport #53616: pacific: mgr/prometheus: Add all healthchecks to prometheus output and provide a way of viewing history added
Actions #3

Updated by Ernesto Puerta about 2 years ago

  • Status changed from Pending Backport to Resolved