Feature #24977
closed
Provide a base set of Prometheus alert manager rules that notify the user about common Ceph error conditions
Added by Lenz Grimmer almost 6 years ago.
Updated almost 5 years ago.
Category:
prometheus module
Description
We should create a number of pre-defined Prometheus alert manager configuration files that trigger alerts on specific Ceph error conditions. At a minimum, the following conditions should trigger an alert:
- Change in Ceph Cluster health (state change)
- Disks near full
- OSDs that are down
- OSD hosts that are down
- OSD Host Loss Check
- Slow OSD response
- OSDs with High PG Count
- PGs stuck
- Network Packet Drops and Errors
- Pool capacity utilization
- MONs Down (state change)
- Cluster Capacity Utilization
- Capacity forecast warning (if capacity would be exhausted within 6 weeks)
These alert manager configuration files should be included in the upstream Ceph code base as a reference implementation.
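As a rough sketch of what such a reference rules file could look like (the metric names, `ceph_osd_up` and the `ceph_cluster_total_*` gauges, come from the mgr/prometheus module; alert names and thresholds here are placeholders, not a final proposal):

```yaml
# Hypothetical sketch of a reference rules file for the conditions above.
groups:
  - name: ceph.rules
    rules:
      - alert: OsdDown
        expr: count(ceph_osd_up == 0) > 0
        for: 5m
        annotations:
          description: One or more OSDs have been down for more than 5 minutes
      - alert: ClusterNearFull
        expr: ceph_cluster_total_used_bytes / ceph_cluster_total_bytes > 0.85
        for: 10m
        annotations:
          description: Cluster raw capacity utilization above 85%
```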
- Assignee set to Jan Fajerski
Happy to take a stab at it unless someone beats me to it; input is obviously very welcome. A few more candidate conditions:
- Node RAM utilization >95% (maybe some other number here?)
- MDS down
- RGW down
- Network utilization >95%
- downrev daemons
- hosts with nearly filled root or /var/log partitions?
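For the node-level items, node_exporter metrics would apply. A RAM utilization rule could look roughly like this (metric names assume node_exporter >= 0.16; the 95% threshold is just the placeholder from above):

```yaml
# Sketch only: alert name and threshold are suggestions.
- alert: NodeRamFull
  expr: (1 - node_memory_MemAvailable_bytes / node_memory_MemTotal_bytes) > 0.95
  for: 10m
  annotations:
    description: Node RAM utilization above 95% for more than 10 minutes
```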
What's the preferred distribution format? Should there simply be a sample *.rules file included in some package?
Regarding collaborating:
- Should we just post some alert rules here?
- Should we use node_exporter for the disk utilization?
Example Alert: Ceph health not "OK":

alert: ceph_Health_status
expr: ceph_health_status != 0
for: 10m
annotations:
  description: Ceph unhealthy for > 10m
  summary: Ceph unhealthy
Tobias Florek wrote:
What's the preferred distribution format? Should there simply be a sample *.rules file included in some package?
For now I was thinking another subfolder in https://github.com/ceph/ceph/tree/master/monitoring would be good. Let's maybe start with a single rules file that can just be dropped into a Prometheus deployment.
Regarding collaborating:
- Should we just post some alert rules here?
Either that or attach a file with alerts. Or even just some general remarks...all is welcome.
- Should we use node_exporter for the disk utilization?
Yes
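Something along these lines, perhaps (node_exporter >= 0.16 metric names; the mountpoint regex and the 5% threshold are just examples):

```yaml
# Sketch: covers the root and /var/log partitions mentioned above.
- alert: FilesystemNearFull
  expr: >
    node_filesystem_avail_bytes{mountpoint=~"/|/var/log"}
    / node_filesystem_size_bytes{mountpoint=~"/|/var/log"} < 0.05
  for: 15m
  annotations:
    description: Root or /var/log filesystem has less than 5% space left
```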
Example Alert: Ceph health not "OK":
[...]
I was thinking it would make sense to group alerts by severity. We should maybe sort them into, say, 3 severity buckets (critical, warning, info or some such) so users can route the default alerts to different alert channels (nobody wants a text message about a HEALTH_WARN cluster at 3am on a Saturday, I assume). But again, I'm open to suggestions from experienced operators.
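To illustrate the bucket idea (`ceph_health_status` encodes HEALTH_OK/WARN/ERR as 0/1/2; the severity label values are just a suggestion):

```yaml
# Sketch: same metric, two severity buckets with different hold times.
- alert: CephHealthError
  expr: ceph_health_status == 2
  for: 5m
  labels:
    severity: critical
  annotations:
    description: Ceph in HEALTH_ERR state for more than 5 minutes
- alert: CephHealthWarning
  expr: ceph_health_status == 1
  for: 30m
  labels:
    severity: warning
  annotations:
    description: Ceph in HEALTH_WARN state for more than 30 minutes
```

Alertmanager routes could then match on the severity label, e.g. paging only for critical and sending warnings to mail or chat.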
- Related to Feature #36241: mgr/dashboard: Add support for managing Prometheus alerts added
- Target version set to v14.0.0
Just to keep track of them, there are a few existing rule sets that might serve for a future reference ruleset:
- DeepSea (by @Jan)
- Ceph-Ansible (by @Boris)
- Ceph-mixins
- Tags set to monitoring
- Status changed from New to Fix Under Review
- Target version changed from v14.0.0 to v15.0.0
- Backport set to nautilus
- Pull request ID set to 27596
- Status changed from Fix Under Review to Pending Backport
- Copied to Backport #39540: nautilus: Provide a base set of Prometheus alert manager rules that notify the user about common Ceph error conditions added
- Status changed from Pending Backport to Resolved