Feature #57962

ceph-mixin: Add Prometheus Alert for Degraded Bond

Added by Christian Kugler 3 months ago. Updated 3 months ago.

Status:
Pending Backport
Priority:
Normal
Category:
Monitoring/Alerting
Target version:
-
% Done:

100%

Source:
Tags:
backport_processed
Backport:
quincy
Reviewed:
Affected Versions:
Pull request ID:

Description

Currently there is no alert for a network interface card that is part of a network bond and has failed or been misconfigured.

This could lead to degraded redundancy and performance going unnoticed.

To solve this, I use node exporter metrics to compare the total number of peers in the bond with the number that are active. If the numbers differ, the bond is degraded and should be investigated.
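The comparison can be illustrated with a minimal Python sketch. This is not the patch itself, which only changes Prometheus rule and dashboard files. It assumes the two counts come from the node exporter bonding collector (believed to be exposed as `node_bonding_slaves` and `node_bonding_active`, names that do not appear in this ticket). The function name `degraded_bonds` and the sample values are hypothetical.

```python
from typing import Dict, List


def degraded_bonds(configured: Dict[str, int], active: Dict[str, int]) -> List[str]:
    """Return the names of bonds whose active peer count differs from the total.

    `configured` maps bond name -> total number of peers in the bond
    (assumed to correspond to node_bonding_slaves).
    `active` maps bond name -> number of peers currently active
    (assumed to correspond to node_bonding_active).
    A bond with no reported active count is treated as having zero active peers.
    """
    return sorted(
        bond
        for bond, total in configured.items()
        if active.get(bond, 0) != total
    )


if __name__ == "__main__":
    # Hypothetical sample values: bond1 has lost one of its two peers.
    configured = {"bond0": 2, "bond1": 2}
    active = {"bond0": 2, "bond1": 1}
    print(degraded_bonds(configured, active))
```

With the sample values above, only `bond1` is reported, since one of its two peers is inactive. In the alert itself the same idea would be expressed as a rule expression over the two metrics rather than as application code.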

The patch can be found here: https://github.com/ceph/ceph/pull/48538
This is the reference ticket for it. I am not quite sure what the correct target version is, since the PR targets main. The patch should be fairly easy to backport, since it only changes JSON and YAML files for Prometheus.


Related issues

Copied to Ceph - Backport #57981: quincy: ceph-mixin: Add Prometheus Alert for Degraded Bond (Status: New)

History

#1 Updated by Nizamudeen A 3 months ago

  • Status changed from New to Pending Backport
  • Assignee set to Christian Kugler
  • % Done changed from 70 to 100
  • Backport set to quincy

#2 Updated by Backport Bot 3 months ago

  • Copied to Backport #57981: quincy: ceph-mixin: Add Prometheus Alert for Degraded Bond added

#3 Updated by Backport Bot 3 months ago

  • Tags set to backport_processed
