Feature #57962
ceph-mixin: Add Prometheus Alert for Degraded Bond
Status:
Pending Backport
Priority:
Normal
Assignee:
Category:
Monitoring/Alerting
Target version:
-
% Done:
100%
Source:
Tags:
backport_processed
Backport:
quincy
Reviewed:
Affected Versions:
Pull request ID:
Description
Currently there is no alert for the case where a network interface card that is part of a network bond is misconfigured or has failed.
This could lead to redundancy and performance being degraded unnoticed.
To solve this, I use node exporter metrics to compare the total number of peers in the bond with the number that are active. If the numbers differ, the bond is degraded and should be investigated.
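The comparison described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the rule from the patch: the function name `degraded_bonds` and the sample values are hypothetical, and it assumes the two counts come from the node exporter bonding metrics `node_bonding_slaves` (configured peers) and `node_bonding_active` (active peers), keyed by bond name. In PromQL the same check would presumably look something like `node_bonding_slaves - node_bonding_active != 0`.

```python
def degraded_bonds(slaves, active):
    """Return the names of bonds whose active peer count differs from the total.

    slaves: dict mapping bond name -> number of configured peers
            (assumed to come from node_bonding_slaves)
    active: dict mapping bond name -> number of active peers
            (assumed to come from node_bonding_active)
    """
    return sorted(
        bond
        for bond, total in slaves.items()
        # A bond missing from `active` is treated as having no active peers.
        if active.get(bond, 0) != total
    )


# Hypothetical sample values: bond1 has lost one of its two peers.
slaves = {"bond0": 2, "bond1": 2}
active = {"bond0": 2, "bond1": 1}
print(degraded_bonds(slaves, active))  # ['bond1']
```

Comparing the two counts, rather than checking for a fixed number of peers, means the check works for bonds of any size without per-host configuration.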
The patch can be found here: https://github.com/ceph/ceph/pull/48538
This is the reference ticket for it. I am not quite sure what the correct target version is, since the PR targets main. The patch should be fairly easy to backport since it only changes JSON and YAML files for Prometheus.
Updated by Nizamudeen A over 1 year ago
- Status changed from New to Pending Backport
- Assignee set to Christian Kugler
- % Done changed from 70 to 100
- Backport set to quincy
Updated by Backport Bot over 1 year ago
- Copied to Backport #57981: quincy: ceph-mixin: Add Prometheus Alert for Degraded Bond added