Bug #42930

mgr/dashboard: remove alert integration of prometheus

Added by Patrick Seidensal over 4 years ago. Updated almost 3 years ago.

Status: Won't Fix
Priority: Normal
Category: Monitoring
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Currently, the dashboard has two integrations for being notified about firing alerts. According to the docs, these are

1. the integration with the Prometheus API and
2. a receiver implementation in the Dashboard to be used by the Alertmanager (push notifications through the Alertmanager's webhook).

Both integrations can be used independently or combined, but each offers a different set of features when used alone. Both integrations, though, offer the ability to receive alerts, which seems like a mistake to me. Prometheus' alerts shouldn't be consumed directly by the dashboard.

Due to the current implementation, as soon as the Prometheus API is used (that is, used alone or in conjunction with the Alertmanager), the dashboard will invariably continue to receive all alerts every 30 to 120 seconds unless those alerts are silenced. Silencing them, however, also prevents their further distribution as configured in the Alertmanager.

Using the Alertmanager instead for receiving alerts enables the user to configure explicitly which alerts are supposed to be received by the dashboard, alongside other functionality provided by the Alertmanager, namely deduplication and grouping. Notifications sent out once won't be resent to the dashboard every 30 to 120 seconds, but are withheld for the configured amount of time (default is 3h).
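To make the difference concrete, here is a minimal sketch of the two delivery patterns described above. It is a simplified model, not Alertmanager's or the dashboard's actual code: `RepeatGate` and `count_notifications` are hypothetical names, and the 30-second poll and 3-hour hold are just the figures mentioned in this description.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RepeatGate:
    """Hypothetical helper: withholds repeat notifications for an alert group."""
    repeat_interval: int  # seconds a sent notification is withheld before resending
    _last_sent: dict = field(default_factory=dict)

    def should_notify(self, group_key: str, now: int) -> bool:
        last = self._last_sent.get(group_key)
        if last is not None and now - last < self.repeat_interval:
            return False  # already sent recently, withhold
        self._last_sent[group_key] = now
        return True


def count_notifications(poll_every: int, duration: int,
                        gate: Optional[RepeatGate]) -> int:
    """Count notifications for one continuously firing alert.

    With gate=None every poll yields a notification (direct polling);
    with a gate, repeats are suppressed until the interval has elapsed.
    """
    sent = 0
    for now in range(0, duration, poll_every):
        if gate is None or gate.should_notify("example-alert", now):
            sent += 1
    return sent


three_hours = 3 * 60 * 60
polled = count_notifications(30, three_hours, gate=None)
gated = count_notifications(30, three_hours, gate=RepeatGate(three_hours))
print(polled, gated)
```

Under these assumptions, a single alert firing for three hours produces one notification per poll in the first case and a single notification in the second.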

History

#1 Updated by Stephan Müller over 4 years ago

I would suggest removing the webhook for the notifications sent out by the Alertmanager instead, as we still need to be able to access both APIs because they provide different things.

Currently the Prometheus API is only used in the silence form, so that the matcher knows whether the silence would match at least one alert definition.
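As an illustration of that check, here is a rough sketch of testing whether a silence would match at least one alert definition. It is not the dashboard's implementation. The matcher fields (`name`, `value`, `isRegex`) follow the Alertmanager silence format, while the function names, the rule names, and the idea of representing each alert definition as a plain label dictionary are assumptions made for this example.

```python
import re


def matcher_matches(matcher: dict, labels: dict) -> bool:
    """Check one silence matcher against one set of labels.

    A missing label is treated as an empty string, and regex
    matchers are anchored, so the whole value must match.
    """
    value = labels.get(matcher["name"], "")
    if matcher.get("isRegex", False):
        return re.fullmatch(matcher["value"], value) is not None
    return value == matcher["value"]


def silence_matches_any_rule(matchers: list, rules: list) -> bool:
    """True if all matchers match the labels of at least one alert definition."""
    return any(
        all(matcher_matches(m, rule_labels) for m in matchers)
        for rule_labels in rules
    )


# Hypothetical alert definitions, reduced to their labels.
rules = [
    {"alertname": "ExampleHealthError", "severity": "critical"},
    {"alertname": "ExampleOsdDown", "severity": "warning"},
]

by_severity = [{"name": "severity", "value": "crit.*", "isRegex": True}]
unknown = [{"name": "alertname", "value": "DoesNotExist", "isRegex": False}]

print(silence_matches_any_rule(by_severity, rules))
print(silence_matches_any_rule(unknown, rules))
```

A form could use a result like this to warn the user when a silence would not match anything.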

You don't need to specify both APIs, but you can do so to get the most benefit from Prometheus.

In the future we also want to show alerts that are not firing ATM, for which we need access to the Prometheus API (#42877).


Regarding the description: you are receiving them so quickly because that is defined by the alerting and Prometheus rules you are using. Normally an alert should not be firing 24/7, but it's good for us as we want to test it.

I, for example, only activate Prometheus if I test something within Prometheus.

FYI, if I say Prometheus I mean Prometheus and the Alertmanager, as they are tightly coupled.

IMO it's not really clear what you want to do.

Do you want to remove all alerting functionality and use the webhook instead? Then nothing would be left but some notifications about alerts, as we can't put them into a list.

Do you want to remove the notifications sent out by the dashboard if an alert changes? Then we could not notify users who use the API-based solution instead. They may well prefer it, as the webhook configuration can be pretty confusing, at least if TLS is in use (which is the default for the dashboard).

#2 Updated by Patrick Seidensal almost 4 years ago

  • Status changed from New to Won't Fix

Thanks for the clarification!

#3 Updated by Ernesto Puerta almost 3 years ago

  • Project changed from mgr to Dashboard
  • Category changed from 148 to Monitoring
