Bug #56514

open

Tasks #36451: mgr/dashboard: Scalability testing

mgr/dashboard: paginate alerts

Added by Ernesto Puerta almost 2 years ago. Updated almost 2 years ago.

Status:
New
Priority:
Normal
Category:
Monitoring
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

This was an unknown unknown: we didn't expect this endpoint to become so large. Some alerts are triggered per daemon, so an issue affecting OSDs produces one alert per affected OSD. A single alert yields 8k entries on an 8k-OSD cluster, and if each OSD triggers 2 or 3 alerts, that number multiplies accordingly, to 16k or 24k items.
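The arithmetic above can be written out as a quick sanity check. This is only an illustrative sketch: `expected_alert_count` is a hypothetical helper, "8k" is read as 8000, and it assumes every OSD fires the same number of alerts.

```python
def expected_alert_count(num_daemons: int, alerts_per_daemon: int) -> int:
    """Alert entries returned when every daemon fires the same number of alerts."""
    return num_daemons * alerts_per_daemon


if __name__ == "__main__":
    # An 8k-OSD cluster where each OSD triggers 1, 2 or 3 alerts.
    for alerts_per_osd in (1, 2, 3):
        print(alerts_per_osd, expected_alert_count(8000, alerts_per_osd))
```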

Target count: 32k alerts.

Strategy: Prometheus Alerts API (https://prometheus.io/docs/prometheus/latest/querying/api/#alerts).
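A rough sketch of what pagination on top of that API could look like. The Prometheus `GET /api/v1/alerts` endpoint returns a JSON body of the form `{"status": "success", "data": {"alerts": [...]}}`. The sketch assumes the full list arrives in one response and that slicing happens on the dashboard side. The function `paginate_alerts` and its `offset`/`limit` parameters are hypothetical names, not existing dashboard code.

```python
from typing import Any, Dict, List


def paginate_alerts(payload: Dict[str, Any], offset: int = 0,
                    limit: int = 100) -> Dict[str, Any]:
    """Return one page of alerts from a Prometheus /api/v1/alerts response body.

    `offset` and `limit` are hypothetical paging parameters chosen for this
    sketch; they are not part of the Prometheus API.
    """
    if offset < 0 or limit <= 0:
        raise ValueError("offset must be >= 0 and limit must be > 0")
    alerts: List[Dict[str, Any]] = payload.get("data", {}).get("alerts", [])
    return {
        "total": len(alerts),   # lets the UI render page controls
        "offset": offset,
        "limit": limit,
        "alerts": alerts[offset:offset + limit],
    }


if __name__ == "__main__":
    # Synthetic payload at the target size of 32k alerts.
    fake = {"status": "success",
            "data": {"alerts": [{"labels": {"alertname": "A", "osd": str(i)},
                                 "state": "firing"} for i in range(32000)]}}
    page = paginate_alerts(fake, offset=200, limit=50)
    print(page["total"], len(page["alerts"]))
```

Returning the total alongside the page keeps the table able to show page counts without loading all entries into the browser.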

From Pawsey:
  • "The generic table doesn't work that well across all the use cases. For example, in monitoring it would make sense to have checkboxes for active/suppressed. As it stands, you have an overlay number on the monitoring menu item indicating the number of issues, but when you look at the alerts you see the suppressed ones as well!"
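
The active/suppressed filtering suggested in this feedback could be sketched as follows. This is an assumption-laden illustration: it supposes each alert carries its state under `status.state` (as in the Alertmanager v2 alert objects, with values such as "active" and "suppressed"). `filter_by_state` and `badge_count` are hypothetical helpers, not dashboard functions.

```python
from typing import Any, Dict, Iterable, List, Set


def filter_by_state(alerts: Iterable[Dict[str, Any]],
                    wanted_states: Set[str]) -> List[Dict[str, Any]]:
    """Keep only alerts whose status.state is one of the checked states."""
    return [a for a in alerts
            if a.get("status", {}).get("state") in wanted_states]


def badge_count(alerts: Iterable[Dict[str, Any]]) -> int:
    """Number for the menu overlay: active alerts only (assumed behaviour)."""
    return len(filter_by_state(alerts, {"active"}))


if __name__ == "__main__":
    sample = [
        {"labels": {"alertname": "OSDDown"}, "status": {"state": "active"}},
        {"labels": {"alertname": "OSDDown"}, "status": {"state": "suppressed"}},
        {"labels": {"alertname": "PGsUnclean"}, "status": {"state": "active"}},
    ]
    # With only the "active" checkbox ticked, the table matches the badge.
    print(badge_count(sample), len(filter_by_state(sample, {"active"})))
```

Using the same filter for the badge and for the default table view would keep the two numbers consistent.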
Actions #1

Updated by Ernesto Puerta almost 2 years ago

  • Description updated (diff)