Bug #56514

Tasks #36451: mgr/dashboard: Scalability testing

mgr/dashboard: paginate alerts

Added by Ernesto Puerta 5 months ago. Updated 5 months ago.

Status:
New
Priority:
Normal
Category:
Monitoring
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Tags:

Description

This was an unknown unknown: we didn't expect this to become such a massive endpoint. Some alerts are triggered per daemon, so an issue affecting OSDs triggers one alert per OSD (8k entries for an 8k-OSD cluster with 1 alert). If each OSD triggers 2 or 3 alerts, that number multiplies by 2 or 3, to 16k or 24k items.

Target count: 32k alerts.

Strategy: Prometheus Alerts API (https://prometheus.io/docs/prometheus/latest/querying/api/#alerts).
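
A minimal sketch of what server-side pagination over the alert list could look like, assuming the full list has already been fetched (the Prometheus /api/v1/alerts endpoint returns all active alerts in a single response). The names `paginate_alerts`, `offset`, and `limit`, as well as the alert names and label layout in the synthetic data, are hypothetical and not existing dashboard code.

```python
from typing import Any, Dict, List


def paginate_alerts(alerts: List[Dict[str, Any]], offset: int = 0,
                    limit: int = 100) -> Dict[str, Any]:
    """Return one page of alerts plus the total count (hypothetical helper)."""
    if offset < 0 or limit <= 0:
        raise ValueError("offset must be >= 0 and limit must be > 0")
    # Slicing past the end of the list yields a short or empty page.
    page = alerts[offset:offset + limit]
    return {"total": len(alerts), "offset": offset, "limit": limit,
            "alerts": page}


# Synthetic data: 8k OSDs with 2 alerts each gives 16k entries.
alerts = [
    {"labels": {"alertname": name, "ceph_daemon": f"osd.{i}"}}
    for i in range(8000)
    for name in ("AlertA", "AlertB")
]

first_page = paginate_alerts(alerts, offset=0, limit=100)
beyond_end = paginate_alerts(alerts, offset=16000, limit=100)
```

Returning the total alongside each page lets the client render page controls without ever receiving the full list.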

From Pawsey:
  • "The generic table does not work that well across all the use cases. For example, in the monitoring it would make sense to have checkboxes for active/suppressed. As it stands, you have an overlay number on the monitoring menu item indicating the number of issues, but when you look at the alerts you see suppressed ones as well!"
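
The active/suppressed distinction raised in this feedback could be sketched as a filter applied before counting or listing alerts. This assumes each alert carries a `status.state` field with values such as `active` or `suppressed`, as in Alertmanager v2 alert objects; `alert_state`, `filter_by_state`, and `count_by_state` are hypothetical names, and the sample alerts are made up.

```python
from collections import Counter
from typing import Any, Dict, Iterable, List, Set

Alert = Dict[str, Any]


def alert_state(alert: Alert) -> str:
    """Read the state of an alert, falling back to 'unknown' if absent."""
    return alert.get("status", {}).get("state", "unknown")


def filter_by_state(alerts: Iterable[Alert], states: Set[str]) -> List[Alert]:
    """Keep only the alerts whose state is in the requested set."""
    return [a for a in alerts if alert_state(a) in states]


def count_by_state(alerts: Iterable[Alert]) -> Counter:
    """Count alerts per state, e.g. to drive a menu badge."""
    return Counter(alert_state(a) for a in alerts)


# Made-up sample: two active alerts and one suppressed alert.
sample = [
    {"labels": {"alertname": "A"}, "status": {"state": "active"}},
    {"labels": {"alertname": "B"}, "status": {"state": "suppressed"}},
    {"labels": {"alertname": "C"}, "status": {"state": "active"}},
]

active_only = filter_by_state(sample, {"active"})
counts = count_by_state(sample)
```

With a filter like this, the badge count and the table could share the same state selection, so the number on the menu item matches what the list shows.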

History

#1 Updated by Ernesto Puerta 5 months ago

  • Description updated (diff)
