Bug #40008

closed

mgr/dashboard: Need a method to check references to Grafana dashboards are correct or exist

Added by Kiefer Chang almost 5 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Category:
Monitoring
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Ceph Dashboard embeds Grafana dashboards by specifying a uid in <cd-grafana> components.
If a Grafana dashboard is updated and its uid changes, the uid property in the corresponding <cd-grafana> component must be updated as well.
We need a method to ensure these references are correct, or at least that they exist.

In the short term, a script can be added to check that every <cd-grafana> component refers to an existing Grafana dashboard.
In the long term, we can add e2e tests to verify the correctness of the embedded dashboards.


Related issues 2 (0 open, 2 closed)

Related to Dashboard - Bug #39971: Several embedded Grafana dashboards are not displayed due to changed uids (Resolved, Kiefer Chang)

Copied to Dashboard - Backport #42956: nautilus: mgr/dashboard: Need a method to check references to Grafana dashboards are correct or exist (Resolved, Kiefer Chang)
Actions #1

Updated by Kiefer Chang almost 5 years ago

  • Related to Bug #39971: Several embedded Grafana dashboards are not displayed due to changed uids added
Actions #2

Updated by Kiefer Chang almost 5 years ago

  • Subject changed from Need a method to check references to Grafana Dashboards are correct or exist to Need a method to check references to Grafana dashboards are correct or exist
Actions #3

Updated by Kiefer Chang almost 5 years ago

  • Assignee set to Kiefer Chang
Actions #4

Updated by Kiefer Chang almost 5 years ago

  • Status changed from New to In Progress

Short term solution: check that every <cd-grafana> component has a mapped Grafana dashboard

My proposals are below:

Approach 1

A script is created to check the mappings (https://github.com/bk201/ceph/blob/5cc01a2b91c7eb1950932cf24a69e7aed4a713f3/src/pybind/mgr/dashboard/tools/check_grafana_references.py):

Running this script detects any mismatch between the <cd-grafana> components and the Grafana dashboards:

$ cd /ceph/src/pybind/mgr/dashboard
$ python tools/check_grafana_references.py frontend/src/app ../../../../monitoring/grafana
Extract <cd-grafana> components and check UIDs
Found mappings:
-xyV8KCiz (frontend/src/app/ceph/pool/pool-details/pool-details.component.html:14)
    -> Ceph Pool Details (../../../../monitoring/grafana/dashboards/pool-detail.json)
41FrpeUiz (frontend/src/app/ceph/block/rbd-images/rbd-images.component.html:15)
    -> RBD Overview (../../../../monitoring/grafana/dashboards/rbd-overview.json)
tbO9LAiZz (frontend/src/app/ceph/cephfs/cephfs-detail/cephfs-detail.component.html:46)
    -> MDS Performance (../../../../monitoring/grafana/dashboards/cephfs-overview.json)
WAkugZpiz (frontend/src/app/ceph/rgw/rgw-daemon-list/rgw-daemon-list.component.html:17)
    -> RGW Overview (../../../../monitoring/grafana/dashboards/radosgw-overview.json)
x5ARzZtmk (frontend/src/app/ceph/rgw/rgw-daemon-details/rgw-daemon-details.component.html:17)
    -> RGW Instance Detail (../../../../monitoring/grafana/dashboards/radosgw-detail.json)
y0KGL0iZz (frontend/src/app/ceph/cluster/hosts/hosts.component.html:30)
    -> Host Overview (../../../../monitoring/grafana/dashboards/hosts-overview.json)
rtOg0AiWz (frontend/src/app/ceph/cluster/hosts/host-details/host-details.component.html:4)
    -> Host Details (../../../../monitoring/grafana/dashboards/host-details.json)
CrAHE0iZz (frontend/src/app/ceph/cluster/osd/osd-details/osd-details.component.html:47)
    -> OSD device details (../../../../monitoring/grafana/dashboards/osd-device-details.json)
lo02I1Aiz (frontend/src/app/ceph/cluster/osd/osd-list/osd-list.component.html:72)
    -> OSD Overview (../../../../monitoring/grafana/dashboards/osds-overview.json)

Components that have no mapped Grafana dashboards:
z99hzWtmh (frontend/src/app/ceph/pool/pool-list/pool-list.component.html:36)

Checking Grafana dashboards UIDs: ERROR
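
For readers who do not want to open the linked file, the idea behind this check can be sketched roughly as follows. This is not the code of check_grafana_references.py; it is a minimal illustration that assumes the uid is written as a plain `uid="..."` attribute on the <cd-grafana> tag and that each dashboard JSON file has a top-level "uid" field. The function names (`extract_uids`, `load_dashboards`, `check_references`) are made up for this sketch.

```python
import json
import re
from pathlib import Path

# Assumption: the uid is a plain attribute, e.g. <cd-grafana uid="abc">.
# Bound attributes such as [uid]="expr" are deliberately not matched.
TAG_RE = re.compile(r"<cd-grafana\b[^>]*>")
UID_RE = re.compile(r"""\buid\s*=\s*["']([^"']+)["']""")


def extract_uids(html_text):
    """Return (uid, line_number) for each <cd-grafana> tag with a uid attribute."""
    results = []
    for tag in TAG_RE.finditer(html_text):
        match = UID_RE.search(tag.group(0))
        if match:
            # Line number of the opening tag, counted from 1.
            line = html_text.count("\n", 0, tag.start()) + 1
            results.append((match.group(1), line))
    return results


def load_dashboards(grafana_dir):
    """Return {uid: (title, path)} for each JSON file with a top-level uid."""
    dashboards = {}
    for path in sorted(Path(grafana_dir).rglob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        if isinstance(data, dict) and data.get("uid"):
            dashboards[data["uid"]] = (data.get("title", ""), str(path))
    return dashboards


def check_references(frontend_dir, grafana_dir):
    """Return (mapped, unmapped) lists of (uid, "file:line") tuples."""
    dashboards = load_dashboards(grafana_dir)
    mapped, unmapped = [], []
    for html in sorted(Path(frontend_dir).rglob("*.html")):
        for uid, line in extract_uids(html.read_text(encoding="utf-8")):
            target = mapped if uid in dashboards else unmapped
            target.append((uid, f"{html}:{line}"))
    return mapped, unmapped
```

Calling `check_references("frontend/src/app", "../../../../monitoring/grafana")` and failing the run when the second list is non-empty would give the same kind of pass/fail result as the ERROR line above.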

Approach 2 (WIP, suggested by Kanika)
  1. The <cd-grafana> component does not refer to a uid directly.
  2. Add a new input property `grafana-dashboard-name`. The uid of the Grafana dashboard is resolved from a separate mapping file when the component is initialized.
  3. The mapping file contains an object that maps every dashboard name to its Grafana uid.
  4. A script is created to parse the mapping file and check that each Grafana dashboard can be found by its uid.
Benefits of this approach
  • All mappings are consolidated in one mapping file.
  • If Grafana dashboards are updated, a developer only needs to update this mapping file. There is no need to jump around between all the <cd-grafana> components.
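
To make steps 2 to 4 more concrete, here is a rough sketch of the lookup and the check. Nothing below exists in the tree: the mapping format (a JSON object from dashboard name to uid), the names "pool-detail" and "pool-overview", and the helper functions are assumptions for illustration only. The real lookup would live in the frontend component; Python is used here just to show the logic the checking script would share with it.

```python
import json


def load_mapping(text):
    """Parse the (hypothetical) mapping file: a JSON object of name -> uid."""
    mapping = json.loads(text)
    if not isinstance(mapping, dict):
        raise ValueError("mapping file must contain a JSON object")
    return mapping


def resolve_uid(mapping, dashboard_name):
    """Resolve a dashboard name to its uid, as the component would on init."""
    try:
        return mapping[dashboard_name]
    except KeyError:
        raise KeyError(f"no Grafana uid mapped for {dashboard_name!r}") from None


def missing_dashboards(mapping, known_uids):
    """Return the name -> uid entries whose uid matches no Grafana dashboard."""
    return {name: uid for name, uid in mapping.items() if uid not in known_uids}


# Example with two uids taken from the output of approach 1.
mapping = load_mapping('{"pool-detail": "-xyV8KCiz", "pool-overview": "z99hzWtmh"}')
print(resolve_uid(mapping, "pool-detail"))
print(missing_dashboards(mapping, known_uids={"-xyV8KCiz"}))
```

Here `known_uids` stands for the set of uids collected from the dashboard JSON files; any entry returned by `missing_dashboards` would fail the check.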
Actions #5

Updated by Lenz Grimmer almost 5 years ago

Thanks for your proposal, much appreciated. If your first approach can be integrated with "make check", this would be a good first step in making sure we're not running into the same issue again. The second option is much cleaner and would be my preference - depending on how long it would take to implement these changes, we should consider taking that approach right away.

Actions #6

Updated by Lenz Grimmer almost 5 years ago

  • Tags set to testing, qa, grafana
  • Affected Versions v14.2.0, v14.2.1 added
Actions #7

Updated by Kiefer Chang almost 5 years ago

  • Pull request ID set to 28234

Created a PR for approach 1.

Actions #8

Updated by Lenz Grimmer almost 5 years ago

  • Subject changed from Need a method to check references to Grafana dashboards are correct or exist to mgr/dashboard: Need a method to check references to Grafana dashboards are correct or exist
  • Status changed from In Progress to Fix Under Review
Actions #9

Updated by Lenz Grimmer almost 5 years ago

A PR to implement approach 1 has been merged now. Should this one be backported to Nautilus, to capture any regressions?
Do we still plan to implement approach 2?

Actions #10

Updated by Kiefer Chang almost 5 years ago

Lenz Grimmer wrote:

A PR to implement approach 1 has been merged now. Should this one be backported to Nautilus, to capture any regressions?

If we always cherry-pick changes from master back to stable branches, we can say the regression possibility is low because the check is already done on master.
I can help with backporting this.

Do we still plan to implement approach 2?

I'd suggest yes, should I create a new issue?

Actions #11

Updated by Kiefer Chang over 4 years ago

  • Status changed from Fix Under Review to Resolved
  • Backport set to nautilus
Actions #12

Updated by Kiefer Chang over 4 years ago

  • Status changed from Resolved to Pending Backport
Actions #13

Updated by Nathan Cutler over 4 years ago

  • Copied to Backport #42956: nautilus: mgr/dashboard: Need a method to check references to Grafana dashboards are correct or exist added
Actions #14

Updated by Nathan Cutler about 4 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".

Actions #15

Updated by Ernesto Puerta almost 3 years ago

  • Tracker changed from Fix to Bug
  • Project changed from mgr to Dashboard
  • Category changed from 148 to Monitoring
  • Regression set to No
  • Severity set to 3 - minor