Bug #55096
mgr/dashboard: Grafana dashboard: "matching labels must be unique on one side"
% Done: 0%
Source: Community (user)
Regression: No
Severity: 3 - minor
Description
Description of problem
When:
- using an external/remote Prometheus scraper
- viewing the radosgw-overview.json (https://github.com/ceph/ceph/blob/master/monitoring/ceph-mixin/dashboards_out/radosgw-overview.json) dashboard in Grafana
- with more than one RGW in the cluster
The following graphs:
- Average GET/PUT Latencies
- Total Requests/sec by RGW Instance
- GET Latencies by RGW Instance
- Bandwidth by RGW Instance
- PUT Latencies by RGW Instance
all fail to render data with the following error:
execution: found duplicate series for the match group {} on the right hand-side of the operation: [{__name__="ceph_rgw_metadata", ceph_daemon="rgw.182475", ceph_version="ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)", hostname="cephstor001.DOMAIN.TLD", instance="cephmon002.DOMAIN.TLD:9283", job="ceph"}, {__name__="ceph_rgw_metadata", ceph_daemon="rgw.123843", ceph_version="ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)", hostname="cephstor002.DOMAIN.TLD", instance="cephmon002.DOMAIN.TLD:9283", job="ceph"}];many-to-many matching not allowed: matching labels must be unique on one side
This seems to be similar to https://tracker.ceph.com/issues/49433 and https://tracker.ceph.com/issues/47334 and likely just needs the query adjusted.
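To illustrate why the query fails once there is more than one RGW, here is a minimal Python sketch of the uniqueness check behind the error. It is not Prometheus code and does not reproduce the dashboard's actual query (which is not quoted in this report); the helper names `match_group` and `duplicate_groups` are made up for illustration. The series are the two `ceph_rgw_metadata` series from the error message above, with `ceph_version` omitted for brevity. The empty match group `{}` in the error suggests that the labels used for matching left nothing to distinguish the two series, which the sketch models as matching on no labels at all.

```python
from collections import defaultdict


def match_group(labels, on):
    """Project a series' label set onto the labels used for matching."""
    return tuple((name, labels[name]) for name in sorted(on) if name in labels)


def duplicate_groups(right_series, on):
    """Return match groups that contain more than one right-hand series.

    A non-empty result corresponds to the case rejected with
    "matching labels must be unique on one side" (assuming the right-hand
    side is meant to be the unique side, as with group_left).
    """
    groups = defaultdict(list)
    for labels in right_series:
        groups[match_group(labels, on)].append(labels["ceph_daemon"])
    return {group: daemons for group, daemons in groups.items() if len(daemons) > 1}


# The two right-hand series from the error message (ceph_version omitted).
rgw_metadata = [
    {
        "__name__": "ceph_rgw_metadata",
        "ceph_daemon": "rgw.182475",
        "hostname": "cephstor001.DOMAIN.TLD",
        "instance": "cephmon002.DOMAIN.TLD:9283",
        "job": "ceph",
    },
    {
        "__name__": "ceph_rgw_metadata",
        "ceph_daemon": "rgw.123843",
        "hostname": "cephstor002.DOMAIN.TLD",
        "instance": "cephmon002.DOMAIN.TLD:9283",
        "job": "ceph",
    },
]

# Matching on no labels: both series fall into the empty match group.
print(duplicate_groups(rgw_metadata, on=[]))
# {(): ['rgw.182475', 'rgw.123843']}

# Matching on a label that differs per RGW: every group is unique.
print(duplicate_groups(rgw_metadata, on=["ceph_daemon"]))
# {}
```

Note that matching on `instance` would not help here either, since both series are exported by the same MGR endpoint. Whether `ceph_daemon` is also present on the left-hand side of the dashboard queries is an assumption that would need checking against the actual metrics.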
Environment
- ceph version string: ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
- Platform (OS/distro/release): AlmaLinux 8.5
- Cluster details (nodes, monitors, OSDs): 6 nodes total (3 combined MON/MGR, 3 combined OSD host/RGW/MDS)
- Did it happen on a stable environment or after a migration/upgrade?: stable
- Browser used: Firefox 97.0.1 (64-bit) (but reproducible on all browsers)
How reproducible
Steps:
See description.
Actual results
See above.
Expected results
Metrics are displayed.
Additional info
N/A
History
#1 Updated by brent s. over 1 year ago
Note that the same error occurs on the RGW Instance Detail (https://github.com/ceph/ceph/blob/master/monitoring/ceph-mixin/dashboards_out/radosgw-detail.json) dashboard, for all of its graphs:
- All GET/PUT Latencies
- Bandwidth by HTTP Operation
- HTTP Request Breakdown
- Workload Breakdown
#2 Updated by Ernesto Puerta over 1 year ago
- Tags set to regression
- Status changed from New to Triaged
- Assignee changed from Aashish Sharma to Nizamudeen A
- Source set to Community (user)
Reproduced in master (Nizam), in cephadm deployments but not in ceph-dev ones.