Bug #46846

closed

Prometheus metrics contain stripped/incomplete ipv6 address

Added by Daniël Vos over 3 years ago. Updated over 3 years ago.

Status:
Resolved
Priority:
Normal
Assignee:
Matthew Oliver
Category:
-
Target version:
-
% Done:

0%

Source:
Tags:
prometheus metrics ipv6
Backport:
octopus,nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

curl --silent http://localhost:9283/metrics | grep ceph_mon_metadata{
ceph_mon_metadata{ceph_daemon="mon.mon2",hostname="mon2.example.net",public_addr="[2001",rank="0",ceph_version="ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"} 1.0
ceph_mon_metadata{ceph_daemon="mon.mon3",hostname="mon3.example.net",public_addr="[2001",rank="1",ceph_version="ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"} 1.0
ceph_mon_metadata{ceph_daemon="mon.mon1",hostname="mon1.example.net",public_addr="[2001",rank="2",ceph_version="ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"} 1.0

The public_addr is truncated at the first colon.

The problem also exists in the `ceph_osd_metadata` metric, where both the `cluster_addr` and `public_addr` labels are truncated:

ceph_osd_metadata{back_iface="bond-storage",ceph_daemon="osd.0",cluster_addr="[2001",device_class="ssd",front_iface="bond-mgmt",hostname="node1.example.net",objectstore="bluestore",public_addr="[2001",ceph_version="ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"} 1.0
ceph_osd_metadata{back_iface="bond-storage",ceph_daemon="osd.1",cluster_addr="[2001",device_class="ssd",front_iface="bond-mgmt",hostname="node1.example.net",objectstore="bluestore",public_addr="[2001",ceph_version="ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"} 1.0
ceph_osd_metadata{back_iface="bond-storage",ceph_daemon="osd.2",cluster_addr="[2001",device_class="ssd",front_iface="bond-mgmt",hostname="node1.example.net",objectstore="bluestore",public_addr="[2001",ceph_version="ceph version 15.2.4 (7447c15c6ff58d7fce91843b705a268a1917325c) octopus (stable)"} 1.0

Unfortunately, I do not use the Ganesha NFS, CephFS, or RGW features of Ceph, but it would not surprise me if those metrics contained the same bug on an IPv6 cluster.
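
One way to check other metrics for the same symptom is to scan the exporter output for label values that open a bracket but never close it. The following is a rough sketch only: the helper name `find_truncated_addrs` is hypothetical, and the regular expression is a simplification that ignores escaped quotes inside label values.

```python
import re

# Matches label="value" pairs inside a Prometheus exposition line.
# Simplification: does not handle escaped quotes inside values.
LABEL_RE = re.compile(r'(\w+)="([^"]*)"')


def find_truncated_addrs(metrics_text):
    """Return (metric, label, value) tuples for label values that
    start with '[' but contain no closing ']'."""
    hits = []
    for line in metrics_text.splitlines():
        # Skip comments (# HELP / # TYPE) and samples without labels.
        if line.startswith('#') or '{' not in line:
            continue
        metric = line.split('{', 1)[0]
        for label, value in LABEL_RE.findall(line):
            if value.startswith('[') and ']' not in value:
                hits.append((metric, label, value))
    return hits


if __name__ == "__main__":
    # Shortened sample line modelled on the output in this report.
    sample = 'ceph_mon_metadata{ceph_daemon="mon.mon2",public_addr="[2001",rank="0"} 1.0'
    for hit in find_truncated_addrs(sample):
        print(hit)
```

The text to scan could be the output of the `curl` command at the top of this report, saved to a file and read back in.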


Related issues 2 (0 open, 2 closed)

Copied to mgr - Backport #47281: nautilus: Prometheus metrics contain stripped/incomplete ipv6 address (Resolved, Konstantin Shalygin)
Copied to mgr - Backport #47282: octopus: Prometheus metrics contain stripped/incomplete ipv6 address (Resolved, Konstantin Shalygin)
Actions #1

Updated by Matthew Oliver over 3 years ago

Looks like a splitting issue to me. I'm not sure this is actually messenger related; it looks prometheus module related. After a quick look, we probably just need to use `rsplit` rather than the normal `split` in the Python code so that only the trailing port is cut off, i.e. something like:

diff --git a/src/pybind/mgr/prometheus/module.py b/src/pybind/mgr/prometheus/module.py
index 83fe6c3af0..e9c734b8f8 100644
--- a/src/pybind/mgr/prometheus/module.py
+++ b/src/pybind/mgr/prometheus/module.py
@@ -510,7 +510,7 @@ class Module(MgrModule):
             host_version = servers.get((id_, 'mon'), ('', ''))
             self.metrics['mon_metadata'].set(1, (
                 'mon.{}'.format(id_), host_version[0],
-                mon['public_addr'].split(':')[0], rank,
+                mon['public_addr'].rsplit(':', 1)[0], rank,
                 host_version[1]
             ))
             in_quorum = int(rank in mon_status['quorum'])
@@ -619,8 +619,8 @@ class Module(MgrModule):
             # id can be used to link osd metrics and metadata
             id_ = osd['osd']
             # collect osd metadata
-            p_addr = osd['public_addr'].split(':')[0]
-            c_addr = osd['cluster_addr'].split(':')[0]
+            p_addr = osd['public_addr'].rsplit(':', 1)[0]
+            c_addr = osd['cluster_addr'].rsplit(':', 1)[0]
             if p_addr == "-" or c_addr == "-":
                 self.log.info(
                     "Missing address metadata for osd {0}, skipping occupation" 
Actions #2

Updated by Matthew Oliver over 3 years ago

  • Pull request ID set to 36594

Pushed this diff up as a PR. It would be great if someone could test it, as I don't have a Prometheus environment set up. Though I guess I could go down that rabbit hole.

https://github.com/ceph/ceph/pull/36594
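
The string handling itself can be checked without a Prometheus environment. Below is a minimal sketch, assuming the address fields have the form `host:port/nonce`; the sample addresses and the two helper names are made up for illustration and are not part of the module.

```python
def host_with_split(addr: str) -> str:
    # Old behaviour: cut at the first colon.
    return addr.split(':')[0]


def host_with_rsplit(addr: str) -> str:
    # Patched behaviour: cut only at the last colon (the port separator).
    return addr.rsplit(':', 1)[0]


if __name__ == "__main__":
    # Hypothetical sample addresses, assumed to be "host:port/nonce".
    ipv4 = "192.0.2.10:6789/0"
    ipv6 = "[2001:db8::1]:6789/0"

    print(host_with_split(ipv4))   # 192.0.2.10
    print(host_with_rsplit(ipv4))  # 192.0.2.10
    print(host_with_split(ipv6))   # [2001
    print(host_with_rsplit(ipv6))  # [2001:db8::1]
```

A value with no colon at all, such as the `"-"` placeholder the OSD code checks for, comes back unchanged from both variants, so that check should keep working.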

Actions #3

Updated by Nathan Cutler over 3 years ago

  • Status changed from New to Fix Under Review
  • Assignee set to Matthew Oliver
Actions #4

Updated by Nathan Cutler over 3 years ago

  • Backport set to octopus
Actions #5

Updated by Nathan Cutler over 3 years ago

  • Project changed from Messengers to mgr

Assuming it was filed under "Messengers" by mistake.

Actions #6

Updated by Jan Fajerski over 3 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport changed from octopus to octopus,nautilus
Actions #7

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47281: nautilus: Prometheus metrics contain stripped/incomplete ipv6 address added
Actions #8

Updated by Nathan Cutler over 3 years ago

  • Copied to Backport #47282: octopus: Prometheus metrics contain stripped/incomplete ipv6 address added
Actions #9

Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
