Bug #57018

host.containers.internal accessing grafana's performance graphs

Added by Carlos Mogas da Silva over 1 year ago. Updated 12 months ago.

Status: Resolved
Priority: Normal
Assignee: Adam King
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID: 49824
Crash signature (v1):
Crash signature (v2):

Description

Description of problem

When I try to access any of the Grafana dashboards via the Ceph Dashboard, I get a "can't resolve host.containers.internal" error.

Environment

  • ceph version string: ceph version 17.2.3 (dff484dfc9e19a9819f375586300b3b79d80034d) quincy (stable)
  • Platform (OS/distro/release): CentOS 9 Stream
  • Cluster details (nodes, monitors, OSDs):
    1. ceph status
      cluster:
      id: 46e53510-0eb4-11ed-9dea-8600001a8b74
      health: HEALTH_OK

      services:
      mon: 2 daemons, quorum cephadm,ceph01 (age 2d)
      mgr: cephadm.wjeuos(active, since 2d), standbys: ceph01.vgycgk
      mds: 1/1 daemons up, 1 standby
      osd: 2 osds: 2 up (since 24h), 2 in (since 24h)

      data:
      volumes: 1/1 healthy
      pools: 4 pools, 81 pgs
      objects: 144 objects, 423 MiB
      usage: 949 MiB used, 7.3 TiB / 7.3 TiB avail
      pgs: 81 active+clean

      io:
      client: 341 B/s wr, 0 op/s rd, 0 op/s wr

  • Did it happen on a stable environment or after a migration/upgrade?: stable (new install)
  • Browser used (e.g.: Version 86.0.4240.198 (Official Build) (64-bit)): Firefox 102.0 (64-bit) && Brave 1.41.100 (Chromium 103.0.5060.134) (64-bit)

How reproducible

Steps:

1. Deploy new cluster following steps outlined at https://docs.ceph.com/en/quincy/cephadm/install/.
2. Access the dashboard
3. Try to open any graph provided by grafana

Actual results

The iframe shows "host.containers.internal not found".

Expected results

The graphs should appear.

Additional info

  1. ceph config dump | grep containers
    WHO MASK LEVEL OPTION VALUE RO
    mgr advanced mgr/dashboard/ALERTMANAGER_API_HOST http://host.containers.internal:9093 *
    mgr advanced mgr/dashboard/GRAFANA_API_URL https://host.containers.internal:3000 *
    mgr advanced mgr/dashboard/PROMETHEUS_API_HOST http://host.containers.internal:9095 *
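For completeness, the values the dashboard is actually using can be read back with the dashboard module's getters, or straight from the config store; a minimal check, assuming it is run from a node with the admin keyring:

# check what the dashboard is currently pointing at
ceph dashboard get-grafana-api-url
ceph dashboard get-alertmanager-api-host
ceph dashboard get-prometheus-api-host
# or read the raw option directly from the config store
ceph config get mgr mgr/dashboard/GRAFANA_API_URL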

Related issues

Related to Orchestrator - Bug #58532: cephadm: "host.containers.internal" iscsi GW entry is created automatically in cephadm iscsi deployments Resolved

History

#1 Updated by Brian Woods over 1 year ago

I am also seeing this issue on my cephadm deployment on RHEL 8 hosts.
I am NOT seeing this issue on my Ubuntu 20.04 cluster.

#2 Updated by Brian Woods over 1 year ago

I was able to fix my config from the shell with:

ceph dashboard set-alertmanager-api-host http://...
ceph dashboard set-grafana-api-url https://...
ceph dashboard set-prometheus-api-host http://...

No idea why these are not being set correctly during deployment.
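For concreteness, a hedged example of those commands with the placeholders filled in; the hostname mon-host.example.com is illustrative only and should be the node actually running the monitoring containers, with the ports matching the defaults shown in the config dump above:

# hostname below is illustrative; use the node running alertmanager/grafana/prometheus
ceph dashboard set-alertmanager-api-host http://mon-host.example.com:9093
ceph dashboard set-grafana-api-url https://mon-host.example.com:3000
ceph dashboard set-prometheus-api-host http://mon-host.example.com:9095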

#3 Updated by Carlos Mogas da Silva over 1 year ago

Brian Woods wrote:

I was able to fix my config from the shell with:

[...]

No idea why these are not being set correctly during deployment.

I just didn't set them like that because I don't know if they should be dynamic. For example, what happens if the active mgr changes host?

#4 Updated by Brian Woods over 1 year ago

Agreed. I assume these settings won't update automatically after being set manually, to allow for load balancers. But this is a demo cluster for me, so it didn't matter, and I wanted to test it with NLBs anyway.

#5 Updated by Patrik Fürer over 1 year ago

I see similar behaviour here, and it started with podman 4.1, where podman injects an entry into /etc/hosts inside the container with the IP of the host the pod is running on and the name host.containers.internal.
This is a real problem if you are using iscsi gateways, as the gateway no longer finds its config because it is looking for a config for host.containers.internal.

As a workaround we have downgraded podman to 4.0.x.

#6 Updated by Frédéric NASS about 1 year ago

Same here. I can confirm Patrik's analysis above.

The workaround consists of either downgrading the podman package to v4.0 (on RHEL 8: dnf downgrade podman-4.0.2-6.module+el8.6.0+14877+f643d2d6) or adding the --no-hosts option to the "podman run" command in /var/lib/ceph/$(ceph fsid)/iscsi.iscsi.test-iscsi1.xxxxxx/unit.run and restarting the iscsi container service, at least for the iscsi part.
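A rough sketch of that second workaround, assuming unit.run invokes "podman run" directly and keeping the placeholder daemon directory name from above:

# on the iscsi gateway host; the daemon directory name is a placeholder
fsid=$(ceph fsid)
# append --no-hosts to the podman run invocation inside unit.run
sed -i 's|podman run |podman run --no-hosts |' \
    /var/lib/ceph/$fsid/iscsi.iscsi.test-iscsi1.xxxxxx/unit.run
# restart the cephadm-managed systemd unit for that daemon
systemctl restart ceph-$fsid@iscsi.iscsi.test-iscsi1.xxxxxx.service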

#7 Updated by Michael Lipp about 1 year ago

I'm not sure whether this is a podman problem, or how it relates to podman's behavior.

I'll focus on the grafana aspect. The purpose of `host.containers.internal` is to provide the code running inside the container with the information of how the container can be accessed from outside the container. As far as I can see, podman (4.2) behaves correctly: the IP associated with `host.containers.internal` in `/etc/hosts` is the IP that can be used to access the service running in the container.

There are two problems:

  1. The generated HTML uses the value of grafana-api-url to build links that access the grafana API. The default value `https://host.containers.internal:3000` cannot work under any circumstances, because the browser runs outside the container and, of course, `host.containers.internal` cannot be resolved outside the container. What could be done is to resolve `host.containers.internal` and use the resulting IP when generating the links in the HTML.
  2. Even if the resolved IP were used, there would still be the problem that grafana does not necessarily run on the same host as the manager. I'm far from understanding the Ceph details, but if I were to generate the HTML, I'd (1) look up the node that runs grafana and (2) use its IP when generating links. This fails if access from the browser requires some special IP (a mapping from an internal address space to an external one); in that case, grafana-api-url would have to be set by some user-provided function whenever the node running grafana changes.
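A quick way to see the asymmetry described above; the container ID is a placeholder, and this assumes a Linux client and standard tools inside the grafana image:

# on the host running the grafana container: podman >= 4.1 injects the entry into /etc/hosts
podman exec <grafana-container-id> cat /etc/hosts
# on the machine running the browser the name does not resolve, which is why the iframe fails
getent hosts host.containers.internal || echo "not resolvable here"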

#8 Updated by Ilya Dryomov about 1 year ago

  • Related to Bug #58532: cephadm: "host.containers.internal" iscsi GW entry is created automatically in cephadm iscsi deployments added

#9 Updated by Adam King about 1 year ago

  • Assignee set to Adam King
  • Pull request ID set to 49824

#10 Updated by Adam King about 1 year ago

  • Status changed from New to In Progress

#11 Updated by Adam King 12 months ago

  • Status changed from In Progress to Resolved

Going to mark this resolved and track the backports through https://tracker.ceph.com/issues/58532, which required the same fix.
