Bug #46561: cephadm: monitoring services adoption doesn't honor the container image - Orchestrator - Ceph

Added by Dimitri Savineau almost 4 years ago. Updated over 3 years ago.

Status: New
Priority: Normal
Assignee:
Category: cephadm/monitoring
Target version:
% Done:
Source: Community (dev)
Tags:
Backport:
Regression:
Severity: 2 - major
Reviewed:
Affected Versions: Ceph - v15.2.4, Ceph - v16.0.0
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When running the `cephadm adopt` command against monitoring services, the container image set via [1] isn't honored, unlike in an initial deployment.

As an example, the prometheus container was using the docker.io/prom/prometheus:v2.7.2 image before running the command below:

# cephadm adopt --cluster ceph --skip-pull --style legacy --name prometheus.mon0

As a result, prometheus is now using the default prometheus container image (prom/prometheus:latest, which is in fact 2.19.2) from [2][3] and not the one set in the ceph configuration.

It looks like those default values are hardcoded and can't be overridden by the adopt command.

BTW, those default values aren't the same as the ones in the cephadm orchestrator backend [4].

[1] `ceph config set mgr mgr/cephadm/container_image_xxxx foo/bar:tag` (where xxxx is either alertmanager, grafana, node_exporter or prometheus)
[2] https://github.com/ceph/ceph/blob/master/src/cephadm/cephadm#L129-L175
[3] https://github.com/ceph/ceph/blob/master/src/cephadm/cephadm#L1809-L1812
[4] https://github.com/ceph/ceph/blob/master/src/pybind/mgr/cephadm/module.py#L183-L202
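The reproduction steps above can be sketched as a small dry-run script. It only prints the commands by default, so it is safe to read through on any machine. The `run`, `image_config_key` and `reproduce` helpers are hypothetical names introduced for illustration; the config key pattern, the image tag and the adopt command come from the report, while the final two inspection commands (`ceph config get` and `cephadm ls`) are an assumption about how one might compare the configured image with what the adopted daemon actually runs.

```shell
#!/bin/sh
# Dry-run sketch: prints the commands instead of executing them.
# Set DRY_RUN=0 on a real cluster host to actually run them.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Build the config key from [1]. The argument is one of
# alertmanager, grafana, node_exporter or prometheus.
image_config_key() {
    echo "mgr/cephadm/container_image_$1"
}

reproduce() {
    key="$(image_config_key prometheus)"
    # Image the cluster is configured to use (value taken from the report).
    run ceph config set mgr "$key" docker.io/prom/prometheus:v2.7.2
    # Adopt the legacy daemon, exactly as in the report.
    run cephadm adopt --cluster ceph --skip-pull --style legacy --name prometheus.mon0
    # Assumption: show the configured image, then list the daemons on this
    # host so the two can be compared by eye.
    run ceph config get mgr "$key"
    run cephadm ls
}

reproduce
```

If the bug is present, the image reported for the adopted prometheus daemon would differ from the value returned by `ceph config get`.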

Related issues 6 (3 open — 3 closed)

Updated by Sebastian Wagner almost 4 years ago

Hm. Honestly, I don't know if cephadm adopt has the necessary privileges to access the config store. In any case, we're trying to make cephadm adopt something that doesn't need to talk to the cluster.

Would an environment variable be OK for you?

Updated by Dimitri Savineau almost 4 years ago

Either an environment variable, like the one we have for CEPHADM_IMAGE, or a dedicated parameter (like --image) is fine for me.
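To make the two options concrete, here is a minimal sketch of how an image could be resolved without talking to the cluster. This is not cephadm's actual code: the `resolve_image` function, the precedence order (explicit parameter first, then environment variable, then hardcoded default) and the reuse of CEPHADM_IMAGE for a monitoring image are all assumptions made for illustration. Only the default value prom/prometheus:latest is taken from the report.

```shell
#!/bin/sh
# Hardcoded default mentioned in the report.
DEFAULT_PROMETHEUS_IMAGE="prom/prometheus:latest"

# Hypothetical helper: print the image to use.
# $1 is the value of a dedicated parameter (e.g. --image), possibly empty.
resolve_image() {
    flag_value="$1"
    if [ -n "$flag_value" ]; then
        # An explicit parameter wins over everything else.
        echo "$flag_value"
    elif [ -n "${CEPHADM_IMAGE:-}" ]; then
        # Otherwise fall back to the environment variable, if set.
        echo "$CEPHADM_IMAGE"
    else
        # Last resort: the hardcoded default.
        echo "$DEFAULT_PROMETHEUS_IMAGE"
    fi
}
```

Either mechanism keeps the adopt command independent of the config store, which matches the goal stated earlier in the thread.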

Updated by Dimitri Savineau almost 4 years ago

I guess it would also help with the initial cluster bootstrap, because the bootstrap workflow uses the same default values as the adopt command.

As a workaround for now, I skip the monitoring stack during the bootstrap (--skip-monitoring-stack), then set the monitoring container image options in the ceph configuration (ceph config set), and finally schedule the monitoring deployment via ceph orch apply.
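The workaround can be sketched as the following dry-run script, shown for prometheus only. It prints the commands by default rather than running them. The `run` and `deploy_monitoring_workaround` helpers are hypothetical, the monitor IP is a placeholder from the documentation address range, and the image tag is the one from the original report. The `ceph mgr module enable prometheus` step is included on the assumption that it is needed when the monitoring stack is deployed after bootstrap, as discussed later in this thread.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 on a real host to run them.
DRY_RUN="${DRY_RUN:-1}"
MON_IP="${MON_IP:-192.0.2.10}"   # placeholder monitor IP, replace with your own
PROM_IMAGE="${PROM_IMAGE:-docker.io/prom/prometheus:v2.7.2}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

deploy_monitoring_workaround() {
    # 1. Bootstrap without the monitoring stack.
    run cephadm bootstrap --mon-ip "$MON_IP" --skip-monitoring-stack
    # 2. Set the desired image in the ceph configuration.
    run ceph config set mgr mgr/cephadm/container_image_prometheus "$PROM_IMAGE"
    # 3. Enable the prometheus mgr module (bootstrap would normally do this).
    run ceph mgr module enable prometheus
    # 4. Schedule the deployment through the orchestrator.
    run ceph orch apply prometheus
}

deploy_monitoring_workaround
```

The same pattern would apply to alertmanager, grafana and node_exporter, each with its own container_image_xxxx option.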

Updated by Sebastian Wagner over 3 years ago

Dimitri Savineau wrote:

> As a current workaround, I need to skip the monitoring stack from the bootstrap (--skip-monitoring-stack)

ceph-salt always adds --skip-monitoring-stack. The thinking is that it doesn't really make sense to co-locate the monitoring stack with the mon+mgr.

Do you really want to deploy the monitoring stack on the bootstrap host?

Updated by Dimitri Savineau over 3 years ago

> The thinking is that it doesn't really make sense to co-locate the monitoring stack with the mon+mgr.

Some people are using this configuration, like OpenStack TripleO.

> Do you really want to deploy the monitoring stack on the bootstrap host?

Maybe the question is more about why it is enabled by default during the bootstrap?

Also, deploying the monitoring stack after the bootstrap requires running an extra ceph command to enable the prometheus mgr module (which is done automatically during the bootstrap) [1].

[1] https://github.com/ceph/ceph/blob/master/src/cephadm/cephadm#L2877-L2879

Updated by Sebastian Wagner over 3 years ago

Dimitri Savineau wrote:

>> The thinking is that it doesn't really make sense to co-locate the monitoring stack with the mon+mgr.
>
> Some people are using this configuration, like OpenStack TripleO

I mean, if you're really sure this is a good approach, we can make things more configurable during bootstrap. Of course!

>> Do you really want to deploy the monitoring stack on the bootstrap host?
>
> Maybe the question is more about why is it enabled by default during the bootstrap ?
>
> Also deploying the monitoring after the bootstrap requires to run an extra ceph command to enable the prometheus mgr module (which is automatically done during the bootstrap) [1]
>
> [1] https://github.com/ceph/ceph/blob/master/src/cephadm/cephadm#L2877-L2879

this might be worth an extra tracker issue!

Updated by Nathan Cutler over 3 years ago

If it's not useful to deploy the monitoring stack on a MON+MGR node, why does "cephadm bootstrap" do that?

I guess the answer is that, without the monitoring stack, Dashboard doesn't show any graphs. So if you do just "cephadm bootstrap" and then go into the Dashboard, no graphs are displayed. This gives rise to the question: "how do I get graphs?" and I'm not sure there is a clear answer to this anywhere in the documentation. (And I write that fully hoping to be proven wrong!)

Updated by Nathan Cutler over 3 years ago

Related to Bug #46606: cephadm: post-bootstrap monitoring deployment only works if the command "ceph mgr module enable prometheus" has already been issued added

Updated by Nathan Cutler over 3 years ago

>> Also deploying the monitoring after the bootstrap requires to run an extra ceph command to enable the prometheus mgr module (which is automatically done during the bootstrap) [1]
>>
>> [1] https://github.com/ceph/ceph/blob/master/src/cephadm/cephadm#L2877-L2879
>
> this might be worth an extra tracker issue!

Here you go: #46606

Updated by Sebastian Wagner over 3 years ago

Status changed from New to Need More Info

Updated by Dimitri Savineau over 3 years ago

@Sebastian: What information do you need?

Updated by Sebastian Wagner over 3 years ago

Status changed from Need More Info to New

I think we need to remove the hardcoded default images from cephadm and make them somehow configurable.

Updated by Sebastian Wagner over 3 years ago

Related to Feature #47274: cephadm: make the container_image setting available to the cephadm binary independent of any deployed daemons (maybe ceph.conf?) added

Updated by Sebastian Wagner over 3 years ago

Related to Feature #45111: cephadm: choose distribution specific images based on etc/os-releaes added

Updated by Sebastian Wagner about 3 years ago

Related to Bug #45973: Adopted MDS daemons are removed by the orchestrator because they're orphans added

Updated by Sebastian Wagner almost 3 years ago

Related to Bug #50502: cephadm pull doesn't get latest image added

Updated by Sebastian Wagner almost 3 years ago

Related to deleted (Bug #46606: cephadm: post-bootstrap monitoring deployment only works if the command "ceph mgr module enable prometheus" has already been issued)

Updated by Sebastian Wagner almost 3 years ago

Related to Feature #45996: adopted prometheus instance uses port 9095, regardless of original port number added

Updated by Sebastian Wagner over 2 years ago

Related to Documentation #52797: [cephadm]use mirrors for service images (grafana, prometheus ...) added
