Bug #46606
cephadm: post-bootstrap monitoring deployment only works if the command "ceph mgr module enable prometheus" has already been issued
Description
Post-bootstrap monitoring deployment only works if the command "ceph mgr module enable prometheus" has already been issued
Reported by Dmitri Savineau here: https://tracker.ceph.com/issues/46561#note-5
Deploying the monitoring stack after bootstrap requires running an extra ceph command to enable the prometheus mgr module (which is done automatically during bootstrap) [1]
[1] https://github.com/ceph/ceph/blob/master/src/cephadm/cephadm#L2877-L2879
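The workaround amounts to enabling the module before applying the monitoring services. A minimal Python sketch of that ordering follows. The helper name `monitoring_commands` and the `prometheus_enabled` flag are hypothetical, and the service list is an assumption about what "monitoring" covers here. The sketch only builds the command lines; it does not run them.

```python
from typing import List

# Assumed set of monitoring services; adjust to the deployment at hand.
MONITORING_SERVICES = ["node-exporter", "alertmanager", "prometheus", "grafana"]


def monitoring_commands(prometheus_enabled: bool) -> List[List[str]]:
    """Return the ceph commands to run, in order (hypothetical helper)."""
    commands: List[List[str]] = []
    if not prometheus_enabled:
        # The extra step that bootstrap normally performs automatically.
        commands.append(["ceph", "mgr", "module", "enable", "prometheus"])
    for service in MONITORING_SERVICES:
        commands.append(["ceph", "orch", "apply", service])
    return commands


if __name__ == "__main__":
    # Print the command sequence for a cluster where the module is not yet enabled.
    for cmd in monitoring_commands(prometheus_enabled=False):
        print(" ".join(cmd))
```

Building the list separately from executing it keeps the ordering easy to inspect before anything touches the cluster.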
History
#1 Updated by Nathan Cutler over 3 years ago
- Related to Bug #46561: cephadm: monitoring services adoption doesn't honor the container image added
#2 Updated by Nathan Cutler over 3 years ago
- Subject changed from Post-bootstrap monitoring deployment only works if the command "ceph mgr module enable prometheus" has already been issued to cephadm: post-bootstrap monitoring deployment only works if the command "ceph mgr module enable prometheus" has already been issued
#3 Updated by Nathan Cutler over 3 years ago
- Description updated (diff)
#4 Updated by Nathan Cutler over 3 years ago
- Description updated (diff)
#5 Updated by Sebastian Wagner over 3 years ago
- Category set to cephadm/monitoring
#6 Updated by Sebastian Wagner about 3 years ago
- Priority changed from Normal to High
#7 Updated by Juan Miguel Olmo Martínez about 3 years ago
- Assignee set to Sebastian Wagner
#8 Updated by Sebastian Wagner about 3 years ago
- Status changed from New to Fix Under Review
- Pull request ID set to 39520
#9 Updated by Sebastian Wagner almost 3 years ago
- Status changed from Fix Under Review to New
#10 Updated by Sebastian Wagner almost 3 years ago
- Related to deleted (Bug #46561: cephadm: monitoring services adoption doesn't honor the container image)
#11 Updated by Sage Weil almost 3 years ago
A couple options:
- make the 'orch apply prometheus' fail if the mgr prometheus module isn't enabled. (maybe include a --force in case the user really wants to proceed?)
- make cephadm raise a health warning if there is a prometheus deployed but the prometheus module isn't enabled
- make 'orch apply prometheus' silently enable the prometheus module
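For comparison, the three options above can be modelled as a small Python sketch. This is not cephadm code: `Policy`, `ModuleNotEnabled`, and `apply_prometheus` are hypothetical names, and the set of enabled mgr modules is passed in as plain data rather than read from a cluster.

```python
from enum import Enum
from typing import List, Set, Tuple


class Policy(Enum):
    FAIL = "fail"                # option 1: refuse unless forced
    WARN = "warn"                # option 2: proceed, raise a health warning
    AUTO_ENABLE = "auto-enable"  # option 3: silently enable the module


class ModuleNotEnabled(Exception):
    """Raised under Policy.FAIL when the prometheus module is missing."""


def apply_prometheus(
    enabled: Set[str], policy: Policy, force: bool = False
) -> Tuple[Set[str], List[str]]:
    """Return (enabled modules after the apply, health warnings). Hypothetical model."""
    enabled = set(enabled)  # work on a copy, leave the caller's set untouched
    warnings: List[str] = []
    if "prometheus" in enabled:
        return enabled, warnings
    if policy is Policy.FAIL and not force:
        raise ModuleNotEnabled("mgr module 'prometheus' is not enabled")
    if policy is Policy.WARN:
        warnings.append("prometheus deployed but mgr module 'prometheus' is not enabled")
    elif policy is Policy.AUTO_ENABLE:
        enabled.add("prometheus")
    return enabled, warnings
```

Only the third policy leaves the cluster in a working state without further action from the user; the other two surface the problem and leave the fix to the operator.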
#12 Updated by Sebastian Wagner almost 3 years ago
I'd definitely go for making 'orch apply prometheus' silently enable the prometheus module.
#13 Updated by Nathan Cutler almost 3 years ago
- make the 'orch apply prometheus' fail if the mgr prometheus module isn't enabled. (maybe include a --force in case the user really wants to proceed?)
This one is slightly problematic because there is not just "orch apply prometheus" with a prometheus-specific yaml blob, but also "orch apply" with a BIG yaml blob (with sections for various kinds of services/daemons).
Arguably, the "orch apply" command (with BIG yaml blob) should fail if any part of the yaml is not fulfillable. But that's not how the orchestrator works: the "orch apply" is fulfilled as a background task and when something goes wrong it's not always obvious to the user how to figure out what happened and why, since it typically involves conducting a post-mortem examination of the mgr logs.
To say it another way: "orch apply" is like a "moon shot". Everything has to be prepared in advance. Once the rocket is on its way up, there isn't any good way of aborting the mission.
(Caveat: this is just my impression as a casual user of "orch apply", not based on any deep knowledge of the code or even the design)
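To illustrate the "prepare everything in advance" idea, here is a minimal Python sketch of a pre-flight check over a multi-service spec. It is an illustration only, not how the orchestrator works: `preflight` and `REQUIRED_MODULES` are hypothetical, the specs are assumed to be already parsed from YAML into dicts with a `service_type` key, and the enabled mgr modules are passed in as a plain set.

```python
from typing import Dict, Iterable, List, Set

# Assumption: in this sketch only prometheus has a mgr-module prerequisite.
REQUIRED_MODULES: Dict[str, str] = {"prometheus": "prometheus"}


def preflight(specs: Iterable[dict], enabled_modules: Set[str]) -> List[str]:
    """Return one message per spec whose mgr-module prerequisite is unmet."""
    problems: List[str] = []
    for index, spec in enumerate(specs):
        service_type = spec.get("service_type")
        module = REQUIRED_MODULES.get(service_type)
        if module and module not in enabled_modules:
            problems.append(
                f"spec {index} ({service_type}): mgr module '{module}' is not enabled"
            )
    return problems
```

A check like this would report every unmet prerequisite before the background task starts, instead of leaving the user to dig through the mgr logs afterwards.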
#14 Updated by Sebastian Wagner almost 3 years ago
- Priority changed from High to Normal
prio=normal, as this is not trivial to implement
#15 Updated by Sebastian Wagner over 2 years ago
- Assignee deleted (Sebastian Wagner)
#16 Updated by Sebastian Wagner over 2 years ago
- Status changed from New to Resolved
- Pull request ID changed from 39520 to 42682
PR 42682