Feature #43696

cephadm: check that units start

Added by Sebastian Wagner about 4 years ago. Updated about 3 years ago.

Status: Rejected
Priority: Low
Assignee: -
Category: cephadm (binary)
Target version: -
% Done: 0%

Source:
Tags:
Backport:
Reviewed:
Affected Versions:
Pull request ID:

Description

When a prometheus instance is started with a bogus config, systemd initially still reports the unit as
LoadState=loaded
ActiveState=active
SubState=running
Only when the prometheus process itself gives up and exits does the systemd state change to activating (auto-restart), and that happens n seconds after startup.
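
For illustration, a minimal sketch of how these three properties could be read and interpreted. The helper names (`parse_show_output`, `unit_state`, `looks_started`) are hypothetical and not part of cephadm; the only external call is `systemctl show` with `--property=`. As described above, a check like this would still report success right after startup for a daemon that is about to fail.

```python
import subprocess
from typing import Dict


def parse_show_output(text: str) -> Dict[str, str]:
    """Turn the `Key=Value` lines printed by `systemctl show` into a dict."""
    props: Dict[str, str] = {}
    for line in text.splitlines():
        key, sep, value = line.partition("=")
        if sep:  # skip lines that contain no '='
            props[key.strip()] = value.strip()
    return props


def unit_state(unit: str) -> Dict[str, str]:
    """Query LoadState, ActiveState and SubState for one unit (hypothetical helper)."""
    result = subprocess.run(
        ["systemctl", "show", unit,
         "--property=LoadState,ActiveState,SubState"],
        capture_output=True, text=True, check=False,
    )
    return parse_show_output(result.stdout)


def looks_started(props: Dict[str, str]) -> bool:
    """True if systemd currently considers the unit active and running."""
    return (props.get("ActiveState") == "active"
            and props.get("SubState") == "running")
```

Calling `looks_started(unit_state(name))` immediately after starting the unit only tells us that the container runtime process is up, not that the daemon inside accepted its config.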

So perhaps the systemd approach isn't viable? Maybe we could reuse the port_in_use function?
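
A rough sketch of what a port-based check could look like. Note that `port_is_listening` is a hypothetical helper written for this example, not cephadm's own port_in_use, whose signature and behaviour are not shown in this ticket. It simply tries to open a TCP connection to the daemon's port.

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if something accepts TCP connections on host:port.

    Hypothetical helper for illustration only.
    """
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError,
        # timeout) when nothing is accepting connections on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

A daemon with a bogus config would typically never open its port, so a check like this could catch the failure that the systemd state misses, at the cost of having to wait for healthy daemons to finish starting.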

For systemd, the unit is perfectly active and running as long as podman is running.

(https://developers.redhat.com/blog/2019/04/18/monitoring-container-vitality-and-availability-with-podman/ sounds a bit related.)

History

#1 Updated by Sebastian Wagner about 4 years ago

  • Affected Versions v15.0.0 added

e.g. by scheduling a `daemon ls` run on that host?

#2 Updated by Sebastian Wagner about 4 years ago

I'm inclined to close this as won't fix. What shall we do?

Wait until the process is responsive?

Like https://github.com/mgfritch/ceph/blob/2d46185c209aa8910d75e75fb56b47e110f0e55e/qa/workunits/cephadm/test_cephadm.sh#L231-L232 ?

That would be really slow and complicated.

Wait x seconds?

Still slow and prone to false-negatives.
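
To make the trade-off concrete, here is a minimal sketch of the "wait until responsive" variant with an upper bound on the wait. `wait_until` and its parameters are hypothetical names for this example, and `check` stands for whatever readiness probe would be used. In the worst case every deployment blocks for the full `timeout`, which is where the slowness comes from.

```python
import time
from typing import Callable


def wait_until(check: Callable[[], bool],
               timeout: float,
               interval: float = 0.5) -> bool:
    """Poll check() until it returns True or `timeout` seconds have passed.

    Hypothetical helper: returns True as soon as check() succeeds,
    False if the deadline is reached first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return False
        # Never sleep past the deadline.
        time.sleep(min(interval, remaining))
```

A fixed "wait x seconds" is the degenerate case of this loop where the probe is only evaluated once, after the delay.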

Kubernetes

K8s also just leaves them as they are.

#3 Updated by Sebastian Wagner about 4 years ago

  • Tracker changed from Bug to Feature

#4 Updated by Sebastian Wagner almost 4 years ago

  • Priority changed from Normal to Low

Low, until someone complains.

#5 Updated by Sebastian Wagner about 3 years ago

  • Status changed from New to Rejected

This would make cephadm's daemon deployment super slow.
