Bug #48870
cephadm: Several services in error status after upgrade to 15.2.8: unrecognized arguments: --filter-for-batch
Description
I initiated an upgrade of Ceph from 15.2.5 to 15.2.8 (Ubuntu 20.04.1, Podman 2.1.1). Everything appeared to work at first, until "ceph log last cephadm log" reported several errors, all similar to the following one:
2021-01-13T12:03:17.545094+0000 mgr.iz-ceph-v1-mon-03.ncnoal (mgr.19061289) 534 : cephadm [ERR] cephadm exited with an error code: 1,
stderr:/usr/bin/podman:stderr usage: ceph-volume inventory [-h] [--format {plain,json,json-pretty}] [path]/usr/bin/podman:stderr ceph-volume inventory: error: unrecognized arguments: --filter-for-batch
Traceback (most recent call last):
File "<stdin>", line 6112, in <module>
File "<stdin>", line 1299, in _infer_fsid
File "<stdin>", line 1382, in _infer_image
File "<stdin>", line 3612, in command_ceph_volume
File "<stdin>", line 1061, in call_throws
RuntimeError: Failed command: /usr/bin/podman run --rm --ipc=host --net=host --entrypoint /usr/sbin/ceph-volume --privileged --group-add=disk -e CONTAINER_IMAGE=docker.io/ceph/ceph:v15.2.5 -e NODE_NAME=iz-ceph-v1-mon-03 -v /var/run/ceph/68317c90-b44f-11ea-a0c4-d1443a31407c:/var/run/ceph:z -v /var/log/ceph/68317c90-b44f-11ea-a0c4-d1443a31407c:/var/log/ceph:z -v /var/lib/ceph/68317c90-b44f-11ea-a0c4-d1443a31407c/crash:/var/lib/ceph/crash:z -v /dev:/dev -v /run/udev:/run/udev -v /sys:/sys -v /run/lvm:/run/lvm -v /run/lock/lvm:/run/lock/lvm docker.io/ceph/ceph:v15.2.5 inventory --format=json --filter-for-batch
Traceback (most recent call last):
File "/usr/share/ceph/mgr/cephadm/module.py", line 1012, in _remote_connection
yield (conn, connr)
File "/usr/share/ceph/mgr/cephadm/module.py", line 1156, in _run_cephadm
code, '\n'.join(err)))
orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:/usr/bin/podman:stderr usage: ceph-volume inventory [-h] [--format {plain,json,json-pretty}] [path]/usr/bin/podman:stderr ceph-volume inventory: error: unrecognized arguments: --filter-for-batch
Traceback (most recent call last):
File "<stdin>", line 6112, in <module>
File "<stdin>", line 1299, in _infer_fsid
File "<stdin>", line 1382, in _infer_image
File "<stdin>", line 3612, in command_ceph_volume
File "<stdin>", line 1061, in call_throws
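The failed command above still runs the docker.io/ceph/ceph:v15.2.5 image while being passed --filter-for-batch, which would fit an older ceph-volume that does not know the flag yet. The following is only a minimal sketch of that failure mode, assuming ceph-volume parses its arguments with Python's argparse (the usage text has that shape). The parser is a hypothetical stand-in that mirrors just the options shown in the usage line, not the real ceph-volume code:

```python
import argparse
from typing import List


def build_old_inventory_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in: only the options listed in the usage line
    # "ceph-volume inventory [-h] [--format {plain,json,json-pretty}] [path]".
    parser = argparse.ArgumentParser(prog="ceph-volume inventory")
    parser.add_argument("--format",
                        choices=["plain", "json", "json-pretty"],
                        default="plain")
    parser.add_argument("path", nargs="?")
    return parser


def unknown_flags(argv: List[str]) -> List[str]:
    # parse_known_args() returns the arguments the parser does not recognize
    # instead of exiting; parse_args() would exit with
    # "error: unrecognized arguments: ..." for the same input.
    _, unknown = build_old_inventory_parser().parse_known_args(argv)
    return unknown


if __name__ == "__main__":
    # Same arguments as at the end of the failed podman command.
    print(unknown_flags(["--format=json", "--filter-for-batch"]))
```

If that assumption holds, any caller that adds --filter-for-batch has to run against an image whose ceph-volume already defines the flag.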
The upgrade nevertheless seemed to finish, and ceph orch ps reported all services as running v15.2.8.
Because of the errors during the upgrade, I suspected a minor incompatibility between Podman and Ubuntu, so I updated all systems to the latest package versions via apt and rebooted them. This also updated Podman to version 2.2.1.
Since then ceph orch ps reports several services in an error status:
alertmanager.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 4m ago 6M 0.21.0 docker.io/prom/alertmanager:latest c876f5897d7b 9a771c886b09
crash.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 93d9a3267a2d
crash.iz-ceph-v1-mon-02 iz-ceph-v1-mon-02 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 574787725a46
crash.iz-ceph-v1-mon-03 iz-ceph-v1-mon-03 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 7f94569f5195
crash.iz-ceph-v1-osd-01 iz-ceph-v1-osd-01 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 5182be56dc6b
crash.iz-ceph-v1-osd-02 iz-ceph-v1-osd-02 running (17h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 9e23c565fc64
crash.iz-ceph-v1-osd-03 iz-ceph-v1-osd-03 running (17h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 12ea957024f8
grafana.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 4m ago 6M 6.6.2 docker.io/ceph/ceph-grafana:latest 87a51ecf0b1c 43c6d192a4dd
mds.cephfs.iz-ceph-v1-mon-01.vjehnz iz-ceph-v1-mon-01 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 5ad9c3fae67d
mds.cephfs.iz-ceph-v1-mon-02.cbdqzm iz-ceph-v1-mon-02 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 5e732eb43b59
mds.cephfs.iz-ceph-v1-mon-03.zpxfkg iz-ceph-v1-mon-03 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 71ed07a47be2
mgr.iz-ceph-v1-mon-01.elswai iz-ceph-v1-mon-01 error 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 567b649f7825
mgr.iz-ceph-v1-mon-02.foqmfa iz-ceph-v1-mon-02 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 5b2014ba51da
mgr.iz-ceph-v1-mon-03.ncnoal iz-ceph-v1-mon-03 error 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 115effface96
mon.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 7fc683227cee
mon.iz-ceph-v1-mon-02 iz-ceph-v1-mon-02 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 01951e4b5a5e
mon.iz-ceph-v1-mon-03 iz-ceph-v1-mon-03 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c ed2e540108f2
node-exporter.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 4m ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 f8d1e263f5bd
node-exporter.iz-ceph-v1-mon-02 iz-ceph-v1-mon-02 error 4m ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 deb31c0e96fe
node-exporter.iz-ceph-v1-mon-03 iz-ceph-v1-mon-03 error 4m ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 803ad1ead652
node-exporter.iz-ceph-v1-osd-01 iz-ceph-v1-osd-01 error 4m ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 31a86adea0da
node-exporter.iz-ceph-v1-osd-02 iz-ceph-v1-osd-02 error 4m ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 98aba976cb23
node-exporter.iz-ceph-v1-osd-03 iz-ceph-v1-osd-03 error 4m ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 121aed124f3b
osd.0 iz-ceph-v1-osd-01 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 19b8ae70354c
osd.1 iz-ceph-v1-osd-01 running (16h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 2569a941b756
osd.2 iz-ceph-v1-osd-02 running (17h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 4e0a96f76663
osd.3 iz-ceph-v1-osd-02 running (17h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 7256225f6fd7
osd.4 iz-ceph-v1-osd-03 running (17h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c f86ecc759cee
osd.5 iz-ceph-v1-osd-03 running (17h) 4m ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 8da57f4da01f
prometheus.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 4m ago 6M 2.19.1 docker.io/prom/prometheus:latest 396dc3b4e717 2a0a8f4c368c
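To pick the failed daemons out of a listing like the one above, a small filter can help. This is a minimal sketch, assuming the column layout shown here (daemon name, host, and status as the first three whitespace-separated fields); the function name daemons_in_error is hypothetical and not part of any Ceph tooling:

```python
from typing import List, Tuple


def daemons_in_error(ps_output: str) -> List[Tuple[str, str]]:
    """Return (daemon name, host) pairs whose status column reads 'error'."""
    failed = []
    for line in ps_output.splitlines():
        fields = line.split()
        # Assumed layout: NAME HOST STATUS ...; running rows carry
        # "running (16h)" in the status column, failed rows a bare "error".
        if len(fields) >= 3 and fields[2] == "error":
            failed.append((fields[0], fields[1]))
    return failed


if __name__ == "__main__":
    # Two rows copied from the listing above.
    sample = (
        "grafana.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 4m ago 6M 6.6.2 "
        "docker.io/ceph/ceph-grafana:latest 87a51ecf0b1c 43c6d192a4dd\n"
        "osd.0 iz-ceph-v1-osd-01 running (16h) 4m ago 6M 15.2.8 "
        "docker.io/ceph/ceph:v15.2.8 5553b0cb212c 19b8ae70354c\n"
    )
    for name, host in daemons_in_error(sample):
        print(name, host)
```

Applied to the full listing, this would return the alertmanager, grafana, prometheus, and node-exporter daemons plus the two mgr daemons on mon-01 and mon-03.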
Here are some journalctl log entries from the boot process on the first monitor node which indicate that the containers couldn't be started properly:
Jan 13 14:29:04 iz-ceph-v1-mon-01 bash[1547]: 127.0.0.1 - - [13/Jan/2021:13:29:04] "GET /metrics HTTP/1.1" 200 - "" "Prometheus/2.19.1"
Jan 13 14:29:05 iz-ceph-v1-mon-01 systemd[1]: Started libpod-conmon-7fc683227cee875c70df5a5bb56d85b42311e37f4faa67d0b9ccf0b870a18794.scope.
Jan 13 14:29:05 iz-ceph-v1-mon-01 systemd[1]: run-runc-7fc683227cee875c70df5a5bb56d85b42311e37f4faa67d0b9ccf0b870a18794-runc.VTU3WA.mount: Succeeded.
Jan 13 14:29:05 iz-ceph-v1-mon-01 podman[3236]: 2021-01-13 14:29:05.518057669 +0100 CET m=+0.199315429 container exec 7fc683227cee875c70df5a5bb56d85b42311e37f4faa67d0b9ccf0b870a18794 (image=docker.io/ceph/ceph:v15.2.8, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mon.iz-ceph-v1-mon-01, org.label-schema.vendor=CentOS, RELEASE=HEAD, CEPH_POINT_RELEASE=-15.2.8, GIT_BRANCH=HEAD, GIT_CLEAN=True, org.label-schema.build-date=20201204, org.label-schema.license=GPLv2, org.label-schema.schema-version=1.0, GIT_COMMIT=4289d4d722f77bb5ddad5e5f98141a2a1c21a48f, GIT_REPO=https://github.com/ceph/ceph-container.git, ceph=True, maintainer=Dimitri Savineau <dsavinea@redhat.com>, org.label-schema.name=CentOS Base Image)
Jan 13 14:29:05 iz-ceph-v1-mon-01 systemd[1]: libpod-conmon-7fc683227cee875c70df5a5bb56d85b42311e37f4faa67d0b9ccf0b870a18794.scope: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[1]: Started libpod-conmon-43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f.scope.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[1]: run-runc-43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f-runc.CQ54pQ.mount: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[3042]: run-runc-43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f-runc.CQ54pQ.mount: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[2074]: run-runc-43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f-runc.CQ54pQ.mount: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 podman[3318]: 2021-01-13 14:29:06.21613319 +0100 CET m=+0.117582684 container exec 43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f (image=docker.io/ceph/ceph-grafana:latest, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-grafana.iz-ceph-v1-mon-01, io.buildah.version=1.14.2, org.label-schema.name=CentOS Base Image, org.label-schema.schema-version=1.0, org.opencontainers.image.vendor=CentOS, description=Ceph Grafana Container, org.label-schema.vendor=CentOS, summary=Grafana Container configured for Ceph mgr/dashboard integration, org.label-schema.license=GPLv2, org.label-schema.build-date=20200114, org.opencontainers.image.created=2020-01-14 00:00:00-08:00, org.opencontainers.image.title=CentOS Base Image, maintainer=Paul Cuzner <pcuzner@redhat.com>, org.opencontainers.image.licenses=GPL-2.0-only)
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[1]: libpod-conmon-43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f.scope: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[1]: Started libpod-conmon-2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda.scope.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[1]: run-runc-2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda-runc.AzgeLg.mount: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[2074]: run-runc-2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda-runc.AzgeLg.mount: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[3042]: run-runc-2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda-runc.AzgeLg.mount: Succeeded.
Jan 13 14:29:06 iz-ceph-v1-mon-01 podman[3403]: 2021-01-13 14:29:06.557505103 +0100 CET m=+0.142413228 container exec 2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda (image=docker.io/prom/prometheus:latest, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-prometheus.iz-ceph-v1-mon-01, maintainer=The Prometheus Authors <prometheus-developers@googlegroups.com>)
Jan 13 14:29:06 iz-ceph-v1-mon-01 systemd[1]: libpod-conmon-2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda.scope: Succeeded.
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[1]: Started libpod-conmon-9a771c886b0903e3db045d41d21e7c0df9c126ac3fdd797cf8b60142d4f899c7.scope.
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[1]: run-runc-9a771c886b0903e3db045d41d21e7c0df9c126ac3fdd797cf8b60142d4f899c7-runc.ae5tiF.mount: Succeeded.
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[3042]: run-runc-9a771c886b0903e3db045d41d21e7c0df9c126ac3fdd797cf8b60142d4f899c7-runc.ae5tiF.mount: Succeeded.
Jan 13 14:29:07 iz-ceph-v1-mon-01 podman[3513]: 2021-01-13 14:29:07.188433234 +0100 CET m=+0.181371954 container exec 9a771c886b0903e3db045d41d21e7c0df9c126ac3fdd797cf8b60142d4f899c7 (image=docker.io/prom/alertmanager:latest, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-alertmanager.iz-ceph-v1-mon-01, maintainer=The Prometheus Authors <prometheus-developers@googlegroups.com>)
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[1]: libpod-conmon-9a771c886b0903e3db045d41d21e7c0df9c126ac3fdd797cf8b60142d4f899c7.scope: Succeeded.
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[1]: Started libpod-conmon-f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037.scope.
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[1]: run-runc-f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037-runc.WjQpx7.mount: Succeeded.
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[2074]: run-runc-f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037-runc.WjQpx7.mount: Succeeded.
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[3042]: run-runc-f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037-runc.WjQpx7.mount: Succeeded.
Jan 13 14:29:07 iz-ceph-v1-mon-01 podman[3599]: 2021-01-13 14:29:07.526560736 +0100 CET m=+0.144490800 container exec f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037 (image=docker.io/prom/node-exporter:latest, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-01, maintainer=The Prometheus Authors <prometheus-developers@googlegroups.com>)
Jan 13 14:29:07 iz-ceph-v1-mon-01 systemd[1]: libpod-conmon-f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037.scope: Succeeded.
[...]
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: start operation timed out. Terminating.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: start operation timed out. Terminating.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: start operation timed out. Terminating.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: start operation timed out. Terminating.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: start operation timed out. Terminating.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Failed with result 'timeout'.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph node-exporter.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Failed with result 'timeout'.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph grafana.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Failed with result 'timeout'.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph prometheus.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Failed with result 'timeout'.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph alertmanager.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Failed with result 'timeout'.
Jan 13 14:29:51 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph mgr.iz-ceph-v1-mon-01.elswai for 68317c90-b44f-11ea-a0c4-d1443a31407c.
[...]
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 1.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 1.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 1.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Scheduled restart job, restart counter is at 1.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 1.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph alertmanager.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1238 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1272 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1606 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph alertmanager.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph grafana.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1237 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1279 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1608 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph grafana.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph mgr.iz-ceph-v1-mon-01.elswai for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Found left-over process 1234 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Found left-over process 1547 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Found left-over process 1734 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph mgr.iz-ceph-v1-mon-01.elswai for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph node-exporter.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1222 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1223 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1587 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph node-exporter.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph prometheus.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1240 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1265 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1599 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph prometheus.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:01 iz-ceph-v1-mon-01 podman[4951]: Error: cannot remove container 3775ce5793baab123898cce1c5ebd979b934ef41d443300b3ee2cf535e034e24 as it is running - running or paused containers cannot be removed without force: container state improper
Jan 13 14:30:01 iz-ceph-v1-mon-01 podman[4949]: Error: cannot remove container 9a771c886b0903e3db045d41d21e7c0df9c126ac3fdd797cf8b60142d4f899c7 as it is running - running or paused containers cannot be removed without force: container state improper
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1238 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1272 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1606 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Found left-over process 1234 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Found left-over process 1547 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@mgr.iz-ceph-v1-mon-01.elswai.service: Found left-over process 1734 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 podman[4950]: Error: cannot remove container 43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f as it is running - running or paused containers cannot be removed without force: container state improper
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1237 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1279 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1608 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 podman[4952]: Error: cannot remove container f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037 as it is running - running or paused containers cannot be removed without force: container state improper
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1222 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1223 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1587 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 podman[4953]: Error: cannot remove container 2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda as it is running - running or paused containers cannot be removed without force: container state improper
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1240 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1265 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1599 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:01 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:02 iz-ceph-v1-mon-01 bash[5072]: Error: error creating container storage: the container name "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-alertmanager.iz-ceph-v1-mon-01" is already in use by "9a771c886b0903e3db045d41d21e7c0df9c126ac3fdd797cf8b60142d4f899c7". You have to remove that container to be able to reuse that name.: that name is already in use
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Control process exited, code=exited, status=125/n/a
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Failed with result 'exit-code'.
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph alertmanager.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:02 iz-ceph-v1-mon-01 bash[5078]: Error: error creating container storage: the container name "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-grafana.iz-ceph-v1-mon-01" is already in use by "43c6d192a4dd2f15b2ccf5f13696f9d1801c48b79b3751259275133bf68f052f". You have to remove that container to be able to reuse that name.: that name is already in use
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Control process exited, code=exited, status=125/n/a
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Failed with result 'exit-code'.
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph grafana.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:02 iz-ceph-v1-mon-01 bash[5084]: Error: error creating container storage: the container name "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-01" is already in use by "f8d1e263f5bda2a15fa6b609b11a0b8a8dfb811ea7880db86d4e9a5ba4c62037". You have to remove that container to be able to reuse that name.: that name is already in use
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Control process exited, code=exited, status=125/n/a
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Failed with result 'exit-code'.
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph node-exporter.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:02 iz-ceph-v1-mon-01 bash[5099]: Error: error creating container storage: the container name "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-prometheus.iz-ceph-v1-mon-01" is already in use by "2a0a8f4c368ce4ee486c0a281ccc80c1df1101486c621a30c8ec2c9ea7e39bda". You have to remove that container to be able to reuse that name.: that name is already in use
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Control process exited, code=exited, status=125/n/a
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Failed with result 'exit-code'.
Jan 13 14:30:02 iz-ceph-v1-mon-01 systemd[1]: Failed to start Ceph prometheus.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
[...]
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 2.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 2.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph alertmanager.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1238 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1272 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service: Found left-over process 1606 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph alertmanager.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph grafana.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1237 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1279 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@grafana.iz-ceph-v1-mon-01.service: Found left-over process 1608 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph grafana.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:12 iz-ceph-v1-mon-01 podman[5077]: 2021-01-13 14:30:12.455597737 +0100 CET m=+10.678191759 container stop 3775ce5793baab123898cce1c5ebd979b934ef41d443300b3ee2cf535e034e24 (image=docker.io/ceph/ceph:v15.2.8, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mgr.iz-ceph-v1-mon-01.elswai, org.label-schema.ve>
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 2.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Scheduled restart job, restart counter is at 2.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph node-exporter.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1222 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1223 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-01.service: Found left-over process 1587 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: Starting Ceph node-exporter.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: Stopped Ceph prometheus.iz-ceph-v1-mon-01 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1240 (bash) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1265 (podman) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@prometheus.iz-ceph-v1-mon-01.service: Found left-over process 1599 (conmon) in control group while starting unit. Ignoring.
Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
[...]
Related issues
History
#1 Updated by Sebastian Wagner about 3 years ago
- Project changed from Ceph to Orchestrator
#2 Updated by Sebastian Wagner about 3 years ago
- Related to Bug #48694: ceph-volume: unrecognized arguments: --filter-for-batch added
#3 Updated by Sebastian Wagner about 3 years ago
- Subject changed from cephadm: Several services in error status after upgrade to 15.2.8 due to failed container start operations to cephadm: Several services in error status after upgrade to 15.2.8: unrecognized arguments: --filter-for-batch
#4 Updated by Sebastian Wagner about 3 years ago
- Priority changed from Normal to Urgent
#5 Updated by Sebastian Wagner about 3 years ago
ok: everything was added in https://github.com/ceph/ceph/pull/37520
- the --filter-for-batch implementation in ceph-volume
- the --filter-for-batch usage in cephadm
but the error message contains:
usage: ceph-volume inventory [-h] [--format {plain,json,json-pretty}] [path]
which is part of https://docs.ceph.com/en/latest/releases/octopus/#changelog
ok, this is an internal cephadm upgrade problem:
step 1: user calls `ceph orch upgrade`
step 2: mgr/cephadm calls ceph orch config set mgr.x container_image <new-container>
step 3: mgr gets redeployed
step 4: mgr failover
step 5: mgr/cephadm calls _refresh_host_devices
step 6: _refresh_host_devices calls ceph orch config get osd.1 container_image. But this returns the old image
step 7: _refresh_host_devices calls ceph-volume ... --filter-for-batch
and this breaks the upgrade, as [global] container_image still points to the old image, but the code is already running on the new image
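The steps above can be replayed in a small simulation. This is an illustrative sketch only, not cephadm code: `run_inventory`, `refresh_host_devices`, `InventoryError`, `OLD_IMAGE` and `NEW_IMAGE` are hypothetical names, and the fallback shown (retrying without the flag when the old image rejects it) is just one conceivable mitigation, not necessarily how the actual fix works.

```python
class InventoryError(Exception):
    """Hypothetical stand-in for the orchestrator error raised on a failed call."""


# Image tags mirroring the versions named in this report.
OLD_IMAGE = "docker.io/ceph/ceph:v15.2.5"
NEW_IMAGE = "docker.io/ceph/ceph:v15.2.8"

FLAG = "--filter-for-batch"


def run_inventory(image, args):
    """Simulate `ceph-volume inventory` inside the given image.

    Assumption for this sketch: only NEW_IMAGE understands FLAG.
    """
    if FLAG in args and image != NEW_IMAGE:
        raise InventoryError(
            "ceph-volume inventory: error: unrecognized arguments: " + FLAG
        )
    return {"image": image, "args": list(args)}


def refresh_host_devices(configured_image):
    """New mgr code calling into whatever image the config still points at."""
    args = ["inventory", "--format=json", FLAG]
    try:
        return run_inventory(configured_image, args)
    except InventoryError as e:
        if "unrecognized arguments: " + FLAG not in str(e):
            raise
        # Old image: retry without the flag it does not know.
        return run_inventory(configured_image, [a for a in args if a != FLAG])


if __name__ == "__main__":
    print(refresh_host_devices(OLD_IMAGE))
    print(refresh_host_devices(NEW_IMAGE))
```

With the old image the first call fails exactly as in the reported error, and the retry succeeds without the flag; with the new image the flag is passed through unchanged.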
#6 Updated by Sebastian Wagner about 3 years ago
possible solution: https://github.com/ceph/ceph/pull/38926
#7 Updated by Sebastian Wagner about 3 years ago
- Status changed from New to In Progress
- Assignee set to Sebastian Wagner
#8 Updated by Sebastian Wagner about 3 years ago
Ok, I see two distinct issues here:
1. --filter-for-batch which is addressed in https://github.com/ceph/ceph/pull/38926
2. several daemons are not coming back up due to "Found left-over process 1606 (conmon) in control group while starting unit"
for the second issue, as a workaround, please restart the host.
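To see which units are affected by the second issue, the left-over process messages can be pulled out of the journal output. A minimal sketch, assuming the journal text is available as plain lines; the function name `leftover_processes` and the regular expression are illustrative and not part of cephadm or systemd:

```python
import re
from collections import defaultdict

# Matches systemd lines of the form:
#   <unit>.service: Found left-over process <pid> (<command>) in control group ...
LEFTOVER_RE = re.compile(
    r"(?P<unit>\S+\.service): Found left-over process (?P<pid>\d+) \((?P<comm>[^)]+)\)"
)


def leftover_processes(journal_lines):
    """Return {unit: [(pid, command), ...]} for every left-over process line."""
    found = defaultdict(list)
    for line in journal_lines:
        m = LEFTOVER_RE.search(line)
        if m:
            found[m.group("unit")].append((int(m.group("pid")), m.group("comm")))
    return dict(found)


# Sample lines copied from the journal excerpt in this report.
UNIT = "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@alertmanager.iz-ceph-v1-mon-01.service"
SAMPLE = [
    "Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: " + UNIT
    + ": Found left-over process 1238 (bash) in control group while starting unit. Ignoring.",
    "Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.",
    "Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: " + UNIT
    + ": Found left-over process 1272 (podman) in control group while starting unit. Ignoring.",
    "Jan 13 14:30:12 iz-ceph-v1-mon-01 systemd[1]: " + UNIT
    + ": Found left-over process 1606 (conmon) in control group while starting unit. Ignoring.",
]

if __name__ == "__main__":
    for unit, procs in leftover_processes(SAMPLE).items():
        print(unit, procs)
```

Grouping by unit makes it easier to tell whether the same bash/podman/conmon trio survives for every failing service or only for some of them.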
#9 Updated by Sebastian Wagner about 3 years ago
- Priority changed from Urgent to Normal
#10 Updated by Gunther Heinrich about 3 years ago
I restarted all hosts but that didn't solve the problem unfortunately:
alertmanager.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 36s ago 7M 0.21.0 docker.io/prom/alertmanager:latest c876f5897d7b 6cf2571fc92a
crash.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 running (19m) 36s ago 7M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 4537f1ff97c1
crash.iz-ceph-v1-mon-02 iz-ceph-v1-mon-02 running (18m) 40s ago 7M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c b409fd7a374b
crash.iz-ceph-v1-mon-03 iz-ceph-v1-mon-03 running (17m) 40s ago 7M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 9fc17a3ec4f3
crash.iz-ceph-v1-osd-01 iz-ceph-v1-osd-01 running (15m) 9s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 39d9bf41bb7e
crash.iz-ceph-v1-osd-02 iz-ceph-v1-osd-02 running (14m) 9s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 58589908b68e
crash.iz-ceph-v1-osd-03 iz-ceph-v1-osd-03 running (12m) 8s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c b81bb6acd673
grafana.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 36s ago 7M 6.6.2 docker.io/ceph/ceph-grafana:latest 87a51ecf0b1c 625ff3e2633f
mds.cephfs.iz-ceph-v1-mon-01.vjehnz iz-ceph-v1-mon-01 running (19m) 36s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c db1e1c821e1d
mds.cephfs.iz-ceph-v1-mon-02.cbdqzm iz-ceph-v1-mon-02 running (18m) 40s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 8bdea20df510
mds.cephfs.iz-ceph-v1-mon-03.zpxfkg iz-ceph-v1-mon-03 running (17m) 40s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c f96147b8aaee
mgr.iz-ceph-v1-mon-01.elswai iz-ceph-v1-mon-01 error 36s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 53d23c82303a
mgr.iz-ceph-v1-mon-02.foqmfa iz-ceph-v1-mon-02 running (18m) 40s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 3c03159e79b9
mgr.iz-ceph-v1-mon-03.ncnoal iz-ceph-v1-mon-03 error 40s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c c7d745c7cacc
mon.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 running (19m) 36s ago 7M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 2f93bf5fcddb
mon.iz-ceph-v1-mon-02 iz-ceph-v1-mon-02 running (18m) 40s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c c5a891d63f6a
mon.iz-ceph-v1-mon-03 iz-ceph-v1-mon-03 running (17m) 40s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 3dc86765e867
node-exporter.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 36s ago 7M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 fbd216acfdf2
node-exporter.iz-ceph-v1-mon-02 iz-ceph-v1-mon-02 error 40s ago 7M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 2e1cdf4c3564
node-exporter.iz-ceph-v1-mon-03 iz-ceph-v1-mon-03 error 40s ago 7M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 b156aeb256fa
node-exporter.iz-ceph-v1-osd-01 iz-ceph-v1-osd-01 error 9s ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 98c3eea27b58
node-exporter.iz-ceph-v1-osd-02 iz-ceph-v1-osd-02 error 9s ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 64292a5661c6
node-exporter.iz-ceph-v1-osd-03 iz-ceph-v1-osd-03 error 8s ago 6M 1.0.1 docker.io/prom/node-exporter:latest 0e0218889c33 3ac07baa7d87
osd.0 iz-ceph-v1-osd-01 running (15m) 9s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c ab44fe2809d1
osd.1 iz-ceph-v1-osd-01 running (15m) 9s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 51b06ee0e70a
osd.2 iz-ceph-v1-osd-02 running (14m) 9s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 18fecaea7f46
osd.3 iz-ceph-v1-osd-02 running (14m) 9s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c 9243e4ea8ea5
osd.4 iz-ceph-v1-osd-03 running (12m) 8s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c c48ab6d881dc
osd.5 iz-ceph-v1-osd-03 running (12m) 8s ago 6M 15.2.8 docker.io/ceph/ceph:v15.2.8 5553b0cb212c fa157d28af7a
prometheus.iz-ceph-v1-mon-01 iz-ceph-v1-mon-01 error 36s ago 7M 2.19.1 docker.io/prom/prometheus:latest 396dc3b4e717 70aab14fb0c0
In the journalctl logs I found these entries (this time taken from the second mon), which seem to indicate that during startup some (old?) containers cannot be found anymore and thus cannot be evicted. Here are the log entries, with some later entries added for "2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca":
Jan 18 07:17:30 iz-ceph-v1-mon-02 podman[602]: Error: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-crash.iz-ceph-v1-mon-02 found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 podman[606]: Error: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 podman[605]: Error: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mon.iz-ceph-v1-mon-02 found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 podman[604]: Error: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mgr.iz-ceph-v1-mon-02.foqmfa found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 podman[603]: Error: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mds.cephfs.iz-ceph-v1-mon-02.cbdqzm found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1154]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mgr.iz-ceph-v1-mon-02.foqmfa" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mgr.iz-ceph-v1-mon-02.foqmfa found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1167]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-crash.iz-ceph-v1-mon-02" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-crash.iz-ceph-v1-mon-02 found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1163]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mon.iz-ceph-v1-mon-02" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mon.iz-ceph-v1-mon-02 found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1166]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mds.cephfs.iz-ceph-v1-mon-02.cbdqzm" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mds.cephfs.iz-ceph-v1-mon-02.cbdqzm found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1206]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mgr.iz-ceph-v1-mon-02.foqmfa" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mgr.iz-ceph-v1-mon-02.foqmfa found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1278]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-crash.iz-ceph-v1-mon-02" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-crash.iz-ceph-v1-mon-02 found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1283]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mon.iz-ceph-v1-mon-02" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mon.iz-ceph-v1-mon-02 found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 bash[1282]: Error: failed to evict container: "": failed to find container "ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mds.cephfs.iz-ceph-v1-mon-02.cbdqzm" in state: no container with name or ID ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mds.cephfs.iz-ceph-v1-mon-02.cbdqzm found: no such container
Jan 18 07:17:30 iz-ceph-v1-mon-02 podman[1159]: 2021-01-18 07:17:30.704563369 +0100 CET m=+0.523688984 container create 2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca (image=docker.io/prom/node-exporter:latest, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02, maintai[...]
Jan 18 07:17:31 iz-ceph-v1-mon-02 podman[1313]: 2021-01-18 07:17:31.113839747 +0100 CET m=+0.518400545 container create 3c03159e79b9be1257c3642b7c9a6c42202cd1b657ec54bd1cf1d1cc2340aaa9 (image=docker.io/ceph/ceph:v15.2.8, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mgr.iz-ceph-v1-mon-02.foqmfa, maintainer=Dimitri[...]
Jan 18 07:17:31 iz-ceph-v1-mon-02 podman[1368]: 2021-01-18 07:17:31.15929247 +0100 CET m=+0.484149434 container create b409fd7a374be01ffac8f70d8446a9272fac2e3b1e4adec84812da5ce1c66e19 (image=docker.io/ceph/ceph:v15.2.8, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-crash.iz-ceph-v1-mon-02, ceph=True, GIT_BRANCH=HE[...]
Jan 18 07:17:31 iz-ceph-v1-mon-02 podman[1376]: 2021-01-18 07:17:31.285364304 +0100 CET m=+0.599962022 container create 8bdea20df5102420d7ab226ea893d078955d602cb020cb27e7b2466cbda078cd (image=docker.io/ceph/ceph:v15.2.8, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mds.cephfs.iz-ceph-v1-mon-02.cbdqzm, org.label-s[...]
Jan 18 07:17:31 iz-ceph-v1-mon-02 podman[1375]: 2021-01-18 07:17:31.389936576 +0100 CET m=+0.656427562 container create c5a891d63f6a0198f7c2e7d8dd0d322971e12d8351985408c956512791be8ef9 (image=docker.io/ceph/ceph:v15.2.8, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-mon.iz-ceph-v1-mon-02, GIT_BRANCH=HEAD, CEPH_POI[...]
[...]
Jan 18 07:19:35 iz-ceph-v1-mon-02 systemd[1]: Started libpod-conmon-2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca.scope.
Jan 18 07:19:35 iz-ceph-v1-mon-02 systemd[1769]: run-runc-2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca-runc.PSbC7o.mount: Succeeded.
Jan 18 07:19:35 iz-ceph-v1-mon-02 systemd[1]: run-runc-2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca-runc.PSbC7o.mount: Succeeded.
Jan 18 07:19:35 iz-ceph-v1-mon-02 podman[2082]: 2021-01-18 07:19:35.991805881 +0100 CET m=+0.185363292 container exec 2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca (image=docker.io/prom/node-exporter:latest, name=ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02, maintaine>
Jan 18 07:19:36 iz-ceph-v1-mon-02 systemd[1]: libpod-conmon-2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca.scope: Succeeded.
[...]
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: Stopped Ceph node-exporter.iz-ceph-v1-mon-02 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-02.service: Found left-over process 1152 (bash) in control group while starting unit. Ignoring.
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-02.service: Found left-over process 1159 (podman) in control group while starting unit. Ignoring.
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-mon-02.service: Found left-over process 1474 (conmon) in control group while starting unit. Ignoring.
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: This usually indicates unclean termination of a previous run, or service implementation deficiencies.
Jan 18 07:19:40 iz-ceph-v1-mon-02 systemd[1]: Starting Ceph node-exporter.iz-ceph-v1-mon-02 for 68317c90-b44f-11ea-a0c4-d1443a31407c...
Jan 18 07:19:40 iz-ceph-v1-mon-02 podman[2572]: Error: cannot remove container 2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca as it is running - running or paused containers cannot be removed without force: container state improper
And here are some entries from htop:
1152 root 20 0 6972 3272 3048 S 0.0 0.2 0:00.00 /bin/bash /var/lib/ceph/68317c90-b44f-11ea-a0c4-d1443a31407c/node-exporter.iz-ceph-v1-mon-02/unit.run
1164 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.02 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1165 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.05 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1168 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.00 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1179 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.00 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1202 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.00 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1204 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.01 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1255 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.00 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1385 root 20 0 1313M 50764 28108 S 0.0 2.5 0:00.04 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-mon-02 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-mon-02 -v /proc:/host/proc:ro -v /sys:/host/sys:ro -v /:/roo
1477 root 20 0 80516 2008 1736 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c 2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca -u 2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/2e1cd
1474 root 20 0 80516 2008 1736 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c 2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca -u 2e1cdf4c3564967faa8750836a71d41f309c597a0a27491638d70b8034cf88ca -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/2e1cd
1490 root 20 0 80516 2012 1740 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c 8bdea20df5102420d7ab226ea893d078955d602cb020cb27e7b2466cbda078cd -u 8bdea20df5102420d7ab226ea893d078955d602cb020cb27e7b2466cbda078cd -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/8bdea
1488 root 20 0 80516 2012 1740 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c 8bdea20df5102420d7ab226ea893d078955d602cb020cb27e7b2466cbda078cd -u 8bdea20df5102420d7ab226ea893d078955d602cb020cb27e7b2466cbda078cd -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/8bdea
1498 root 20 0 80516 2012 1736 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c c5a891d63f6a0198f7c2e7d8dd0d322971e12d8351985408c956512791be8ef9 -u c5a891d63f6a0198f7c2e7d8dd0d322971e12d8351985408c956512791be8ef9 -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/c5a89
1508 root 20 0 80516 1964 1692 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c b409fd7a374be01ffac8f70d8446a9272fac2e3b1e4adec84812da5ce1c66e19 -u b409fd7a374be01ffac8f70d8446a9272fac2e3b1e4adec84812da5ce1c66e19 -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/b409f
1506 root 20 0 80516 1964 1692 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c b409fd7a374be01ffac8f70d8446a9272fac2e3b1e4adec84812da5ce1c66e19 -u b409fd7a374be01ffac8f70d8446a9272fac2e3b1e4adec84812da5ce1c66e19 -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/b409f
1556 root 20 0 80516 1964 1688 S 0.0 0.1 0:00.00 /usr/libexec/podman/conmon --api-version 1 -c 3c03159e79b9be1257c3642b7c9a6c42202cd1b657ec54bd1cf1d1cc2340aaa9 -u 3c03159e79b9be1257c3642b7c9a6c42202cd1b657ec54bd1cf1d1cc2340aaa9 -r /usr/sbin/runc -b /var/lib/containers/storage/overlay-containers/3c031
#11 Updated by Gunther Heinrich about 3 years ago
It seems that Ubuntu/Podman is not the cause for this issue because I am running into the same errors after updating a running testcluster from 15.2.5 to 15.2.8 without updating Ubuntu or Podman. Moreover, no issues were reported by the Orchestrator during that update process.
#12 Updated by Gunther Heinrich about 3 years ago
I did some analysis of the node-exporter on the osd node 3 to see what might be happening.
To me it looks as if the container simply fails to start, although I couldn't find out why. Could this be the result of the old containers returned by the refresh of the mgr?
Nevertheless because of this error several other issues surfaced which resulted in the following process:
- The Ceph Service / Container tries to start
- After 120 seconds a timeout is triggered for the run command, Podman logs exit code 125 (The command fails for any other reason)
- Restart Attempt 1
- Container which failed to start is still present, was not stopped properly, error code 2 (One of the specified containers is paused or running)
- Restart Attempt 2
- Failed Container still present, error code 2
- Restart Attempt 3
- Failed Container still present, error code 2
- Restart Attempt 4
- Failed Container still present, error code 2
- Restart Attempt 5
- systemd logs that the start request repeated too quickly
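The restart sequence listed above can be written down as a tiny model, which makes the dependency explicit: as long as the first container keeps running, every later attempt fails at the removal step until the start limit is reached. This is a sketch under stated assumptions, not a reproduction of systemd's logic: `simulate_restarts` and its parameters are hypothetical, and `burst_limit=5` is taken only from the observation that the fifth restart attempt was refused.

```python
def simulate_restarts(container_survives_timeout=True, burst_limit=5):
    """Replay the restart loop as a list of (attempt, event) tuples.

    Attempt 0 is the initial start that timed out. burst_limit is an assumed
    start-rate limit, chosen to match the five attempts observed in the report.
    """
    events = [(0, "start timed out, podman run exit 125")]
    running = container_survives_timeout
    for attempt in range(1, burst_limit + 1):
        if attempt == burst_limit:
            events.append((attempt, "start request repeated too quickly"))
            break
        if running:
            # The old container is still up, so removing it without force fails.
            events.append((attempt, "podman rm exit 2: container still running"))
        else:
            events.append((attempt, "started"))
            break
    return events


if __name__ == "__main__":
    for attempt, event in simulate_restarts():
        print(attempt, event)
```

Calling `simulate_restarts(container_survives_timeout=False)` shows the contrast case in this model: if the timed-out container had been stopped, the first restart attempt would have gone through.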
And although the container ran into a timeout it is indeed present in podman ps and marked as fine and running:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
f89479f68523 docker.io/prom/node-exporter:latest --no-collector.ti... 44 minutes ago Up 44 minutes ago ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-osd-03
So the automatic removal of a container - at least in a failure state - doesn't seem to work properly.
Here is the first start attempt, where it seems that the node-exporter container is indeed running and logging messages:
Jan 22 10:32:19 iz-ceph-v1-osd-03 systemd[1]: Started libcontainer container f89479f685232f97599225337be16ace05f31519cf16c133841748fd48a7bcd9.
[...]
Jan 22 10:32:19 iz-ceph-v1-osd-03 podman[992]: 2021-01-22 10:32:19.427747009 +0100 CET m=+1.594842743 container init f89479f685232f97599225337be16ace05f31519cf16c133841748fd48a7bcd9
Jan 22 10:32:19 iz-ceph-v1-osd-03 podman[992]: 2021-01-22 10:32:19.515085874 +0100 CET m=+1.682181638 container start f89479f685232f97599225337be16ace05f31519cf16c133841748fd48a7bcd9
Jan 22 10:32:19 iz-ceph-v1-osd-03 podman[992]: 2021-01-22 10:32:19.516035647 +0100 CET m=+1.683131421 container attach f89479f685232f97599225337be16ace05f31519cf16c133841748fd48a7bcd9
[...]
Jan 22 10:32:21 iz-ceph-v1-osd-03 bash[992]: level=info ts=2021-01-22T09:32:21.250Z caller=node_exporter.go:177 msg="Starting node_exporter" version="(version=1.0.1, branch=HEAD, revision=3715be6ae899f2a9b9dbfd9c39f3e09a7bd>
Jan 22 10:32:21 iz-ceph-v1-osd-03 bash[992]: level=info ts=2021-01-22T09:32:21.251Z caller=node_exporter.go:178 msg="Build context" build_context="(go=go1.14.4, user=root@1f76dbbcfa55, date=20200616-12:44:12)"
Jan 22 10:32:21 iz-ceph-v1-osd-03 bash[992]: level=info ts=2021-01-22T09:32:21.271Z caller=node_exporter.go:105 msg="Enabled collectors"
[...]
Two minutes later the timeout message is logged which triggers the restart attempt process outlined above:
Jan 22 10:34:16 iz-ceph-v1-osd-03 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service: start operation timed out. Terminating.
Jan 22 10:34:16 iz-ceph-v1-osd-03 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service: Failed with result 'timeout'.
Jan 22 10:34:16 iz-ceph-v1-osd-03 systemd[1]: Failed to start Ceph node-exporter.iz-ceph-v1-osd-03 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
This is the status of the container service after the timeout (sudo systemctl status ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service):
ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service - Ceph node-exporter.iz-ceph-v1-osd-03 for 68317c90-b44f-11ea-a0c4-d1443a31407c
Loaded: loaded (/etc/systemd/system/ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@.service; enabled; vendor preset: enabled)
Active: activating (auto-restart) (Result: exit-code) since Fri 2021-01-22 10:34:47 CET; 4s ago
Process: 2355 ExecStartPre=/usr/bin/podman rm ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-osd-03 (code=exited, status=2)
Process: 2380 ExecStartPre=/bin/rm -f //run/ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service-pid //run/ceph-68317c [...] (code=exited, status=0/SUCCESS)
Process: 2381 ExecStart=/bin/bash /var/lib/ceph/68317c90-b44f-11ea-a0c4-d1443a31407c/node-exporter.iz-ceph-v1-osd-03/unit.run (code=exited, status=125)
Process: 2408 ExecStopPost=/bin/bash /var/lib/ceph/68317c90-b44f-11ea-a0c4-d1443a31407c/node-exporter.iz-ceph-v1-osd-03/unit.poststop (code=exited, status=0/SUCCESS)
Process: 2409 ExecStopPost=/bin/rm -f //run/ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service-pid //run/ceph-68317c [...] (code=exited, status=0/SUCCESS)
Tasks: 10 (limit: 2282)
Memory: 33.2M
CGroup: /system.slice/system-ceph\x2d68317c90\x2db44f\x2d11ea\x2da0c4\x2dd1443a31407c.slice/ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service
-- 989 /bin/bash /var/lib/ceph/68317c90-b44f-11ea-a0c4-d1443a31407c/node-exporter.iz-ceph-v1-osd-03/unit.run
-- 992 /usr/bin/podman run --rm --net=host --user 65534 --name ceph-68317c90-b44f-11ea-a0c4-d1443a31407c-node-exporter.iz-ceph-v1-osd-03 -e CONTAINER_IMAGE=prom/node-exporter -e NODE_NAME=iz-ceph-v1-osd-03 -v />
--1237 /usr/libexec/podman/conmon --api-version 1 -c f89479f685232f97599225337be16ace05f31519cf16c133841748fd48a7bcd9 -u f89479f685232f97599225337be16ace05f31519cf16c133841748fd48a7bcd9 -r /usr/sbin/runc -b /va>
Jan 22 10:34:47 iz-ceph-v1-osd-03 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service: Failed with result 'exit-code'.
Jan 22 10:34:47 iz-ceph-v1-osd-03 systemd[1]: Failed to start Ceph node-exporter.iz-ceph-v1-osd-03 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
And this is the status at the fifth restart attempt:
Jan 22 10:35:08 iz-ceph-v1-osd-03 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service: Start request repeated too quickly.
Jan 22 10:35:08 iz-ceph-v1-osd-03 systemd[1]: ceph-68317c90-b44f-11ea-a0c4-d1443a31407c@node-exporter.iz-ceph-v1-osd-03.service: Failed with result 'exit-code'.
Jan 22 10:35:08 iz-ceph-v1-osd-03 systemd[1]: Failed to start Ceph node-exporter.iz-ceph-v1-osd-03 for 68317c90-b44f-11ea-a0c4-d1443a31407c.
Should these problems be put into different issues/tickets?
#13 Updated by Gunther Heinrich about 3 years ago
I think I found the underlying cause of the container startup problems, which is unrelated to the unrecognized options. I will open a new bug report for that.
#14 Updated by Sebastian Wagner about 3 years ago
- Related to Bug #49013: cephadm: Service definition causes some container startups to fail added
#15 Updated by Sebastian Wagner about 3 years ago
- Pull request ID set to 38927
#16 Updated by Sebastian Wagner about 3 years ago
- Status changed from In Progress to Resolved
#17 Updated by Neha Ojha about 3 years ago
Sebastian, I am seeing something similar in the Pacific upgrade tests. Do you want me to create a new tracker issue?
rados/cephadm/upgrade/{1-start 2-repo_digest/defaut 3-start-upgrade 4-wait distro$/{ubuntu_18.04} fixed-2 mon_election/classic}
2021-02-23T01:45:31.520 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: debug 2021-02-23T01:45:31.098+0000 7fb6565e0700 -1 log_channel(cephadm) log [ERR] : cephadm exited with an error code: 1, stderr:/usr/bin/docker: usage: ceph-volume inventory [-h] [--format {plain,json,json-pretty}] [path]
2021-02-23T01:45:31.520 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: /usr/bin/docker: ceph-volume inventory: error: unrecognized arguments: --filter-for-batch
2021-02-23T01:45:31.520 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: Traceback (most recent call last):
2021-02-23T01:45:31.521 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: File "<stdin>", line 7713, in <module>
2021-02-23T01:45:31.521 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: File "<stdin>", line 7702, in main
2021-02-23T01:45:31.521 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: File "<stdin>", line 1617, in _infer_fsid
2021-02-23T01:45:31.521 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: File "<stdin>", line 1701, in _infer_image
2021-02-23T01:45:31.521 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: File "<stdin>", line 4308, in command_ceph_volume
2021-02-23T01:45:31.522 INFO:journalctl@ceph.mgr.x.smithi155.stdout:Feb 23 01:45:31 smithi155 bash[21259]: File "<stdin>", line 1380, in call_throws
/a/nojha-2021-02-22_23:52:36-rados-pacific-distro-basic-smithi/5904592