Bug #46038

closed

cephadm mon start failure: Failed to reset failed state of unit ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04.service

Added by Deepika Upadhyay almost 4 years ago. Updated almost 2 years ago.

Status:
Closed
Priority:
Low
Assignee:
-
Category:
cephadm
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Command used:

sudo ./cephadm --verbose  bootstrap --mon-ip 172.21.9.34 --skip-pull --allow-overwrite --skip-firewalld

DEBUG:cephadm:/usr/bin/ceph-mon:stderr "debug "2020-06-16T13:30:52.854+0000 7f4405d346c0  0 /usr/bin/ceph-mon: created monfs at /var/lib/ceph/mon/ceph-senta04 for mon.senta04
DEBUG:cephadm:Running command: install -d -m0770 -o 167 -g 167 /var/run/ceph/9342dcfe-afd5-11ea-8901-ff131eda9bec
DEBUG:cephadm:Running command: systemctl enable ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec.target
DEBUG:cephadm:systemctl:stderr Created symlink /etc/systemd/system/multi-user.target.wants/ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec.target → /etc/systemd/system/ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec.target.
DEBUG:cephadm:systemctl:stderr Created symlink /etc/systemd/system/ceph.target.wants/ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec.target → /etc/systemd/system/ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec.target.
DEBUG:cephadm:Running command: systemctl start ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec.target
DEBUG:cephadm:Running command: systemctl daemon-reload
DEBUG:cephadm:Running command: systemctl stop ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04
DEBUG:cephadm:Running command: systemctl reset-failed ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04
DEBUG:cephadm:systemctl:stderr Failed to reset failed state of unit ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04.service: Unit ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04.service not loaded.
DEBUG:cephadm:Running command: systemctl enable ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04
DEBUG:cephadm:systemctl:stderr Created symlink /etc/systemd/system/ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec.target.wants/ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04.service → /etc/systemd/system/ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@.service.
DEBUG:cephadm:Running command: systemctl start ceph-9342dcfe-afd5-11ea-8901-ff131eda9bec@mon.senta04
INFO:cephadm:Waiting for mon to start...
INFO:cephadm:Waiting for mon...
DEBUG:cephadm:Running command: /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=docker.io/ceph/daemon-base:latest-master-devel -e NODE_NAME=senta04 -v /var/lib/ceph/9342dcfe-afd5-11ea-8901-ff131eda9bec/mon.senta04:/var/lib/ceph/mon/ceph-senta04:z -v /tmp/ceph-tmpmgqrx545:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmprt2y28oq:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph docker.io/ceph/daemon-base:latest-master-devel status
INFO:cephadm:/usr/bin/ceph:timeout after 30 seconds
INFO:cephadm:Non-zero exit code -9 from /usr/bin/podman run --rm --net=host --ipc=host -e CONTAINER_IMAGE=docker.io/ceph/daemon-base:latest-master-devel -e NODE_NAME=senta04 -v /var/lib/ceph/9342dcfe-afd5-11ea-8901-ff131eda9bec/mon.senta04:/var/lib/ceph/mon/ceph-senta04:z -v /tmp/ceph-tmpmgqrx545:/etc/ceph/ceph.client.admin.keyring:z -v /tmp/ceph-tmprt2y28oq:/etc/ceph/ceph.conf:z --entrypoint /usr/bin/ceph docker.io/ceph/daemon-base:latest-master-devel status
INFO:cephadm:mon not available, waiting (1/10)...

complete log: https://paste.centos.org/view/a19d7e21
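
The log above ends in a bounded retry ("mon not available, waiting (1/10)...") after a `ceph status` call times out. The sketch below shows that retry pattern in isolation. It is not cephadm's actual implementation: `wait_for`, `check`, `delay`, and the injectable `sleep` are hypothetical names chosen for illustration.

```python
import time
from typing import Callable


def wait_for(check: Callable[[], bool],
             retries: int = 10,
             delay: float = 1.0,
             sleep: Callable[[float], None] = time.sleep) -> int:
    """Call check() up to `retries` times.

    Returns the 1-based attempt number on which check() succeeded.
    Raises TimeoutError if every attempt fails.
    """
    for attempt in range(1, retries + 1):
        if check():
            return attempt
        # Same message shape as the bootstrap log in this report.
        print(f"mon not available, waiting ({attempt}/{retries})...")
        sleep(delay)
    raise TimeoutError(f"mon not available after {retries} attempts")
```

In a real probe, `check` would wrap the status command with its own timeout (30 seconds in the log above) and return `False` on a timeout or a non-zero exit code.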

Actions #1

Updated by Deepika Upadhyay almost 4 years ago

Recurrent failure observed on Fedora 31 and Ubuntu 18.04 (although on Ubuntu, a rerun after removing the stale cluster no longer seems to fail at this point).

root cause seems to be:

DEBUG:cephadm:systemctl:stderr Failed to reset failed state of unit ceph-e4e321fc-b734-11ea-b4da-0025900809c6@mon.senta02.service: Unit ceph-e4e321fc-b734-11ea-b4da-0025900809c6@mon.senta02.service is not loaded.
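
To check how often this message appears in a log and for which units, a small scan like the one below may help. It is a sketch only: `units_not_loaded` is a hypothetical helper, and the pattern is written against the two wordings seen in this report ("not loaded." and "is not loaded."), so other systemd versions may phrase it differently.

```python
import re
from typing import List

# Matches both "Unit X not loaded." and "Unit X is not loaded."
RESET_FAILED_RE = re.compile(
    r"Failed to reset failed state of unit (?P<unit>\S+): "
    r"Unit (?P=unit) (?:is )?not loaded\."
)


def units_not_loaded(log_text: str) -> List[str]:
    """Return the distinct unit names reported as not loaded, in first-seen order."""
    seen: List[str] = []
    for match in RESET_FAILED_RE.finditer(log_text):
        unit = match.group("unit")
        if unit not in seen:
            seen.append(unit)
    return seen
```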

Actions #2

Updated by Sebastian Wagner almost 4 years ago

The logs are gone. Maybe we should attach logs to the tracker directly.

Actions #3

Updated by Sebastian Wagner almost 4 years ago

  • Status changed from New to Can't reproduce

feel free to reopen the issue!

Actions #4

Updated by Deepika Upadhyay over 3 years ago

observed in octopus:
http://qa-proxy.ceph.com/teuthology/yuriw-2020-10-30_15:36:14-rados-wip-yuri-testing-2020-10-28-0947-octopus-distro-basic-smithi/5575859/teuthology.log

 cluster [DBG] mgrmap e1: no daemons active
2020-10-30T21:53:29.334 INFO:teuthology.orchestra.run.smithi171.stderr:Creating mgr...
2020-10-30T21:53:29.334 INFO:teuthology.orchestra.run.smithi171.stderr:Verifying port 9283 ...
2020-10-30T21:53:29.336 INFO:teuthology.orchestra.run.smithi171.stderr:Running command: systemctl daemon-reload
2020-10-30T21:53:29.458 INFO:teuthology.orchestra.run.smithi171.stderr:Running command: systemctl stop ceph-428f0b9e-1afa-11eb-a2ad-001a4aab830c@mgr.y
2020-10-30T21:53:29.469 INFO:teuthology.orchestra.run.smithi171.stderr:Running command: systemctl reset-failed ceph-428f0b9e-1afa-11eb-a2ad-001a4aab830c@mgr.y
2020-10-30T21:53:29.476 INFO:teuthology.orchestra.run.smithi171.stderr:systemctl:stderr Failed to reset failed state of unit ceph-428f0b9e-1afa-11eb-a2ad-001a4aab830c@mgr.y.service: Unit ceph-428f0b9e-1afa-11eb-a2ad-001a4aab830c@mgr.y.service is not loaded.

Actions #5

Updated by Deepika Upadhyay over 2 years ago

  • Status changed from Can't reproduce to New

Hey Sebastian/orch team! I recently added cephadm-based iSCSI tests and am observing this failure after a rebase. Can you help me get around this issue?

2021-08-12T05:55:38.937 INFO:teuthology.orchestra.run.smithi029.stderr:systemctl: Failed to reset failed state of unit ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a.service: Unit ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a.service not loaded.
2021-08-12T05:55:38.938 INFO:teuthology.orchestra.run.smithi029.stderr:Running command: systemctl enable ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a
2021-08-12T05:55:38.945 INFO:teuthology.orchestra.run.smithi029.stderr:systemctl: Created symlink /etc/systemd/system/ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c.target.wants/ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a.service → /etc/systemd/system/ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@.service.
2021-08-12T05:55:39.035 INFO:teuthology.orchestra.run.smithi029.stderr:Running command: systemctl start ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a
2021-08-12T05:55:39.203 INFO:journalctl@ceph.mon.a.smithi029.stdout:Aug 12 05:55:39 smithi029 systemd[1]: Starting Ceph mon.a for c2c6237a-fb31-11eb-8c24-001a4aab830c...
2021-08-12T05:55:39.790 INFO:teuthology.orchestra.run.smithi029.stderr:systemctl: Job for ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a.service failed because the control process exited with error code.
2021-08-12T05:55:39.791 INFO:teuthology.orchestra.run.smithi029.stderr:systemctl: See "systemctl status ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a.service" and "journalctl -xe" for details.
2021-08-12T05:55:39.792 INFO:teuthology.orchestra.run.smithi029.stderr:Non-zero exit code 1 from systemctl start ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a
2021-08-12T05:55:39.792 INFO:teuthology.orchestra.run.smithi029.stderr:systemctl: stderr Job for ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a.service failed because the control process exited with error code.
2021-08-12T05:55:39.792 INFO:teuthology.orchestra.run.smithi029.stderr:systemctl: stderr See "systemctl status ceph-c2c6237a-fb31-11eb-8c24-001a4aab830c@mon.a.service" and "journalctl -xe" for details.

http://qa-proxy.ceph.com/teuthology/ideepika-2021-08-12_05:28:35-rbd:iscsi-wip-rbd-update-feature-distro-basic-smithi/6335452/teuthology.log
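
In the excerpt above, the log continues past the reset-failed message to `systemctl enable` and `systemctl start`, and the reported non-zero exit code comes from `systemctl start`. When triaging a long teuthology log, it can help to pull out the first command that reports a non-zero exit code. The following is a sketch under that assumption; `first_failed_command` is a hypothetical helper, and the pattern is based only on the "Non-zero exit code N from CMD" lines quoted in this report.

```python
import re
from typing import Optional, Tuple

NONZERO_RE = re.compile(r"Non-zero exit code (?P<code>-?\d+) from (?P<cmd>.+)$")


def first_failed_command(log_text: str) -> Optional[Tuple[int, str]]:
    """Return (exit_code, command) for the first non-zero exit line, or None."""
    for line in log_text.splitlines():
        match = NONZERO_RE.search(line)
        if match:
            return int(match.group("code")), match.group("cmd").strip()
    return None
```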

Actions #6

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Priority changed from Normal to Low

Actions #7

Updated by Redouane Kachach Elhichou almost 2 years ago

  • Status changed from New to Closed

This issue is too old and no longer seems to occur. Please feel free to reopen it if you can reproduce it.
