Bug #53545 (closed)

rados/cephadm/mgr-nfs-upgrade failures due to CEPHADM_DAEMON_PLACE_FAIL

Added by Neha Ojha over 2 years ago. Updated over 2 years ago.

Status: Duplicate
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

These upgrade tests die after 12 hours; this is what I found in the log (which may or may not be the cause of the failure):

2021-12-07T22:57:02.754 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: +0000 mgr.smithi080.zarhmj (mgr.24431) 42 : cephadm [ERR] Failed while placing nfs.foo.1.0.smithi080.feiumo on smithi080: cephadm exited with an error code: 1, stderr: Non-zero exit code 125 from /bin/podman container inspect --format {{.State.Status}} ceph-e9d71ca0-57af-11ec-8c2e-001a4aab830c-nfs-foo-1-0-smithi080-feiumo
2021-12-07T22:57:02.755 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: /bin/podman: stderr Error: error inspecting object: no such container ceph-e9d71ca0-57af-11ec-8c2e-001a4aab830c-nfs-foo-1-0-smithi080-feiumo
2021-12-07T22:57:02.755 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: Non-zero exit code 125 from /bin/podman container inspect --format {{.State.Status}} ceph-e9d71ca0-57af-11ec-8c2e-001a4aab830c-nfs.foo.1.0.smithi080.feiumo
2021-12-07T22:57:02.755 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: /bin/podman: stderr Error: error inspecting object: no such container ceph-e9d71ca0-57af-11ec-8c2e-001a4aab830c-nfs.foo.1.0.smithi080.feiumo
2021-12-07T22:57:02.755 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: Deploy daemon nfs.foo.1.0.smithi080.feiumo ...
2021-12-07T22:57:02.756 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: Verifying port 2049 ...
2021-12-07T22:57:02.756 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: Cannot bind to IP 0.0.0.0 port 2049: [Errno 98] Address already in use
2021-12-07T22:57:02.756 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: ERROR: TCP Port(s) '2049' required for nfs already in use
2021-12-07T22:57:02.756 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: cluster 2021-12-07T22:57:01.384120+0000 mgr.smithi080.zarhmj (mgr.24431) 43 : cluster [DBG] pgmap v20: 129 pgs: 32 creating+peering, 97 active+clean; 300 MiB data, 902 MiB used, 706 GiB / 715 GiB avail; 10 KiB/s rd, 5.0 MiB/s wr, 627 op/s
2021-12-07T22:57:02.757 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: cephadm 2021-12-07T22:57:01.385448+0000 mgr.smithi080.zarhmj (mgr.24431) 44 : cephadm [INF] Removing orphan daemon nfs.ganesha-foo.smithi007...
2021-12-07T22:57:02.758 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: cephadm 2021-12-07T22:57:01.385552+0000 mgr.smithi080.zarhmj (mgr.24431) 45 : cephadm [INF] Removing daemon nfs.ganesha-foo.smithi007 from smithi007
2021-12-07T22:57:02.758 INFO:journalctl@ceph.mon.smithi080.smithi080.stdout:Dec 07 22:57:02 smithi080 conmon[17925]: audit 2021-12-07T22:57:01.385944+0000 mon.smithi007 (mon.0) 960 : audit [DBG] from='mgr.24431 172.21.15.80:0/3301938698' entity='mgr.smithi080.zarhmj' cmd=[{"prefix": "config get","who": "client.nfs.ganesha-foo.smithi007","key": "container_image"}]: dispatch
2021-12-07T22:57:03.565 INFO:teuthology.orchestra.run.smithi007.stdout:   5    106988    11.97 MB/sec  execute  26 sec  latency 1552.691 ms
2021-12-07T22:57:03.683 INFO:journalctl@ceph.mon.smithi007.smithi007.stdout:Dec 07 22:57:03 smithi007 conmon[18857]: cluster 2021-12-07T22:57:02.380103+0000 mon.smithi007 (mon.0) 961 : cluster [WRN] Health check failed: Failed to place 2 daemon(s) (CEPHADM_DAEMON_PLACE_FAIL)

/a/yuriw-2021-12-07_16:04:59-rados-wip-yuri5-testing-2021-12-06-1619-distro-default-smithi/6550815
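
For context on the port conflict in the log above: "[Errno 98] Address already in use" is the standard EADDRINUSE bind failure, hit here when cephadm probes port 2049 before deploying the new nfs.foo daemon while the orphaned nfs.ganesha-foo daemon presumably still holds the port. A minimal Python sketch of that kind of pre-deploy probe is below; the helper name port_is_free and its exact logic are illustrative assumptions, not cephadm's actual implementation.

    import socket

    def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
        """Return True if we can bind (host, port); illustrative sketch only,
        not cephadm's actual port-verification code."""
        try:
            with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                s.bind((host, port))
            return True
        except OSError as e:
            # On Linux, EADDRINUSE stringifies as
            # "[Errno 98] Address already in use", matching the log line.
            print(f"Cannot bind to IP {host} port {port}: {e}")
            return False

    if not port_is_free(2049):
        print("ERROR: TCP Port(s) '2049' required for nfs already in use")

Run on a host where the old ganesha daemon is still bound to 2049, a probe like this would fail the same way the two "Cannot bind" / "TCP Port(s) '2049' required for nfs" lines show above.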


Related issues 1 (0 open, 1 closed)

Is duplicate of Orchestrator - Bug #53424: CEPHADM_DAEMON_PLACE_FAIL in orch:cephadm/mgr-nfs-upgrade/ (Resolved; Sebastian Wagner)

#1

Updated by Sebastian Wagner over 2 years ago

  • Is duplicate of Bug #53424: CEPHADM_DAEMON_PLACE_FAIL in orch:cephadm/mgr-nfs-upgrade/ added
#2

Updated by Sebastian Wagner over 2 years ago

  • Status changed from New to Duplicate