Bug #56173
closed
cephadm: fail to deploy monitor
Description
Found in: /a/ksirivad-2022-05-17_02:51:57-orch:cephadm-wip-ksirivad-recreate-50089-distro-basic-smithi/6836958/
Process for recreating the bug:
host.1: mon.a
host.2: mon.b
host.3: mon.c
host.4: mon.d
host.5: mon.e
ceph orch apply mon 3
..
smithi064: mon.a
smithi143: mon.b
smithi145: mon.c
smithi163:
smithi190:
ceph orch apply mon 5
smithi064: mon.a
smithi143: mon.b
smithi145: mon.c
smithi163: -------> cephadm tries to deploy mon.smithi163 here, but the deployment fails!
smithi190:
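The scale-down and scale-up sequence above could be scripted roughly as follows. This is a minimal sketch, not the teuthology test used in the run: `run_ceph`, `scale_mons`, and `hosts_without_mon` are hypothetical helper names, and the use of `ceph orch ps --daemon_type mon --format json` with a `hostname` field in each entry is an assumption about the CLI output that should be checked against the Ceph version in use.

```python
import json
import subprocess
from typing import Callable, Iterable, List, Set

# A runner takes the arguments after "ceph" and returns stdout as text.
Runner = Callable[[List[str]], str]


def run_ceph(args: List[str]) -> str:
    """Run a ceph CLI command on a live cluster and return its stdout."""
    result = subprocess.run(["ceph", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


def scale_mons(count: int, run: Runner = run_ceph) -> None:
    """Issue `ceph orch apply mon <count>`, as in the steps above."""
    run(["orch", "apply", "mon", str(count)])


def hosts_without_mon(hosts: Iterable[str], run: Runner = run_ceph) -> Set[str]:
    """Return the hosts that have no mon daemon listed by the orchestrator.

    Assumption: the JSON output is a list of objects with a "hostname" key.
    """
    out = run(["orch", "ps", "--daemon_type", "mon", "--format", "json"])
    placed = {daemon["hostname"] for daemon in json.loads(out)}
    return set(hosts) - placed
```

Because the orchestrator applies placement changes asynchronously, a real reproduction would call `scale_mons(3)`, wait for the extra monitors to be removed, call `scale_mons(5)`, and then poll `hosts_without_mon` for a while before concluding that a host such as smithi163 never received its monitor.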
/a/ksirivad-2022-05-17_02:51:57-orch:cephadm-wip-ksirivad-recreate-50089-distro-basic-smithi/6836958/remote/smithi143/log/f363e282-d58e-11ec-be40-001a4aab830c/ceph-mgr.b.log.gz
Deploy daemon mon.smithi163 ...
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-mon --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 -e NODE_NAME=smithi163 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/f363e282-d58e-11ec-be40-001a4aab830c:/var/log/ceph:z -v /var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/mon.smithi163:/var/lib/ceph/mon/ceph-smithi163:z -v /tmp/ceph-tmpoib7u6u8:/tmp/keyring:z -v /tmp/ceph-tmp7a1yg9c8:/tmp/config:z quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 --mkfs -i smithi163 --fsid f363e282-d58e-11ec-be40-001a4aab830c -c /tmp/config --keyring /tmp/keyring --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug --default-mon-cluster-log-to-file=false --default-mon-cluster-log-to-stderr=true
/usr/bin/ceph-mon: stderr debug 2022-05-17T03:37:16.071+0000 7ff093b2a880 0 set uid:gid to 167:167 (ceph:ceph)
Traceback (most recent call last):
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 9184, in <module>
    main()
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 9172, in main
    r = ctx.func(ctx)
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 2087, in _default_image
    return func(ctx)
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 5735, in command_deploy
    deploy_daemon(ctx, ctx.fsid, daemon_type, daemon_id, c, uid, gid,
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 3105, in deploy_daemon
    CephContainer(
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 3951, in run
    out, _, _ = call_throws(self.ctx, self.run_cmd(),
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 1739, in call_throws
    raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')
RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-mon --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 -e NODE_NAME=smithi163 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/f363e282-d58e-11ec-be40-001a4aab830c:/var/log/ceph:z -v /var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/mon.smithi163:/var/lib/ceph/mon/ceph-smithi163:z -v /tmp/ceph-tmpoib7u6u8:/tmp/keyring:z -v /tmp/ceph-tmp7a1yg9c8:/tmp/config:z quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 --mkfs -i smithi163 --fsid f363e282-d58e-11ec-be40-001a4aab830c -c /tmp/config --keyring /tmp/keyring --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug --default-mon-cluster-log-to-file=false --default-mon-cluster-log-to-stderr=true: debug 2022-05-17T03:37:16.071+0000 7ff093b2a880 0 set uid:gid to 167:167 (ceph:ceph)
2022-05-17T03:37:16.947+0000 7f6b28096700 0 [cephadm ERROR root] Failed while placing mon.smithi163 on smithi163: cephadm exited with an error code: 1, stderr: Non-zero exit code 1 from /usr/bin/docker container inspect --format {{.State.Status}} ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon-smithi163
/usr/bin/docker: stdout
/usr/bin/docker: stderr Error: No such container: ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon-smithi163
Non-zero exit code 1 from /usr/bin/docker container inspect --format {{.State.Status}} ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon.smithi163
/usr/bin/docker: stdout
/usr/bin/docker: stderr Error: No such container: ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon.smithi163
Deploy daemon mon.smithi163 ...
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-mon --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 -e NODE_NAME=smithi163 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/f363e282-d58e-11ec-be40-001a4aab830c:/var/log/ceph:z -v /var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/mon.smithi163:/var/lib/ceph/mon/ceph-smithi163:z -v /tmp/ceph-tmpoib7u6u8:/tmp/keyring:z -v /tmp/ceph-tmp7a1yg9c8:/tmp/config:z quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 --mkfs -i smithi163 --fsid f363e282-d58e-11ec-be40-001a4aab830c -c /tmp/config --keyring /tmp/keyring --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug --default-mon-cluster-log-to-file=false --default-mon-cluster-log-to-stderr=true
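For triage, the `[cephadm ERROR root]` entries can be pulled out of a compressed mgr log like the one referenced above. The following is a minimal sketch using only the Python standard library; the function name `cephadm_errors` and the default marker string are illustrative choices, not part of cephadm or teuthology.

```python
import gzip
from typing import Iterator


def cephadm_errors(log_path: str, marker: str = "[cephadm ERROR") -> Iterator[str]:
    """Yield every line of a gzip-compressed log that contains the marker."""
    # "rt" opens the archive in text mode; undecodable bytes are replaced
    # rather than raising, since mgr logs may contain arbitrary output.
    with gzip.open(log_path, "rt", errors="replace") as log_file:
        for line in log_file:
            if marker in line:
                yield line.rstrip("\n")
```

Called with the path of a `ceph-mgr.b.log.gz` file, it yields only the matching lines, so the "Failed while placing mon.smithi163" entry can be located without decompressing the archive by hand.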
Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago
- Description updated (diff)
Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago
- Description updated (diff)
Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago
- Subject changed from cephadm:Fail to deploy monitor to cephadm: fail to deploy monitor
Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago
Link to test file of the PR that was used to test: https://github.com/ceph/ceph/pull/45511/files#diff-3d6025e5fcf280c6cc110b22bc6de900739206c0f157ce187a5eba1f809a41c1
Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago
According to these test results:
Test results for the following PRs combined:
- https://github.com/ceph/ceph/pull/45511 - cephadm/services/cephadmservice: shutdown monitors before removing them
- https://github.com/ceph/ceph/pull/46740 - src/mon/Paxos: Provide paxos with mon shutdown info. Only one commit was cherry-picked (5e9348bce9557410e8a9291b96ba1edc7b7f7a10), since the other commits only add logging for debugging purposes.
Runs:
- 30 jobs, 29/30 passed. 1 dead job, unrelated to Stray Daemon or anything to do with monitors. https://pulpito.ceph.com/ksirivad-2022-06-28_12:50:56-orch:cephadm-wip-ksirivad-test-46749-and-45511-distro-default-smithi/
- 10 jobs, 8/10 passed. 1 dead job and 1 failure, both unrelated to Stray Daemon or anything to do with monitors. https://pulpito.ceph.com/ksirivad-2022-06-27_21:11:03-orch:cephadm-wip-ksirivad-test-46749-and-45511-distro-default-smithi/
- 10 jobs, 10/10 passed. https://pulpito.ceph.com/ksirivad-2022-06-27_20:01:50-orch:cephadm-wip-ksirivad-test-46749-and-45511-distro-default-smithi/
Marking this bug as irreproducible.
Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago
- Status changed from New to Can't reproduce