Bug #56173

closed

cephadm: fail to deploy monitor

Added by Kamoltat (Junior) Sirivadhna almost 2 years ago. Updated almost 2 years ago.

Status:
Can't reproduce
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Found in: /a/ksirivad-2022-05-17_02:51:57-orch:cephadm-wip-ksirivad-recreate-50089-distro-basic-smithi/6836958/

Process for recreating the bug:

host.1: mon.a
host.2: mon.b
host.3: mon.c
host.4: mon.d
host.5: mon.e

ceph orch apply mon 3

..

smithi064: mon.a
smithi143: mon.b
smithi145: mon.c
smithi163:
smithi190:

ceph orch apply mon 5

smithi064: mon.a
smithi143: mon.b
smithi145: mon.c
smithi163: -------> tried to deploy mon.smithi163, but the deployment failed
smithi190:

/a/ksirivad-2022-05-17_02:51:57-orch:cephadm-wip-ksirivad-recreate-50089-distro-basic-smithi/6836958/remote/smithi143/log/f363e282-d58e-11ec-be40-001a4aab830c/ceph-mgr.b.log.gz

Deploy daemon mon.smithi163 ...
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-mon --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 -e NODE_NAME=smithi163 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/f363e282-d58e-11ec-be40-001a4aab830c:/var/log/ceph:z -v /var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/mon.smithi163:/var/lib/ceph/mon/ceph-smithi163:z -v /tmp/ceph-tmpoib7u6u8:/tmp/keyring:z -v /tmp/ceph-tmp7a1yg9c8:/tmp/config:z quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 --mkfs -i smithi163 --fsid f363e282-d58e-11ec-be40-001a4aab830c -c /tmp/config --keyring /tmp/keyring --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug  --default-mon-cluster-log-to-file=false --default-mon-cluster-log-to-stderr=true
/usr/bin/ceph-mon: stderr debug 2022-05-17T03:37:16.071+0000 7ff093b2a880  0 set uid:gid to 167:167 (ceph:ceph)
Traceback (most recent call last):
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 9184, in <module>
    main()
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 9172, in main
    r = ctx.func(ctx)
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 2087, in _default_image
    return func(ctx)
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 5735, in command_deploy
    deploy_daemon(ctx, ctx.fsid, daemon_type, daemon_id, c, uid, gid,
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 3105, in deploy_daemon
    CephContainer(
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 3951, in run
    out, _, _ = call_throws(self.ctx, self.run_cmd(),
  File "/var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/cephadm.fb91061980b87e19488e608c5dc7c41051d206a17e671583fd219db7589a28c1", line 1739, in call_throws
    raise RuntimeError(f'Failed command: {" ".join(command)}: {s}')
RuntimeError: Failed command: /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-mon --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 -e NODE_NAME=smithi163 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/f363e282-d58e-11ec-be40-001a4aab830c:/var/log/ceph:z -v /var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/mon.smithi163:/var/lib/ceph/mon/ceph-smithi163:z -v /tmp/ceph-tmpoib7u6u8:/tmp/keyring:z -v /tmp/ceph-tmp7a1yg9c8:/tmp/config:z quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 --mkfs -i smithi163 --fsid f363e282-d58e-11ec-be40-001a4aab830c -c /tmp/config --keyring /tmp/keyring --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug  --default-mon-cluster-log-to-file=false --default-mon-cluster-log-to-stderr=true: debug 2022-05-17T03:37:16.071+0000 7ff093b2a880  0 set uid:gid to 167:167 (ceph:ceph)
2022-05-17T03:37:16.947+0000 7f6b28096700  0 [cephadm ERROR root] Failed while placing mon.smithi163 on smithi163: cephadm exited with an error code: 1, stderr: Non-zero exit code 1 from /usr/bin/docker container inspect --format {{.State.Status}} ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon-smithi163
/usr/bin/docker: stdout
/usr/bin/docker: stderr Error: No such container: ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon-smithi163
Non-zero exit code 1 from /usr/bin/docker container inspect --format {{.State.Status}} ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon.smithi163
/usr/bin/docker: stdout
/usr/bin/docker: stderr Error: No such container: ceph-f363e282-d58e-11ec-be40-001a4aab830c-mon.smithi163
Deploy daemon mon.smithi163 ...
Non-zero exit code 1 from /usr/bin/docker run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint /usr/bin/ceph-mon --init -e CONTAINER_IMAGE=quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 -e NODE_NAME=smithi163 -e CEPH_USE_RANDOM_NONCE=1 -v /var/log/ceph/f363e282-d58e-11ec-be40-001a4aab830c:/var/log/ceph:z -v /var/lib/ceph/f363e282-d58e-11ec-be40-001a4aab830c/mon.smithi163:/var/lib/ceph/mon/ceph-smithi163:z -v /tmp/ceph-tmpoib7u6u8:/tmp/keyring:z -v /tmp/ceph-tmp7a1yg9c8:/tmp/config:z quay.ceph.io/ceph-ci/ceph@sha256:0bb644126a82362e359723daea45aa67ad20d7a2713ecbf1f63fc11da9985de3 --mkfs -i smithi163 --fsid f363e282-d58e-11ec-be40-001a4aab830c -c /tmp/config --keyring /tmp/keyring --setuser ceph --setgroup ceph --default-log-to-file=false --default-log-to-stderr=true --default-log-stderr-prefix=debug  --default-mon-cluster-log-to-file=false --default-mon-cluster-log-to-stderr=true
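
When triaging a ceph-mgr log like the one above, it can help to pull out only the placement-failure lines. The following is a minimal sketch, assuming the "[cephadm ERROR root] Failed while placing <daemon> on <host>:" message format seen in this excerpt; the helper name find_placement_failures is hypothetical and is not part of cephadm.

```python
import re
from typing import List, Tuple

# Assumed message format, taken from the mgr log excerpt in this report:
#   [cephadm ERROR root] Failed while placing <daemon> on <host>: ...
PLACEMENT_FAILURE = re.compile(
    r"\[cephadm ERROR root\] Failed while placing "
    r"(?P<daemon>\S+) on (?P<host>[^\s:]+):"
)


def find_placement_failures(log_text: str) -> List[Tuple[str, str]]:
    """Return (daemon, host) pairs for each placement failure in log_text.

    Hypothetical triage helper, not a cephadm API.
    """
    return [
        (m.group("daemon"), m.group("host"))
        for m in PLACEMENT_FAILURE.finditer(log_text)
    ]


# Sample line copied from the log excerpt in this report (truncated).
sample = (
    "2022-05-17T03:37:16.947+0000 7f6b28096700  0 [cephadm ERROR root] "
    "Failed while placing mon.smithi163 on smithi163: "
    "cephadm exited with an error code: 1"
)
print(find_placement_failures(sample))
```

Since the archived log is gzip-compressed (ceph-mgr.b.log.gz), its text could be read with Python's gzip.open(path, "rt") before being passed to the helper.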
Actions #1

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Description updated (diff)
Actions #2

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Description updated (diff)
Actions #3

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Subject changed from cephadm:Fail to deploy monitor to cephadm: fail to deploy monitor
Actions #5

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

According to the test results for the following PRs combined:

https://github.com/ceph/ceph/pull/45511 - cephadm/services/cephadmservice: shutdown monitors before removing them
https://github.com/ceph/ceph/pull/46740 - src/mon/Paxos: Provide paxos with mon shutdown info
    Cherry-picked only 5e9348bce9557410e8a9291b96ba1edc7b7f7a10, since the other commits only add logging for debugging purposes.

30 jobs: 29/30 passed; 1 dead job, unrelated to stray daemons or to monitors.
https://pulpito.ceph.com/ksirivad-2022-06-28_12:50:56-orch:cephadm-wip-ksirivad-test-46749-and-45511-distro-default-smithi/

10 jobs: 8/10 passed; 1 dead job and 1 failure, both unrelated to stray daemons or to monitors.
https://pulpito.ceph.com/ksirivad-2022-06-27_21:11:03-orch:cephadm-wip-ksirivad-test-46749-and-45511-distro-default-smithi/ 

10 jobs: 10/10 passed.
https://pulpito.ceph.com/ksirivad-2022-06-27_20:01:50-orch:cephadm-wip-ksirivad-test-46749-and-45511-distro-default-smithi/ 

The monitor deployment failure did not recur in any of these runs, so I am marking this bug as irreproducible.

Actions #6

Updated by Kamoltat (Junior) Sirivadhna almost 2 years ago

  • Status changed from New to Can't reproduce