Bug #49539 (closed)

test_cephadm.sh failure: container.alertmanager.a doesn't start

Added by Sage Weil about 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

2021-02-28T19:44:39.976 INFO:tasks.workunit.client.0.smithi031.stderr:+ sudo /home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm --image docker.io/prom/alertmanager:v0.20.0 deploy --tcp-ports '9093 9094' --name container.alertmanager.a --fsid 00000000-0000-0000-0000-0000deadbeef --config-json -
2021-02-28T19:44:40.344 INFO:tasks.workunit.client.0.smithi031.stderr:Deploy daemon container.alertmanager.a ...
2021-02-28T19:44:40.345 INFO:tasks.workunit.client.0.smithi031.stderr:Verifying port 9093 ...
2021-02-28T19:44:40.346 INFO:tasks.workunit.client.0.smithi031.stderr:Verifying port 9094 ...
2021-02-28T19:44:40.346 INFO:tasks.workunit.client.0.smithi031.stderr:Verifying port 9093 ...
2021-02-28T19:44:40.346 INFO:tasks.workunit.client.0.smithi031.stderr:Verifying port 9094 ...
2021-02-28T19:44:40.347 INFO:tasks.workunit.client.0.smithi031.stderr:Creating custom container configuration dirs/files in /var/lib/ceph/00000000-0000-0000-0000-0000deadbeef/container.alertmanager.a ...
2021-02-28T19:44:40.347 INFO:tasks.workunit.client.0.smithi031.stderr:Creating directory: etc/alertmanager
2021-02-28T19:44:40.347 INFO:tasks.workunit.client.0.smithi031.stderr:Creating file: etc/alertmanager/alertmanager.yml
2021-02-28T19:44:41.189 INFO:tasks.workunit.client.0.smithi031.stderr:Non-zero exit code 1 from systemctl start ceph-00000000-0000-0000-0000-0000deadbeef@container.alertmanager.a
2021-02-28T19:44:41.189 INFO:tasks.workunit.client.0.smithi031.stderr:systemctl: stderr Job for ceph-00000000-0000-0000-0000-0000deadbeef@container.alertmanager.a.service failed because the control process exited with error code.
2021-02-28T19:44:41.189 INFO:tasks.workunit.client.0.smithi031.stderr:systemctl: stderr See "systemctl status ceph-00000000-0000-0000-0000-0000deadbeef@container.alertmanager.a.service" and "journalctl -xe" for details.
2021-02-28T19:44:41.292 INFO:tasks.workunit.client.0.smithi031.stderr:Traceback (most recent call last):
2021-02-28T19:44:41.293 INFO:tasks.workunit.client.0.smithi031.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm", line 7846, in <module>
2021-02-28T19:44:41.293 INFO:tasks.workunit.client.0.smithi031.stderr:    main()
2021-02-28T19:44:41.293 INFO:tasks.workunit.client.0.smithi031.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm", line 7835, in main
2021-02-28T19:44:41.293 INFO:tasks.workunit.client.0.smithi031.stderr:    r = ctx.func(ctx)
2021-02-28T19:44:41.294 INFO:tasks.workunit.client.0.smithi031.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm", line 1690, in _default_image
2021-02-28T19:44:41.294 INFO:tasks.workunit.client.0.smithi031.stderr:    return func(ctx)
2021-02-28T19:44:41.294 INFO:tasks.workunit.client.0.smithi031.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm", line 4168, in command_deploy
2021-02-28T19:44:41.294 INFO:tasks.workunit.client.0.smithi031.stderr:    deploy_daemon(ctx, ctx.fsid, daemon_type, daemon_id, c,
2021-02-28T19:44:41.294 INFO:tasks.workunit.client.0.smithi031.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm", line 2526, in deploy_daemon
2021-02-28T19:44:41.295 INFO:tasks.workunit.client.0.smithi031.stderr:    deploy_daemon_units(ctx, fsid, uid, gid, daemon_type, daemon_id,
2021-02-28T19:44:41.295 INFO:tasks.workunit.client.0.smithi031.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm", line 2726, in deploy_daemon_units
2021-02-28T19:44:41.295 INFO:tasks.workunit.client.0.smithi031.stderr:    call_throws(ctx, ['systemctl', 'start', unit_name])
2021-02-28T19:44:41.295 INFO:tasks.workunit.client.0.smithi031.stderr:  File "/home/ubuntu/cephtest/clone.client.0/qa/workunits/cephadm/../../../src/cephadm/cephadm", line 1384, in call_throws
2021-02-28T19:44:41.295 INFO:tasks.workunit.client.0.smithi031.stderr:    raise RuntimeError('Failed command: %s' % ' '.join(command))
2021-02-28T19:44:41.296 INFO:tasks.workunit.client.0.smithi031.stderr:RuntimeError: Failed command: systemctl start ceph-00000000-0000-0000-0000-0000deadbeef@container.alertmanager.a

/a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-smithi/5921275
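For context on the failure mode: the traceback ends in cephadm's call_throws helper, which raises as soon as "systemctl start" returns non-zero, so the deploy (and the workunit) aborts even though the real cause is inside the unit. A minimal sketch of that pattern, not the actual cephadm code:

import subprocess

def call_throws(command):
    # Run the command; raise with the joined command line on non-zero exit,
    # mirroring the "RuntimeError: Failed command: systemctl start ..." above.
    proc = subprocess.run(command, capture_output=True, text=True)
    if proc.returncode != 0:
        raise RuntimeError('Failed command: %s' % ' '.join(command))
    return proc.stdout, proc.stderr, proc.returncode

# The failing call from the traceback, for illustration only:
# call_throws(['systemctl', 'start',
#              'ceph-00000000-0000-0000-0000-0000deadbeef@container.alertmanager.a'])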

Related issues 1 (0 open, 1 closed)

Is duplicate of Orchestrator - Bug #48799: test_cephadm: stderr Job for container.alertmanager.a.service failed because a timeout was exceeded. (Can't reproduce)

Actions #1

Updated by Sebastian Wagner about 3 years ago

  • Is duplicate of Bug #48799: test_cephadm: stderr Job for container.alertmanager.a.service failed because a timeout was exceeded. added
Actions #2

Updated by Sebastian Wagner about 3 years ago

Sage Weil wrote:

[...]
/a/sage-2021-02-28_18:35:15-rados-wip-sage-testing-2021-02-28-1217-distro-basic-smithi/5921275

2021-02-28T19:44:47.123 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: : error: unrecognized arguments: --container-init
2021-02-28T19:44:47.123 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: Traceback (most recent call last):
2021-02-28T19:44:47.124 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1097, in _remote_connection
2021-02-28T19:44:47.124 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:     yield (conn, connr)
2021-02-28T19:44:47.124 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1025, in _run_cephadm
2021-02-28T19:44:47.125 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:     code, '\n'.join(err)))
2021-02-28T19:44:47.125 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: orchestrator._interface.OrchestratorError: cephadm exited with an error code: 2, stderr:usage:  [-h] [--image IMAGE] [--docker] [--data-dir DATA_DIR]
2021-02-28T19:44:47.125 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:         [--log-dir LOG_DIR] [--logrotate-dir LOGROTATE_DIR]
2021-02-28T19:44:47.125 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:         [--unit-dir UNIT_DIR] [--verbose] [--timeout TIMEOUT] [--retry RETRY]
2021-02-28T19:44:47.126 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:         [--env ENV]
2021-02-28T19:44:47.126 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:         {version,pull,inspect-image,ls,list-networks,adopt,rm-daemon,rm-cluster,run,shell,enter,ceph-volume,unit,logs,bootstrap,deploy,check-host,prepare-host,add-repo,rm-repo,install,registry-login,gather-facts,exporter,host-maintenance,verify-prereqs}
2021-02-28T19:44:47.126 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:         ...
2021-02-28T19:44:47.127 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: : error: unrecognized arguments: --container-init
2021-02-28T19:44:47.127 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: cluster 2021-02-28T19:44:40.107664+0000 mgr.x (mgr.14150) 170 : cluster [DBG] pgmap v93: 64 pgs: 2 active+undersized+degraded, 62 active+undersized; 16 B data, 10 MiB used, 6.0 GiB / 6.0 GiB avail; 181 B/s rd, 362 B/s wr, 0 op/s; 2/6 objects degraded (33.333%)
2021-02-28T19:44:47.128 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: cephadm 2021-02-28T19:44:40.116734+0000 mgr.x (mgr.14150) 171 : cephadm [INF] Deploying cephadm binary to smithi031
2021-02-28T19:44:47.128 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: cephadm 2021-02-28T19:44:40.201713+0000 mgr.x (mgr.14150) 172 : cephadm [INF]
2021-02-28T19:44:47.128 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:  Deploying daemon cephadm-exporter.smithi031 on smithi031
2021-02-28T19:44:47.131 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]: cluster 2021-02-28T19:44:40.570640+0000
2021-02-28T19:44:47.132 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:41 smithi031 conmon[29024]:  mon.a (mon.0) 228 : cluster [INF] Health check cleared: CEPHADM_PAUSED (was: cephadm background work is paused)
2021-02-28T19:44:47.132 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: cephadm 2021-02
2021-02-28T19:44:47.132 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: -28T19:44:41.502027+0000 mgr.x (mgr.14150) 173 : cephadm
2021-02-28T19:44:47.133 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: [ERR] cephadm exited with an error code: 1, stderr:This is a development version of cephadm.
2021-02-28T19:44:47.133 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: For information regarding the latest stable release:
2021-02-28T19:44:47.134 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]:     https://docs.ceph.com/docs/pacific/cephadm/install
2021-02-28T19:44:47.134 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: Deploy daemon cephadm-exporter.smithi031 ...
2021-02-28T19:44:47.134 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: ERROR: config must contain the following fields : key, crt, token
2021-02-28T19:44:47.134 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: Traceback (most recent call last):
2021-02-28T19:44:47.135 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]:   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1097, in _remote_connection
2021-02-28T19:44:47.135 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]:     yield (conn, connr)
2021-02-28T19:44:47.135 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]:   File "/usr/share/ceph/mgr/cephadm/serve.py", line 1025, in _run_cephadm
2021-02-28T19:44:47.136 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]:     code, '\n'.join(err)))
2021-02-28T19:44:47.136 INFO:tasks.workunit.client.0.smithi031.stdout:Feb 28 19:44:42 smithi031 conmon[29024]: orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:This is a development version of cephadm.

I think this is caused by one of the recently merged PRs: the cephadm binary deployed on the host doesn't recognize the --container-init argument that the mgr is passing.
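The exit code 2 and the usage dump are consistent with argparse rejecting an option it doesn't know: the mgr passes --container-init, but the cephadm binary on the host predates that option. A minimal standalone sketch (not the real cephadm parser) of how argparse produces exactly this behavior:

import argparse

# Hypothetical parser that, like the older cephadm binary on the host,
# has no '--container-init' option defined.
parser = argparse.ArgumentParser(prog='cephadm')
parser.add_argument('--image')

try:
    parser.parse_args(['--container-init'])
except SystemExit as e:
    # argparse prints "error: unrecognized arguments: --container-init"
    # to stderr and exits with status 2, matching the log above.
    print('exit status:', e.code)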

Actions #3

Updated by Sage Weil about 3 years ago

  • Status changed from New to Resolved

The problem was that the container image was being pulled from docker.io; it was a day old and didn't have the needed change.
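A quick way to confirm that kind of staleness is to check the image's Created timestamp against when the needed change merged. A sketch, assuming podman or docker is available on the node and the image has already been pulled; the image name below is just the one named in the log, substitute whichever image the run actually used:

import json
import subprocess

def image_created(image, runtime='podman'):
    # 'image inspect' emits a JSON array; the first element carries the
    # 'Created' timestamp of the local copy of the image.
    out = subprocess.check_output([runtime, 'image', 'inspect', image], text=True)
    return json.loads(out)[0]['Created']

print(image_created('docker.io/prom/alertmanager:v0.20.0'))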

