Bug #43618: cephadm: logs doesn't include logs of failed daemons - Orchestrator - Ceph

Bug #43618

closed

Added by Sebastian Wagner over 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Assignee: Michael Fritch
Category: cephadm (binary)
Target version: Ceph - v15.0.0
% Done:
Source: Development
Tags: cephadm
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID: 32752
Crash signature (v1):
Crash signature (v2):

Description

cephadm logs fails for a daemon that failed to start, because its container no longer exists:

# cephadm logs --name mon.node1
INFO:cephadm:Inferring fsid bdda7ef0-385b-11ea-bf04-5254003520f9
Error: no container with name or ID ceph-bdda7ef0-385b-11ea-bf04-5254003520f9-mon.node1 found: no such container

However, systemd still has log output for the daemon's unit:

# systemctl status ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service
● ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service - Ceph daemon for bdda7ef0-385b-11ea-bf04-5254003520f9
   Loaded: loaded (/etc/systemd/system/ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@.service; enabled; vendor preset: disabled)
   Active: failed (Result: exit-code) since Thu 2020-01-16 13:31:22 CET; 47min ago
  Process: 17300 ExecStopPost=/bin/bash /var/lib/ceph/bdda7ef0-385b-11ea-bf04-5254003520f9/mon.node1/unit.poststop (code=exited, status=0/SUCCESS)
  Process: 17159 ExecStart=/bin/bash /var/lib/ceph/bdda7ef0-385b-11ea-bf04-5254003520f9/mon.node1/unit.run (code=exited, status=1/FAILURE)
  Process: 17158 ExecStartPre=/usr/bin/install -d -m0770 -o 167 -g 167 /var/run/ceph/bdda7ef0-385b-11ea-bf04-5254003520f9 (code=exited, status=0/SUCCESS)
  Process: 17149 ExecStartPre=/usr/bin/podman rm ceph-bdda7ef0-385b-11ea-bf04-5254003520f9-mon.node1 (code=exited, status=1/FAILURE)
 Main PID: 17159 (code=exited, status=1/FAILURE)

Jan 16 13:31:12 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Unit entered failed state.
Jan 16 13:31:12 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Failed with result 'exit-code'.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Service RestartSec=10s expired, scheduling restart.
Jan 16 13:31:22 node1 systemd[1]: Stopped Ceph daemon for bdda7ef0-385b-11ea-bf04-5254003520f9.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Start request repeated too quickly.
Jan 16 13:31:22 node1 systemd[1]: Failed to start Ceph daemon for bdda7ef0-385b-11ea-bf04-5254003520f9.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Unit entered failed state.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Failed with result 'exit-code'.

It would probably make sense to use journalctl to fetch the logs instead:

# journalctl -u ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service | tail -n 12
Jan 16 13:31:12 node1 bash[17159]: 2020-01-16T12:31:12.548+0000 7f8d85bc8640  4 rocksdb: DB pointer 0x55eed6507200
Jan 16 13:31:12 node1 bash[17159]: 2020-01-16T12:31:12.548+0000 7f8d85bc8640  0 mon.node1 does not exist in monmap, will attempt to join an existing cluster
Jan 16 13:31:12 node1 bash[17159]: 2020-01-16T12:31:12.548+0000 7f8d85bc8640 -1 no public_addr or public_network specified, and mon.node1 not present in monmap or ceph.conf
Jan 16 13:31:12 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Main process exited, code=exited, status=1/FAILURE
Jan 16 13:31:12 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Unit entered failed state.
Jan 16 13:31:12 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Failed with result 'exit-code'.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Service RestartSec=10s expired, scheduling restart.
Jan 16 13:31:22 node1 systemd[1]: Stopped Ceph daemon for bdda7ef0-385b-11ea-bf04-5254003520f9.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Start request repeated too quickly.
Jan 16 13:31:22 node1 systemd[1]: Failed to start Ceph daemon for bdda7ef0-385b-11ea-bf04-5254003520f9.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Unit entered failed state.
Jan 16 13:31:22 node1 systemd[1]: ceph-bdda7ef0-385b-11ea-bf04-5254003520f9@mon.node1.service: Failed with result 'exit-code'.
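
The journalctl idea above could be sketched roughly as follows. This is only an illustration, not the change that was actually merged: the helper names (get_unit_name, journalctl_command, fetch_logs) are hypothetical, and the unit name pattern ceph-<fsid>@<daemon>.service is assumed from the systemctl output in this report. The point is that the logs are looked up by systemd unit rather than by container name, so they remain available after the container has been removed.

```python
import subprocess
from typing import List


def get_unit_name(fsid: str, daemon_name: str) -> str:
    # Assumed pattern, taken from the unit shown in this report:
    # ceph-<fsid>@<daemon>.service
    return f"ceph-{fsid}@{daemon_name}.service"


def journalctl_command(fsid: str, daemon_name: str, lines: int = 0) -> List[str]:
    # Build the journalctl invocation for the daemon's systemd unit.
    # -u selects the unit, -n limits output to the last N lines.
    cmd = ["journalctl", "--no-pager", "-u", get_unit_name(fsid, daemon_name)]
    if lines > 0:
        cmd += ["-n", str(lines)]
    return cmd


def fetch_logs(fsid: str, daemon_name: str, lines: int = 0) -> str:
    # Query the journal instead of the container runtime, so the logs
    # of a failed daemon can still be read.
    result = subprocess.run(
        journalctl_command(fsid, daemon_name, lines),
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
        check=False,
    )
    return result.stdout
```

For the daemon in this report, fetch_logs("bdda7ef0-385b-11ea-bf04-5254003520f9", "mon.node1", lines=12) would run the equivalent of the journalctl command shown above, using -n 12 instead of piping through tail.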