Bug #44272

closed

on SUSE, crash daemon starts but then always stops a couple minutes later

Added by Nathan Cutler about 4 years ago. Updated about 4 years ago.

Status: Resolved
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

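Recently, cephadm/orchestrator started deploying the crash daemon on all cluster nodes.

On SUSE (at least), the crash daemon does not stay up for long. It starts successfully, but systemd always stops it a few minutes later (just under six minutes in the log below), and the stop itself fails with a podman "permission denied" error. The journal for the unit shows: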

# journalctl -u "ceph-899b6a04-5715-11ea-9d8c-525400f299cb@crash.admin.service" | head
-- Logs begin at Mon 2020-02-24 15:47:30 CET, end at Mon 2020-02-24 16:18:38 CET. --
Feb 24 15:54:29 admin systemd[1]: Starting Ceph crash.admin for 899b6a04-5715-11ea-9d8c-525400f299cb...
Feb 24 15:54:29 admin podman[15929]: Error: no container with name or ID ceph-899b6a04-5715-11ea-9d8c-525400f299cb-crash.admin found: no such container
Feb 24 15:54:29 admin systemd[1]: Started Ceph crash.admin for 899b6a04-5715-11ea-9d8c-525400f299cb.
Feb 24 15:54:30 admin bash[15941]: INFO:ceph-crash:monitoring path /var/lib/ceph/crash, delay 600s
Feb 24 16:00:16 admin systemd[1]: Stopping Ceph crash.admin for 899b6a04-5715-11ea-9d8c-525400f299cb...
Feb 24 16:00:16 admin podman[20703]: time="2020-02-24T16:00:16+01:00" level=error msg="container_linux.go:389: signaling init process caused \"permission denied\"" 
Feb 24 16:00:16 admin podman[20703]: container_linux.go:389: signaling init process caused "permission denied" 
Feb 24 16:00:16 admin podman[20703]: Error: permission denied
Feb 24 16:00:31 admin systemd[1]: ceph-899b6a04-5715-11ea-9d8c-525400f299cb@crash.admin.service: State 'stop-post' timed out. Terminating.
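
To put a number on "a couple minutes", the interval between the "Started" and "Stopping" messages can be computed from the journal lines above. The following is only a rough sketch: the helper names `parse_ts` and `uptime_seconds` are made up for illustration, and the year is supplied by hand because the default journalctl output omits it.

```python
from datetime import datetime

# The two relevant journal lines, copied from the log above.
JOURNAL = """\
Feb 24 15:54:29 admin systemd[1]: Started Ceph crash.admin for 899b6a04-5715-11ea-9d8c-525400f299cb.
Feb 24 16:00:16 admin systemd[1]: Stopping Ceph crash.admin for 899b6a04-5715-11ea-9d8c-525400f299cb...
"""


def parse_ts(line, year=2020):
    """Parse the leading 'Mon DD HH:MM:SS' stamp of a journal line.

    The short journalctl format has no year, so it is passed in (assumption).
    """
    stamp = " ".join(line.split()[:3])
    return datetime.strptime(f"{year} {stamp}", "%Y %b %d %H:%M:%S")


def uptime_seconds(journal):
    """Return seconds between the first 'Started' and the next 'Stopping'."""
    started = stopping = None
    for line in journal.splitlines():
        if "systemd[1]: Started " in line:
            started = parse_ts(line)
        elif "systemd[1]: Stopping " in line and started is not None:
            stopping = parse_ts(line)
            break
    if started is None or stopping is None:
        return None
    return (stopping - started).total_seconds()


print(uptime_seconds(JOURNAL))  # 347.0
```

That is 5 minutes 47 seconds, which is shorter than the 600s delay that ceph-crash reports at startup. Whether the two are related is not established here.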

Files

dmesg.out (35.9 KB), Nathan Cutler, 03/05/2020 03:43 PM