Bug #47916

closed

podman containers running in a detached state do not output logs to journald

Added by Michael Fritch over 3 years ago. Updated about 3 years ago.

Status: Resolved
Priority: Normal
Category: cephadm (binary)
Target version: -
% Done: 0%
Source: Development
Regression: No
Severity: 3 - minor

Description

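When a service fails, it can be difficult to determine the actual cause, because the journal contains only systemd, bash, and podman messages and none of the daemon's own output: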

# cephadm logs --name nfs.bar.host1 -- -n 50 
Inferring fsid 9cfc6bda-89b5-413b-830f-01c8ddad63ec
-- Logs begin at Sat 2020-10-03 13:38:32 MDT, end at Tue 2020-10-20 11:04:26 MDT. --
Oct 14 08:19:10 host1 bash[11662]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove found: no such container
Oct 14 08:19:11 host1 bash[11704]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove" found: no such container
Oct 14 08:19:26 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Failed with result 'exit-code'.
Oct 14 08:19:36 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Scheduled restart job, restart counter is at 2. 
Oct 14 08:19:36 host1 systemd[1]: Stopped Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec.
Oct 14 08:19:36 host1 systemd[1]: Starting Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec...
Oct 14 08:19:36 host1 podman[11911]: Error: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1 found: no such container
Oct 14 08:19:36 host1 bash[11955]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add found: no such container
Oct 14 08:19:36 host1 bash[11996]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add" found: no such container
Oct 14 08:19:48 host1 bash[12197]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1 found: no such container
Oct 14 08:19:48 host1 bash[12239]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1" found: no such container
Oct 14 08:19:54 host1 bash[12285]: 23af0d3df127d4c29c1ea37ba74f673999ec1770abacf4eb65b8cf3a5b4333c3
Oct 14 08:19:54 host1 systemd[1]: Started Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec.
Oct 14 08:20:06 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Main process exited, code=exited, status=1/FAILURE
Oct 14 08:20:06 host1 bash[12482]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove found: no such container
Oct 14 08:20:06 host1 bash[12527]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove" found: no such container
Oct 14 08:20:19 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Failed with result 'exit-code'.
Oct 14 08:20:29 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Scheduled restart job, restart counter is at 3. 
Oct 14 08:20:29 host1 systemd[1]: Stopped Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec.
Oct 14 08:20:29 host1 systemd[1]: Starting Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec...
Oct 14 08:20:29 host1 podman[12744]: Error: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1 found: no such container
Oct 14 08:20:29 host1 bash[12788]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add found: no such container
Oct 14 08:20:30 host1 bash[12837]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add" found: no such container
Oct 14 08:20:38 host1 bash[13035]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1 found: no such container
Oct 14 08:20:38 host1 bash[13077]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1" found: no such container
Oct 14 08:20:44 host1 bash[13119]: 995fbda5de8426d8c0a54aa76d54f4689ffa87c501659f241d814fe2d7ae015a
Oct 14 08:20:44 host1 systemd[1]: Started Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec.
Oct 14 08:20:54 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Main process exited, code=exited, status=1/FAILURE
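
One way to confirm that the daemon's own output is missing is to count the journal lines per emitting process. The following is only a minimal sketch: `count_by_identifier` and `LINE_RE` are hypothetical names, and the regular expression assumes the short journalctl line format shown in the excerpt above.

```python
import re
from collections import Counter

# Matches the short journalctl format seen above, e.g.
# "Oct 14 08:19:36 host1 podman[11911]: message"
LINE_RE = re.compile(r"^\w{3} +\d+ [\d:]{8} (\S+) ([^\[\s]+)\[(\d+)\]: (.*)$")


def count_by_identifier(lines):
    """Count journal lines per emitting process (hypothetical helper)."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if match:
            # Group 2 is the process name that precedes "[pid]".
            counts[match.group(2)] += 1
    return counts


# Three lines copied from the excerpt above.
sample = [
    "Oct 14 08:19:36 host1 systemd[1]: Starting Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec...",
    "Oct 14 08:19:36 host1 podman[11911]: Error: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1 found: no such container",
    "Oct 14 08:19:54 host1 bash[12285]: 23af0d3df127d4c29c1ea37ba74f673999ec1770abacf4eb65b8cf3a5b4333c3",
]

counts = count_by_identifier(sample)
print(counts)
print("conmon" in counts)
```

In this sample only systemd, podman, and bash appear. No line comes from conmon, the process that would relay the container's stdout and stderr.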


Related issues 1 (0 open, 1 closed)

Related to Orchestrator - Bug #49551: cephadm journald logs are mangled (Resolved)

#1

Updated by Michael Fritch over 3 years ago

Configuring journald as the podman log driver lets the conmon entries in the journal show the actual output of the failed daemon container:

# cephadm logs --name nfs.bar.host1 -- -n 30 
Inferring fsid 9cfc6bda-89b5-413b-830f-01c8ddad63ec
-- Logs begin at Sat 2020-10-03 13:38:32 MDT, end at Tue 2020-10-20 11:07:23 MDT. --
Oct 14 08:22:14 host1 systemd[1]: Stopped Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec.
Oct 14 08:22:14 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Start request repeated too quickly.
Oct 14 08:22:14 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Failed with result 'exit-code'.
Oct 14 08:22:14 host1 systemd[1]: Failed to start Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec.
Oct 20 11:04:59 host1 systemd[1]: Starting Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec...
Oct 20 11:04:59 host1 podman[16752]: Error: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1 found: no such container
Oct 20 11:04:59 host1 bash[16798]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add found: no such container
Oct 20 11:04:59 host1 bash[16842]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-add" found: no such container
Oct 20 11:05:38 host1 bash[17068]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1 found: no such container
Oct 20 11:05:38 host1 bash[17113]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1" found: no such container
Oct 20 11:05:53 host1 bash[17158]: d72d1dd29626b9970460937cfbdf15bb4c9cd0040346e4241dd63ba44401eb02
Oct 20 11:05:53 host1 systemd[1]: Started Ceph nfs.bar.host1 for 9cfc6bda-89b5-413b-830f-01c8ddad63ec.
Oct 20 11:05:55 host1 conmon[17255]: 20/10/2020 17:05:55 : epoch 5f8f18f3 : host1 : ganesha.nfsd-1[main] main :MAIN :EVENT :ganesha.nfsd Starting: Ganesha Version 3.3
Oct 20 11:06:05 host1 conmon[17255]: 20/10/2020 17:06:05 : epoch 5f8f18f3 : host1 : ganesha.nfsd-1[main] nfs_set_param_from_conf :NFS STARTUP :EVENT :Configuration file successfully parsed
Oct 20 11:06:05 host1 conmon[17255]: 20/10/2020 17:06:05 : epoch 5f8f18f3 : host1 : ganesha.nfsd-1[main] init_server_pkgs :NFS STARTUP :EVENT :Initializing ID Mapper.
Oct 20 11:06:06 host1 conmon[17255]: 20/10/2020 17:06:06 : epoch 5f8f18f3 : host1 : ganesha.nfsd-1[main] init_server_pkgs :NFS STARTUP :EVENT :ID Mapper successfully initialized.
Oct 20 11:06:11 host1 conmon[17255]: 20/10/2020 17:06:11 : epoch 5f8f18f3 : host1 : ganesha.nfsd-1[main] nfs_start_grace :STATE :EVENT :NFS Server Now IN GRACE, duration 90 
Oct 20 11:06:33 host1 conmon[17255]: 20/10/2020 17:06:33 : epoch 5f8f18f3 : host1 : ganesha.nfsd-1[main] nfs_lift_grace_locked :STATE :EVENT :NFS Server Now NOT IN GRACE
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.141+0000 7f878de25400 -1 auth: unable to find a keyring on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.141+0000 7f878de25400 -1 AuthRegistry(0x55897c3c5290) no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling cephx
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.141+0000 7f878de25400 -1 auth: unable to find a keyring on /var/lib/ceph/radosgw/ceph-admin/keyring: (2) No such file or directory
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.141+0000 7f878de25400 -1 AuthRegistry(0x7fffe7f93160) no keyring found at /var/lib/ceph/radosgw/ceph-admin/keyring, disabling cephx
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.145+0000 7f8723fff700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.145+0000 7f8722ffd700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.145+0000 7f87237fe700 -1 monclient(hunting): handle_auth_bad_method server allowed_methods [2] but i only support [1]
Oct 20 11:06:36 host1 conmon[17255]: 2020-10-20T17:06:36.145+0000 7f878de25400 -1 monclient: authenticate NOTE: no keyring found; disabled cephx authentication
Oct 20 11:06:36 host1 conmon[17255]: failed to fetch mon config (--no-mon-config to skip)
Oct 20 11:06:54 host1 systemd[1]: ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec@nfs.bar.host1.service: Main process exited, code=exited, status=1/FAILURE
Oct 20 11:06:54 host1 bash[17441]: Error: Failed to evict container: "": Failed to find container "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove" in state: no container with name or ID ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove found: no such container
Oct 20 11:06:55 host1 bash[17487]: Error: no container with ID or name "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1-grace-remove" found: no such container
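
The log driver change described in this comment could be sketched as follows. This is an illustration, not the cephadm implementation: `build_run_args` is a hypothetical helper and the image name is a placeholder. Only the `podman run` flags `-d`, `--rm`, `--name`, and `--log-driver` are real.

```python
def build_run_args(image, name, detach=True, log_driver="journald"):
    """Build an argument list for `podman run` (illustrative, not cephadm code)."""
    args = ["podman", "run", "--rm", "--name", name]
    if detach:
        # A detached container has no terminal attached to its stdout/stderr,
        # so the log driver decides where that output ends up.
        args += ["-d", "--log-driver", log_driver]
    args.append(image)
    return args


# The image name is a placeholder (assumption); the container name is taken
# from the journal excerpt above.
cmd = build_run_args(
    "example.invalid/ceph/daemon:latest",
    "ceph-9cfc6bda-89b5-413b-830f-01c8ddad63ec-nfs.bar.host1",
)
print(" ".join(cmd))
```

The sketch builds the command as a list so that it could be passed to `subprocess.run` without shell quoting.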

#2

Updated by Sebastian Wagner about 3 years ago

  • Status changed from New to Pending Backport

#3

Updated by Sebastian Wagner about 3 years ago

  • Related to Bug #49551: cephadm journald logs are mangled added

#4

Updated by Sebastian Wagner about 3 years ago

How can we fix this for Octopus?

#5

Updated by Sebastian Wagner about 3 years ago

  • Status changed from Pending Backport to Resolved