Project

General

Profile

Actions

Bug #53175

open

podman: failed to exec pid1: Exec format error: wrongly using the amd64-only digest

Added by Lukas Mayer over 2 years ago. Updated 8 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
container tools
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Introduction

I have an experimental cluster where I experiment with hosts with different Instruction Set Architectures (x86_64, aarch64) in the same cluster.

Basic Information

  • Ceph Pacific == 16.2.6
  • OpenSUSE Tumbleweed
  • Podman 3.4.1
  • Kernel 5.14.14-4.g2b5383f-default

Error description

Deploying daemons on a x86_64 host works, while on an aarch64 hosts, it fails.
The hosts are basically configured identical (same Tumbleweed Snapshot), except having a different instruction set architecture.

This issue occurs on Ceph Pacific v16.2.6, but not on v16.2.5. If trying 'ceph orch upgrade' from 16.2.5 to 16.2.6, the upgrade will work on x86_64, but not on aarch64 hosts.

Error message

The basic problem appears to be, that

2021-11-04 17:30:29,925 DEBUG Running command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -e NODE_NAME=node1b54.*** -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -c %u %g /var/lib/ceph
2021-11-04 17:30:32,328 DEBUG stat: ERROR (catatonit:7): failed to exec pid1: Exec format error

Some findings

If called from CLI, the above command does indeed output an error message regardless of which host it is run on:

# /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -e NODE_NAME=mon2.*** -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -c %u %g /var/lib/ceph
stat: cannot stat '%g': No such file or directory
167

I have fixed this issue by adding the necessary quotes around '%u %g'. When called directly from CLI the there is no more error. However the 'ERROR (catatonit:7): failed to exec pid1' remains (on aarch64), if cephadm tries to deploy a crash daemon when calling 'ceph orch host add ...'.

The Cephadm logs on the x86_64 host show, that the command is working with cephadm despite of the quotes:

2021-11-04 23:55:25,667 DEBUG Running command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -e NODE_NAME=mon2.*** -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -c %u %g /var/lib/ceph
2021-11-04 23:55:26,832 DEBUG stat: 167 167

Full Error message

On the aarch64 host, it shows:

2021-11-04 17:30:29,924 INFO Deploy daemon crash.node1b54 ...
2021-11-04 17:30:29,925 DEBUG Running command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -e NODE_NAME=node1b54.*** -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -c %u %g /var/lib/ceph
2021-11-04 17:30:32,328 DEBUG stat: ERROR (catatonit:7): failed to exec pid1: Exec format error
2021-11-04 17:30:33,625 INFO Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -e NODE_NAME=node1b54.*** -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -c %u %g /var/lib/ceph
2021-11-04 17:30:33,625 INFO stat: stderr ERROR (catatonit:7): failed to exec pid1: Exec format error
2021-11-04 17:31:34,140 DEBUG Running command: /usr/bin/podman version --format {{.Client.Version}}
2021-11-04 17:31:34,265 DEBUG /usr/bin/podman: 3.4.1
2021-11-04 17:31:34,339 DEBUG Running command: /usr/bin/podman stats --format {{.ID}},{{.MemUsage}} --no-stream
2021-11-04 17:31:34,536 DEBUG Using specified fsid: 81cd0e70-3d89-11ec-977f-0390d6773ed2
2021-11-04 17:31:34,537 DEBUG Running command: /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -e NODE_NAME=node1b54.*** -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -c %u %g /var/lib/ceph
2021-11-04 17:31:36,729 DEBUG stat: ERROR (catatonit:7): failed to exec pid1: Exec format error
2021-11-04 17:31:37,939 INFO Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --stop-signal=SIGTERM --net=host --entrypoint stat --init -e CONTAINER_IMAGE=quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -e NODE_NAME=node1b54.*** -e CEPH_USE_RANDOM_NONCE=1 quay.io/ceph/ceph@sha256:285d2abb6e74bdc6e15e1af585aa19f132045651b7a80eb77b9cec8a785ff330 -c %u %g /var/lib/ceph
2021-11-04 17:31:37,939 INFO stat: stderr ERROR (catatonit:7): failed to exec pid1: Exec format error
2021-11-04 17:31:38,433 DEBUG Running command: /usr/bin/podman version --format {{.Client.Version}}
2021-11-04 17:31:38,555 DEBUG /usr/bin/podman: 3.4.1
2021-11-04 17:31:38,661 DEBUG Running command: sysctl -a
2021-11-04 17:31:38,725 DEBUG Running command: sysctl -a
2021-11-04 17:31:38,788 DEBUG Running command: sestatus
2021-11-04 17:31:38,795 DEBUG sestatus: SELinux status:                 enabled
2021-11-04 17:31:38,795 DEBUG sestatus: SELinuxfs mount:                /sys/fs/selinux
2021-11-04 17:31:38,796 DEBUG sestatus: SELinux root directory:         /etc/selinux
2021-11-04 17:31:38,796 DEBUG sestatus: Loaded policy name:             targeted
2021-11-04 17:31:38,796 DEBUG sestatus: Current mode:                   enforcing
2021-11-04 17:31:38,796 DEBUG sestatus: Mode from config file:          enforcing
2021-11-04 17:31:38,796 DEBUG sestatus: Policy MLS status:              enabled
2021-11-04 17:31:38,796 DEBUG sestatus: Policy deny_unknown status:     allowed
2021-11-04 17:31:38,796 DEBUG sestatus: Memory protection checking:     requested (insecure)
2021-11-04 17:31:38,797 DEBUG sestatus: Max kernel policy version:      33
2021-11-04 17:31:38,798 DEBUG Running command: sestatus
2021-11-04 17:31:38,805 DEBUG sestatus: SELinux status:                 enabled
2021-11-04 17:31:38,805 DEBUG sestatus: SELinuxfs mount:                /sys/fs/selinux
2021-11-04 17:31:38,805 DEBUG sestatus: SELinux root directory:         /etc/selinux
2021-11-04 17:31:38,805 DEBUG sestatus: Loaded policy name:             targeted
2021-11-04 17:31:38,806 DEBUG sestatus: Current mode:                   enforcing
2021-11-04 17:31:38,806 DEBUG sestatus: Mode from config file:          enforcing
2021-11-04 17:31:38,806 DEBUG sestatus: Policy MLS status:              enabled
2021-11-04 17:31:38,806 DEBUG sestatus: Policy deny_unknown status:     allowed
2021-11-04 17:31:38,806 DEBUG sestatus: Memory protection checking:     requested (insecure)
2021-11-04 17:31:38,806 DEBUG sestatus: Max kernel policy version:      33
2021-11-04 17:31:38,808 DEBUG Running command: sestatus
2021-11-04 17:31:38,815 DEBUG sestatus: SELinux status:                 enabled
2021-11-04 17:31:38,816 DEBUG sestatus: SELinuxfs mount:                /sys/fs/selinux
2021-11-04 17:31:38,816 DEBUG sestatus: SELinux root directory:         /etc/selinux
2021-11-04 17:31:38,816 DEBUG sestatus: Loaded policy name:             targeted
2021-11-04 17:31:38,816 DEBUG sestatus: Current mode:                   enforcing
2021-11-04 17:31:38,816 DEBUG sestatus: Mode from config file:          enforcing
2021-11-04 17:31:38,816 DEBUG sestatus: Policy MLS status:              enabled
2021-11-04 17:31:38,816 DEBUG sestatus: Policy deny_unknown status:     allowed
2021-11-04 17:31:38,817 DEBUG sestatus: Memory protection checking:     requested (insecure)
2021-11-04 17:31:38,817 DEBUG sestatus: Max kernel policy version:      33
2021-11-04 17:31:38,818 DEBUG Running command: sestatus
2021-11-04 17:31:38,825 DEBUG sestatus: SELinux status:                 enabled
2021-11-04 17:31:38,825 DEBUG sestatus: SELinuxfs mount:                /sys/fs/selinux
2021-11-04 17:31:38,825 DEBUG sestatus: SELinux root directory:         /etc/selinux
2021-11-04 17:31:38,825 DEBUG sestatus: Loaded policy name:             targeted
2021-11-04 17:31:38,825 DEBUG sestatus: Current mode:                   enforcing
2021-11-04 17:31:38,826 DEBUG sestatus: Mode from config file:          enforcing
2021-11-04 17:31:38,826 DEBUG sestatus: Policy MLS status:              enabled
2021-11-04 17:31:38,826 DEBUG sestatus: Policy deny_unknown status:     allowed
2021-11-04 17:31:38,826 DEBUG sestatus: Memory protection checking:     requested (insecure)
2021-11-04 17:31:38,826 DEBUG sestatus: Max kernel policy version:      33
2021-11-04 17:31:38,827 DEBUG Running command: sestatus
2021-11-04 17:31:38,834 DEBUG sestatus: SELinux status:                 enabled
2021-11-04 17:31:38,835 DEBUG sestatus: SELinuxfs mount:                /sys/fs/selinux
2021-11-04 17:31:38,835 DEBUG sestatus: SELinux root directory:         /etc/selinux
2021-11-04 17:31:38,835 DEBUG sestatus: Loaded policy name:             targeted
2021-11-04 17:31:38,835 DEBUG sestatus: Current mode:                   enforcing
2021-11-04 17:31:38,835 DEBUG sestatus: Mode from config file:          enforcing
2021-11-04 17:31:38,836 DEBUG sestatus: Policy MLS status:              enabled
2021-11-04 17:31:38,836 DEBUG sestatus: Policy deny_unknown status:     allowed
2021-11-04 17:31:38,836 DEBUG sestatus: Memory protection checking:     requested (insecure)
2021-11-04 17:31:38,836 DEBUG sestatus: Max kernel policy version:      33
2021-11-04 17:31:38,837 DEBUG Running command: sestatus
2021-11-04 17:31:38,844 DEBUG sestatus: SELinux status:                 enabled
2021-11-04 17:31:38,844 DEBUG sestatus: SELinuxfs mount:                /sys/fs/selinux
2021-11-04 17:31:38,845 DEBUG sestatus: SELinux root directory:         /etc/selinux
2021-11-04 17:31:38,845 DEBUG sestatus: Loaded policy name:             targeted
2021-11-04 17:31:38,845 DEBUG sestatus: Current mode:                   enforcing
2021-11-04 17:31:38,845 DEBUG sestatus: Mode from config file:          enforcing
2021-11-04 17:31:38,845 DEBUG sestatus: Policy MLS status:              enabled
2021-11-04 17:31:38,845 DEBUG sestatus: Policy deny_unknown status:     allowed
2021-11-04 17:31:38,845 DEBUG sestatus: Memory protection checking:     requested (insecure)
2021-11-04 17:31:38,846 DEBUG sestatus: Max kernel policy version:      33
2021-11-04 17:31:39,418 DEBUG Running command: /usr/bin/podman version --format {{.Client.Version}}
2021-11-04 17:31:39,543 DEBUG /usr/bin/podman: 3.4.1
2021-11-04 17:31:39,616 DEBUG Acquiring lock 281473443482160 on /run/cephadm/81cd0e70-3d89-11ec-977f-0390d6773ed2.lock
2021-11-04 17:31:39,616 DEBUG Lock 281473443482160 acquired on /run/cephadm/81cd0e70-3d89-11ec-977f-0390d6773ed2.lock
2021-11-04 17:31:39,616 DEBUG Running command: systemctl is-enabled ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2@crash.node1b54
2021-11-04 17:31:39,633 DEBUG systemctl: Failed to get unit file state for ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2@crash.node1b54.service: No such file or directory
2021-11-04 17:31:39,635 DEBUG Running command: systemctl is-active ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2@crash.node1b54
2021-11-04 17:31:39,651 DEBUG systemctl: inactive
2021-11-04 17:31:39,653 DEBUG Running command: /usr/bin/podman container inspect --format {{.State.Status}} ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash-node1b54
2021-11-04 17:31:39,778 DEBUG /usr/bin/podman: Error: error inspecting object: no such container ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash-node1b54
2021-11-04 17:31:39,785 INFO Non-zero exit code 125 from /usr/bin/podman container inspect --format {{.State.Status}} ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash-node1b54
2021-11-04 17:31:39,786 INFO /usr/bin/podman: stderr Error: error inspecting object: no such container ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash-node1b54
2021-11-04 17:31:39,786 DEBUG Running command: /usr/bin/podman container inspect --format {{.State.Status}} ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash.node1b54
2021-11-04 17:31:39,910 DEBUG /usr/bin/podman: Error: error inspecting object: no such container ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash.node1b54
2021-11-04 17:31:39,917 INFO Non-zero exit code 125 from /usr/bin/podman container inspect --format {{.State.Status}} ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash.node1b54
2021-11-04 17:31:39,917 INFO /usr/bin/podman: stderr Error: erro inspecting object: no such container ceph-81cd0e70-3d89-11ec-977f-0390d6773ed2-crash.node1b54

Final thoughts

This is most certainly worth to investigate.

But it is weird enough that identically set up hosts must have started to behave differently somewhere between Ceph v.16.2.5 → v16.2.6.

Looking at git diff v16.2.5 v16.2.6 -- src/cephadm/cephadm I was unable to identify an obvious cause (yet).

If you have further questions or need me to test something in particular, please ask.


Related issues 2 (0 open2 closed)

Related to Orchestrator - Bug #49232: standard_init_linux.go:211: exec user process caused "exec format error"Can't reproduce

Actions
Related to Orchestrator - Bug #44990: cephadm: exec: "/usr/bin/ceph-mon": stat /usr/bin/ceph-mon: no such file or directoryCan't reproduceSebastian Wagner

Actions
Actions

Also available in: Atom PDF