Bug #50295
cephadm bootstrap mon container fails to start with podman 3.1 in CentOS 8 Stream
Status: Closed
Priority: Normal
Assignee:
Category: cephadm
Target version:
% Done: 0%
Source: Community (user)
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
When bootstrapping a cluster with cephadm on CentOS Stream, the initial mon container fails to start under systemd with 'unknown capability "CAP_PERFMON"'. The failure appeared after the AppStream podman package changed from version 3.0.0-0.33rc2.module_el8.4.0+673+eabfc99d to 3.1.0-0.13.module_el8.5.0+733+9bb5dffa.
Tested with pacific and octopus builds on a minimal install of CentOS Stream running in KVM VMs.
The workaround was to start from an older release of CentOS Stream and, before applying updates, versionlock podman at 3.0 using python3-dnf-plugin-versionlock.
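The workaround could be scripted roughly as follows. This is only a sketch, assuming the host still has podman 3.0 installed and that it is run as root; the `run` and `pin_podman` helpers and the `DRY_RUN` switch are hypothetical names introduced here for illustration, not part of dnf or cephadm.

```shell
#!/usr/bin/env bash
# Sketch of the versionlock workaround described above.
# Assumption: podman 3.0 is the currently installed version.
# DRY_RUN is a hypothetical switch for this sketch, not a dnf option.

# Print the command when DRY_RUN is set, otherwise execute it.
run() {
    if [ -n "${DRY_RUN:-}" ]; then
        echo "$*"
    else
        "$@"
    fi
}

pin_podman() {
    # Install the versionlock plugin named in the report.
    run dnf install -y python3-dnf-plugin-versionlock || return 1
    # Lock podman at the version that is currently installed.
    run dnf versionlock add podman || return 1
    # Apply the remaining updates; the lock should hold podman back.
    run dnf update -y
}
```

Running `DRY_RUN=1 pin_podman` prints the three dnf commands without executing them, which gives a chance to review them first; `pin_podman` on its own executes them.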
[root@ceph01 danrp]# uname -a
Linux ceph01 4.18.0-294.el8.x86_64 #1 SMP Mon Mar 15 22:38:42 UTC 2021 x86_64 x86_64 x86_64 GNU/Linux
[root@ceph01 danrp]# podman version
Version: 3.1.0-dev
API Version: 3.1.0-dev
Go Version: go1.16.1
Built: Fri Mar 26 18:32:03 2021
OS/Arch: linux/amd64
[root@ceph01 danrp]# curl --silent --remote-name --location https://github.com/ceph/ceph/raw/pacific/src/cephadm/cephadm
[root@ceph01 danrp]# chmod +x cephadm
[root@ceph01 danrp]# ./cephadm bootstrap --mon-ip 10.10.4.14
Creating directory /etc/ceph for ceph.conf
Verifying podman|docker is present...
Verifying lvm2 is present...
Verifying time synchronization is in place...
Unit chronyd.service is enabled and running
Repeating the final host check...
podman|docker (/bin/podman) is present
systemctl is present
lvcreate is present
Unit chronyd.service is enabled and running
Host looks OK
Cluster fsid: 41cdf9dc-9876-11eb-a66f-5254000fb543
Verifying IP 10.10.4.14 port 3300 ...
Verifying IP 10.10.4.14 port 6789 ...
Mon IP 10.10.4.14 is in CIDR network 10.10.4.0/24
- internal network (--cluster-network) has not been provided, OSD replication will default to the public_network
Pulling container image docker.io/ceph/ceph:v16...
Ceph version: ceph version 16.2.0 (0c2054e95bcd9b30fdd908a79ac1d8bbc3394442) pacific (stable)
Extracting ceph user uid/gid from container image...
Creating initial keys...
Creating initial monmap...
Creating mon...
Non-zero exit code 1 from systemctl start ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01
systemctl: stderr Job for ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01.service failed because the control process exited with error code.
systemctl: stderr See "systemctl status ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01.service" and "journalctl -xe" for details.
Traceback (most recent call last):
  File "./cephadm", line 7924, in <module>
    main()
  File "./cephadm", line 7912, in main
    r = ctx.func(ctx)
  File "./cephadm", line 1717, in _default_image
    return func(ctx)
  File "./cephadm", line 3909, in command_bootstrap
    create_mon(ctx, uid, gid, fsid, mon_id)
  File "./cephadm", line 3536, in create_mon
    config=None, keyring=None)
  File "./cephadm", line 2561, in deploy_daemon
    c, osd_fsid=osd_fsid, ports=ports)
  File "./cephadm", line 2757, in deploy_daemon_units
    call_throws(ctx, ['systemctl', 'start', unit_name])
  File "./cephadm", line 1411, in call_throws
    raise RuntimeError('Failed command: %s' % ' '.join(command))
RuntimeError: Failed command: systemctl start ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01
[root@ceph01 danrp]# journalctl -lef -u ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01
-- Logs begin at Thu 2021-04-08 15:21:42 BST. --
Apr 08 15:26:15 ceph01 systemd[1]: Starting Ceph mon.ceph01 for 41cdf9dc-9876-11eb-a66f-5254000fb543...
Apr 08 15:26:16 ceph01 bash[2432]: Error: OCI runtime error: container_linux.go:370: starting container process caused: unknown capability "CAP_PERFMON"
Apr 08 15:26:16 ceph01 systemd[1]: ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01.service: Control process exited, code=exited status=126
Apr 08 15:26:16 ceph01 systemd[1]: ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01.service: Failed with result 'exit-code'.
Apr 08 15:26:16 ceph01 systemd[1]: Failed to start Ceph mon.ceph01 for 41cdf9dc-9876-11eb-a66f-5254000fb543.
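To check whether another host is hitting the same failure, the rejected capability can be pulled out of the unit's journal. The helper below is a minimal sketch; `unknown_capability` is a hypothetical name, and it only assumes the error text has the same shape as the journal line above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads log text on stdin and prints each distinct
# capability name found in an 'unknown capability "..."' error.
unknown_capability() {
    sed -n 's/.*unknown capability "\([A-Z_]*\)".*/\1/p' | sort -u
}
```

For example, `journalctl -u ceph-41cdf9dc-9876-11eb-a66f-5254000fb543@mon.ceph01 | unknown_capability` would scan the journal of the unit from this report.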