Bug #50441
closed
cephadm bootstrap on arm64 fails to start ceph/ceph-grafana service
100%
Description
Hello,
I installed a new Ceph 15.2.10 cluster on bare-metal Ubuntu 20.04 arm64, starting with a first monitor/manager node that I bootstrapped with the new "cephadm bootstrap" tool using the following command:
cephadm bootstrap --mon-ip 192.168.1.11
Unfortunately, the grafana service does not work at all. cephadm tries to restart the ceph/ceph-grafana container every 10 minutes and fails each time, apparently because there is no arm64 version of this container image, as the log below shows:
Traceback (most recent call last):
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1021, in _remote_connection
    yield (conn, connr)
  File "/usr/share/ceph/mgr/cephadm/module.py", line 1168, in _run_cephadm
    code, '\n'.join(err)))
orchestrator._interface.OrchestratorError: cephadm exited with an error code: 1, stderr:Deploy daemon grafana.ceph1a ...
Non-zero exit code 1 from /usr/bin/podman run --rm --ipc=host --net=host --entrypoint stat -e CONTAINER_IMAGE=docker.io/ceph/ceph-grafana:6.7.4 -e NODE_NAME=ceph1a docker.io/ceph/ceph-grafana:6.7.4 -c %u %g /var/lib/grafana
stat: stderr {"msg":"exec container process `/usr/bin/stat`: Exec format error","level":"error","time":"2021-04-09T06:17:54.000910863Z"}
Traceback (most recent call last):
  File "<stdin>", line 6153, in <module>
  File "<stdin>", line 1412, in _default_image
  File "<stdin>", line 3431, in command_deploy
  File "<stdin>", line 3362, in extract_uid_gid_monitoring
  File "<stdin>", line 2099, in extract_uid_gid
RuntimeError: uid/gid not found
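The "Exec format error" in the log is what the kernel reports when a binary was built for a different CPU architecture, which fits an image without an arm64 build being run on an arm64 host. The final "uid/gid not found" message hides that cause. As a rough illustration of how the real cause could be surfaced, here is a minimal Python sketch that inspects a stderr line like the one above. The function name is_arch_mismatch is hypothetical and is not part of cephadm; it only assumes the line format shown in this report.

```python
import json


def is_arch_mismatch(stderr_line: str) -> bool:
    """Return True if a container stderr line reports an 'Exec format error'.

    Hypothetical helper, not a cephadm function. It assumes the line looks
    like the one in this report: a prefix ending in 'stderr ' followed by a
    JSON object with a 'msg' field. Anything else is scanned as plain text.
    """
    # Everything after the first 'stderr ' marker is expected to be JSON.
    _, _, payload = stderr_line.partition("stderr ")
    try:
        msg = str(json.loads(payload).get("msg", ""))
    except (ValueError, AttributeError):
        # Not JSON (or not a JSON object): fall back to the raw line.
        msg = stderr_line
    return "Exec format error" in msg


if __name__ == "__main__":
    line = (
        'stat: stderr {"msg":"exec container process `/usr/bin/stat`: '
        'Exec format error","level":"error",'
        '"time":"2021-04-09T06:17:54.000910863Z"}'
    )
    if is_arch_mismatch(line):
        print("container image was probably built for another architecture")
```

A check of this kind would let the deploy step report an architecture problem directly instead of the less helpful uid/gid error.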
So I see two options here:
1) provide an arm64 docker image for the ceph/ceph-grafana container (preferred)
2) check for the arm64 architecture and skip deploying the grafana service there until option 1) is available
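Option 2) could look roughly like the following Python sketch. It is only an illustration under stated assumptions: the table ASSUMED_IMAGE_ARCHES and the functions normalize_arch and should_deploy are hypothetical names, not cephadm code, and the entry for ceph-grafana:6.7.4 encodes what this report suggests (no arm64 build) rather than a verified fact about the registry.

```python
import platform
from typing import Optional

# Hypothetical table of architectures each image is assumed to be built for.
# The grafana entry reflects the behaviour seen in this report, not a
# verified list from the registry.
ASSUMED_IMAGE_ARCHES = {
    "docker.io/ceph/ceph-grafana:6.7.4": {"x86_64"},
}

# Common alternative spellings of the same architecture.
ARCH_ALIASES = {"amd64": "x86_64", "arm64": "aarch64"}


def normalize_arch(machine: str) -> str:
    """Map names such as 'arm64' or 'AMD64' to one canonical spelling."""
    name = machine.lower()
    return ARCH_ALIASES.get(name, name)


def should_deploy(image: str, machine: Optional[str] = None) -> bool:
    """Return False when the image is known not to support this host.

    Images missing from the table are allowed, so the check never blocks
    a deployment it has no information about.
    """
    arch = normalize_arch(machine or platform.machine())
    supported = ASSUMED_IMAGE_ARCHES.get(image)
    if supported is None:
        return True
    return arch in supported


if __name__ == "__main__":
    image = "docker.io/ceph/ceph-grafana:6.7.4"
    if not should_deploy(image):
        print(f"skipping {image}: no build assumed for {platform.machine()}")
```

Allowing unknown images by default keeps the check conservative: it only skips a service when there is an explicit reason to believe the image cannot run on the host.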
I think it would be a real win for Ceph to work fully on the arm64 architecture, so it would be great if this could be taken care of. If you need more details or more log data, do not hesitate to contact me.
Thank you very much in advance.