Bug #54373
closed
cephadm shell is not creating a container with correct container image
Added by Vikhyat Umrao about 2 years ago.
Updated about 2 years ago.
Description
- The `cephadm shell` method still shows the pacific version, so it looks like it is not creating the container environment with the correct image:
[root@f03-h10-000-r640 ~]# cephadm shell --name osd.2
Inferring fsid 582b49e8-8fa4-11ec-a181-bc97e17cf100
Inferring config /var/lib/ceph/582b49e8-8fa4-11ec-a181-bc97e17cf100/osd.2/config
Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:03ccec263a507bbf07ec9b25482abf4eabdeffa591af28bf4dba2c15d8db1d5b
[ceph: root@f03-h10-000-r640 /]# osdmaptool --version
ceph version 16.2.7-65.el8cp (499bc1e23ab4671631da5affff6e1c772b8fe42d) pacific (stable)
[ceph: root@f03-h10-000-r640 /]# rpm -qa |grep ceph-osd
ceph-osd-16.2.7-65.el8cp.x86_64
[ceph: root@f03-h10-000-r640 /]#
- The `podman exec` method gives the correct results because it uses the existing OSD container:
[root@f03-h10-000-r640 ~]# podman ps | grep -w osd.2
1e203360f300 quay.ceph.io/ceph-ci/ceph@sha256:896874e2b54827a0040bd75f072f40be1114403da0a764eed2e606025e9e557c -n osd.2 -f --set... 45 hours ago Up 45 hours ago ceph-582b49e8-8fa4-11ec-a181-bc97e17cf100-osd-2
[root@f03-h10-000-r640 ~]# podman exec -ti 1e203360f300 /bin/bash
[root@f03-h10-000-r640 /]# osdmaptool --version
ceph version 17.0.0-10315-ga00e8b31 (a00e8b315af02865380634f8100dc7d18a18af4f) quincy (dev)
[root@f03-h10-000-r640 /]# rpm -qa | grep ceph-osd
ceph-osd-17.0.0-10315.ga00e8b31.el8.x86_64
[root@f03-h10-000-r640 /]#
- The cluster is fully upgraded to quincy, but `cephadm shell` still creates a container from the old version:
# ceph versions
{
"mon": {
"ceph version 17.1.0-31-g1ccf6db7 (1ccf6db7f29f65466de9f1a9c2001ac8d9c8d2ab) quincy (dev)": 3
},
"mgr": {
"ceph version 17.1.0-31-g1ccf6db7 (1ccf6db7f29f65466de9f1a9c2001ac8d9c8d2ab) quincy (dev)": 3
},
"osd": {
"ceph version 17.1.0-31-g1ccf6db7 (1ccf6db7f29f65466de9f1a9c2001ac8d9c8d2ab) quincy (dev)": 288
},
"mds": {},
"rgw": {
"ceph version 17.1.0-31-g1ccf6db7 (1ccf6db7f29f65466de9f1a9c2001ac8d9c8d2ab) quincy (dev)": 12
},
"overall": {
"ceph version 17.1.0-31-g1ccf6db7 (1ccf6db7f29f65466de9f1a9c2001ac8d9c8d2ab) quincy (dev)": 306
}
}
root@f13-h26-b04-5039ms:/var/log/ceph/1b2e99c2-a4ef-11ec-8665-ac1f6b2d5d2a
# cephadm shell version
Inferring fsid 1b2e99c2-a4ef-11ec-8665-ac1f6b2d5d2a
Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:2097a4e56a6013d4dc93fb97c270f022dd614d30783b487175f119a0adb4815e
WARN[0000] Failed to decode the keys ["machine"] from "/usr/share/containers/containers.conf".
WARN[0000] Failed to decode the keys ["machine"] from "/usr/share/containers/containers.conf".
WARN[0000] Failed to decode the keys ["machine"] from "/usr/share/containers/containers.conf".
ERROR (catatonit:7): failed to exec pid1: No such file or directory
root@f13-h26-b04-5039ms:/var/log/ceph/1b2e99c2-a4ef-11ec-8665-ac1f6b2d5d2a
# cephadm shell
Inferring fsid 1b2e99c2-a4ef-11ec-8665-ac1f6b2d5d2a
Using recent ceph image registry-proxy.engineering.redhat.com/rh-osbs/rhceph@sha256:2097a4e56a6013d4dc93fb97c270f022dd614d30783b487175f119a0adb4815e
WARN[0000] Failed to decode the keys ["machine"] from "/usr/share/containers/containers.conf".
WARN[0000] Failed to decode the keys ["machine"] from "/usr/share/containers/containers.conf".
WARN[0000] Failed to decode the keys ["machine"] from "/usr/share/containers/containers.conf".
[ceph: root@f13-h26-b04-5039ms /]# ceph --version
ceph version 16.2.7-89.el8cp (7f26d1457085fe5c0b08bb56960b598c19ac82f4) pacific (stable)
[ceph: root@f13-h26-b04-5039ms /]# radosgw-admin --version
ceph version 16.2.7-89.el8cp (7f26d1457085fe5c0b08bb56960b598c19ac82f4) pacific (stable)
[ceph: root@f13-h26-b04-5039ms /]#
root@f13-h26-b04-5039ms:/var/log/ceph/1b2e99c2-a4ef-11ec-8665-ac1f6b2d5d2a
# rpm -qa | grep cephadm
cephadm-17.1.0-0.el8.noarch
Can you please attach the output of the following command from your cluster:
podman images --filter label=ceph=True --filter dangling=false
I could finally reproduce the issue on my setup. The root cause of the mismatch you are observing is the way cephadm picks the ceph image: right now it just uses the "most recent" image. In your setup, this is the image corresponding to the pacific version. You can see this by running the following command:
podman images --format "table {{.ID}} {{.Repository}} {{.Tag}} {{.CreatedAt}} {{.Digest}}" --filter label=ceph=True
In the output you can see that the pacific image (from registry-proxy.engineering.redhat.com) is more recent than the quincy image (from quay.ceph.io/ceph-ci/ceph):
f72d8dfbbc38 registry-proxy.engineering.redhat.com/rh-osbs/rhceph <none> 2022-02-17 03:12:06 +0000 UTC sha256:03ccec263a507bbf07ec9b25482abf4eabdeffa591af28bf4dba2c15d8db1d5b
049b6bd4192c quay.ceph.io/ceph-ci/ceph <none> 2022-02-05 16:42:21 +0000 UTC sha256:896874e2b54827a0040bd75f072f40be1114403da0a764eed2e606025e9e557c
This happens because of the mix of registries used to install/upgrade Ceph.
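The selection heuristic described above can be sketched as follows. This is a minimal illustration, not cephadm's actual code; the image dicts and the `pick_most_recent` helper are hypothetical, with the repositories and creation timestamps copied from the podman output above:

```python
from datetime import datetime, timezone

def pick_most_recent(images):
    """Return the image with the newest CreatedAt timestamp,
    mirroring the 'most recent ceph image' heuristic."""
    return max(images, key=lambda img: img["created"])

# Timestamps taken from the podman output above.
images = [
    {"repo": "registry-proxy.engineering.redhat.com/rh-osbs/rhceph",  # pacific
     "created": datetime(2022, 2, 17, 3, 12, 6, tzinfo=timezone.utc)},
    {"repo": "quay.ceph.io/ceph-ci/ceph",                             # quincy
     "created": datetime(2022, 2, 5, 16, 42, 21, tzinfo=timezone.utc)},
]

picked = pick_most_recent(images)
# The pacific image wins because its build timestamp is newer,
# even though the running cluster daemons use the quincy image.
print(picked["repo"])
```

As a workaround until the fix lands, the image can presumably be pinned explicitly with cephadm's global `--image` option, e.g. `cephadm --image <quincy-image-digest> shell`, so the heuristic is bypassed.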
- Assignee set to Redouane Kachach Elhichou
- Status changed from New to Fix Under Review
- Pull request ID set to 45598
- Status changed from Fix Under Review to Resolved