Bug #54572
open
Crimson OSD containers are not configured in kubernetes cluster
Added by Srinivasa Bharath Kanta about 2 years ago.
Updated about 2 years ago.
Description
Configured a Ceph cluster on Kubernetes (k8s) with one controller node and two worker nodes, and noticed that the OSD containers are not configured on the cluster.
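For context, Rook creates the OSD pods from whatever the storage section of the CephCluster custom resource selects on the worker nodes. A minimal sketch of that spec is below; field names follow the standard Rook CRD, the image tag is a placeholder, and this is not the exact manifest used in this setup:

cat <<'EOF' | kubectl apply -f -
apiVersion: ceph.rook.io/v1
kind: CephCluster
metadata:
  name: rook-ceph
  namespace: rook-ceph
spec:
  cephVersion:
    image: quay.ceph.io/ceph-ci/ceph:<tag-under-test>   # classical or crimson build
    allowUnsupported: true
  dataDirHostPath: /var/lib/rook
  mon:
    count: 1
    allowMultiplePerNode: false
  storage:
    useAllNodes: true      # prepare OSDs on every worker node
    useAllDevices: true    # consume every empty, unformatted device
EOF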
Error snippet:
--------------
[root@bruuni010 ubuntu]# kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-hrszj 3/3 Running 0 37h
csi-cephfsplugin-pjvnb 3/3 Running 0 37h
csi-cephfsplugin-provisioner-5dc9cbcc87-lkq2w 6/6 Running 0 37h
csi-cephfsplugin-provisioner-5dc9cbcc87-m7lqn 6/6 Running 0 37h
csi-rbdplugin-nw5lg 3/3 Running 0 37h
csi-rbdplugin-provisioner-58f584754c-4n4n6 6/6 Running 0 37h
csi-rbdplugin-provisioner-58f584754c-vfshm 6/6 Running 0 37h
csi-rbdplugin-w22p7 3/3 Running 0 37h
rook-ceph-mon-a-77ff7d978d-mp98c 1/1 Running 0 37h
rook-ceph-operator-75dd789779-gcs29 1/1 Running 0 37h
rook-ceph-tools-d6d7c985c-ss2fm 1/1 Running 0 37h
[root@bruuni010 ubuntu]# kubectl -n rook-ceph exec -it rook-ceph-tools-d6d7c985c-ss2fm -- bash
[rook@rook-ceph-tools-d6d7c985c-ss2fm /]$ ceph -v
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
[rook@rook-ceph-tools-d6d7c985c-ss2fm /]$ ceph -s
  cluster:
    id:     6f23811a-41fc-48c7-a138-0ab6bea6d92b
    health: HEALTH_WARN
            mon is allowing insecure global_id reclaim

  services:
    mon: 1 daemons, quorum a (age 37h)
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
[rook@rook-ceph-tools-d6d7c985c-ss2fm /]$
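The HEALTH_WARN shown above is the standard pacific-era insecure global_id reclaim notice and is unrelated to the missing OSDs; once all clients are patched it can be cleared with the usual command (generic Ceph guidance, not taken from this report):

ceph config set mon auth_allow_insecure_global_id_reclaim false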
How to reproduce:
-----------------
Please refer to the attached "CEPH installation on Kubernetes.pdf" to reproduce the bug.
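The PDF itself is not reproduced here; as a rough guide, a standard Rook quickstart sequence matching the environment above (note the examples directory in the prompts below) would be along these lines, assuming the stock manifests from the rook repository rather than the exact steps in the PDF:

git clone --single-branch --branch release-1.8 https://github.com/rook/rook.git
cd rook/deploy/examples
kubectl create -f crds.yaml -f common.yaml -f operator.yaml   # Rook operator
# edit cluster.yaml first: cephVersion.image and the storage section
kubectl create -f cluster.yaml
kubectl create -f toolbox.yaml                                # rook-ceph-tools pod used below
kubectl -n rook-ceph get pod --watch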
[rook@rook-ceph-tools-d6d7c985c-ss2fm /]$ cat /etc/redhat-release
CentOS Stream release 8
[rook@rook-ceph-tools-d6d7c985c-ss2fm /]$ exit
[root@bruuni010 ubuntu]# cat /etc/redhat-release
Red Hat Enterprise Linux release 8.4 (Ootpa)
[root@bruuni010 ubuntu]#
I tested with the classical OSD build; the cluster is up and running:
[root@bruuni010 examples]# kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-ff4j2 3/3 Running 0 5m36s
csi-cephfsplugin-j62lf 3/3 Running 0 5m36s
csi-cephfsplugin-provisioner-5dc9cbcc87-4plws 6/6 Running 0 5m36s
csi-cephfsplugin-provisioner-5dc9cbcc87-lzqcq 6/6 Running 0 5m36s
csi-rbdplugin-mdtw6 3/3 Running 0 5m36s
csi-rbdplugin-provisioner-58f584754c-hn887 6/6 Running 0 5m36s
csi-rbdplugin-provisioner-58f584754c-vr82h 6/6 Running 0 5m36s
csi-rbdplugin-q4q74 3/3 Running 0 5m36s
rook-ceph-mgr-a-757559786d-mp426 1/1 Running 0 4m9s
rook-ceph-mon-a-759b6f77d4-b6qpk 1/1 Running 0 4m34s
rook-ceph-operator-75dd789779-qw7md 1/1 Running 0 26m
rook-ceph-osd-0-57879c8cb6-9rwh5 1/1 Running 0 3m19s
rook-ceph-osd-1-5d96fc9746-n4wwt 1/1 Running 0 3m19s
rook-ceph-osd-2-655d759b7-kzqk2 1/1 Running 0 3m19s
rook-ceph-osd-3-786ff578f6-thtxd 1/1 Running 0 3m19s
rook-ceph-osd-4-6bc464fb4f-2gpvb 1/1 Running 0 3m1s
rook-ceph-osd-5-85867cc877-k9z7d 1/1 Running 0 3m1s
rook-ceph-osd-6-5cbbc7ddf7-xw8rv 1/1 Running 0 3m
rook-ceph-osd-7-64cfbff55b-zq9mn 1/1 Running 0 3m
rook-ceph-osd-prepare-bruuni011-68v2q 0/1 Completed 0 3m48s
rook-ceph-osd-prepare-bruuni012-jcfwg 0/1 Completed 0 3m48s
rook-ceph-tools-d6d7c985c-lm5gl 1/1 Running 0 36s
[root@bruuni010 examples]# kubectl -n rook-ceph exec -it rook-ceph-tools-d6d7c985c-lm5gl -- bash
[rook@rook-ceph-tools-d6d7c985c-lm5gl /]$ ceph -v
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
[rook@rook-ceph-tools-d6d7c985c-lm5gl /]$ ceph -s
  cluster:
    id:     36b29beb-109a-48d8-91a0-e508a54dab21
    health: HEALTH_OK

  services:
    mon: 1 daemons, quorum a (age 5m)
    mgr: a(active, since 4m)
    osd: 8 osds: 8 up (since 3m), 8 in (since 3m)

  data:
    pools:   1 pools, 1 pgs
    objects: 2 objects, 577 KiB
    usage:   910 MiB used, 12 TiB / 12 TiB avail
    pgs:     1 active+clean
[rook@rook-ceph-tools-d6d7c985c-lm5gl /]$
Classical build used - image: quay.ceph.io/ceph-ci/ceph:81f31fdba1dcb9d1e600c0b06f4d038d846292be
Cluster creation failed with the crimson build:
[root@bruuni010 examples]# kubectl -n rook-ceph get pod
NAME READY STATUS RESTARTS AGE
csi-cephfsplugin-5htcz 3/3 Running 0 4m24s
csi-cephfsplugin-dkpmv 3/3 Running 0 4m24s
csi-cephfsplugin-provisioner-5dc9cbcc87-z6qqm 6/6 Running 0 4m24s
csi-cephfsplugin-provisioner-5dc9cbcc87-zv54s 6/6 Running 0 4m24s
csi-rbdplugin-7rs6x 3/3 Running 0 4m24s
csi-rbdplugin-provisioner-58f584754c-rht4h 6/6 Running 0 4m24s
csi-rbdplugin-provisioner-58f584754c-sjwlf 6/6 Running 0 4m24s
csi-rbdplugin-rct4m 3/3 Running 0 4m24s
rook-ceph-mgr-a-67c8877b57-s49m4 1/1 Running 0 3m10s
rook-ceph-mon-a-7b96bb44fd-vs94w 1/1 Running 0 4m15s
rook-ceph-operator-75dd789779-jzzfx 1/1 Running 0 6m49s
rook-ceph-osd-prepare-bruuni011-4tkz6 0/1 Completed 0 2m49s
rook-ceph-osd-prepare-bruuni012-ttbv9 0/1 Completed 0 2m49s
rook-ceph-tools-d6d7c985c-xq8pv 1/1 Running 0 11s
[root@bruuni010 examples]# kubectl -n rook-ceph exec -it rook-ceph-tools-d6d7c985c-xq8pv -- bash
[rook@rook-ceph-tools-d6d7c985c-xq8pv /]$ ceph -s
  cluster:
    id:     3fe3b3e2-2499-4c04-8432-2e3f70b54368
    health: HEALTH_WARN
            OSD count 0 < osd_pool_default_size 1

  services:
    mon: 1 daemons, quorum a (age 4m)
    mgr: a(active, since 2m)
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 B
    usage:   0 B used, 0 B / 0 B avail
    pgs:
[rook@rook-ceph-tools-d6d7c985c-xq8pv /]$ ceph -v
ceph version 16.2.7 (dd0603118f56ab514f133c8d2e3adfc983942503) pacific (stable)
[rook@rook-ceph-tools-d6d7c985c-xq8pv /]$
Crimson build used - image: quay.ceph.io/ceph-ci/ceph:7cad69fd931959456ff76a950c0b0afcf3f964d1-crimson
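For the failing run, the only intended difference is the container image in the CephCluster spec. Assuming the cluster was created from the stock cluster.yaml (so the CR is named rook-ceph), the image could be swapped roughly like this; a hedged sketch, with allowUnsupported needed for non-release builds:

kubectl -n rook-ceph patch cephcluster rook-ceph --type merge -p \
  '{"spec":{"cephVersion":{"image":"quay.ceph.io/ceph-ci/ceph:7cad69fd931959456ff76a950c0b0afcf3f964d1-crimson","allowUnsupported":true}}}'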
- Status changed from New to Need More Info
From the log:
2022-03-30 15:26:20.234554 I | cephosd: skipping device "sda1" because it contains a filesystem "ext4"
2022-03-30 15:26:20.234557 I | cephosd: skipping device "sdb" because it contains a filesystem "LVM2_member"
2022-03-30 15:26:20.234560 I | cephosd: skipping device "sdc" because it contains a filesystem "LVM2_member"
2022-03-30 15:26:20.234562 I | cephosd: skipping device "sdd" because it contains a filesystem "LVM2_member"
2022-03-30 15:26:20.234565 I | cephosd: skipping device "nvme1n1" because it contains a filesystem "LVM2_member"
2022-03-30 15:26:20.234569 I | cephosd: skipping device "nvme0n1" because it contains a filesystem "LVM2_member"
2022-03-30 15:26:20.238227 I | cephosd: configuring osd devices: {"Entries":{}}
This looks like a zapping issue. A log from this phase would be necessary to be sure.
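If stale LVM metadata from a previous deployment is indeed the cause, the usual Rook guidance is to wipe the candidate devices on each worker node before re-creating the cluster. A hedged sketch follows; it is destructive, and the device name is an example only:

# run on each worker node, for every disk that should become an OSD
DISK=/dev/sdb                      # example device, adjust per node
sgdisk --zap-all "$DISK"           # clear GPT/MBR structures
wipefs --all "$DISK"               # remove the LVM2_member signature seen in the log
dd if=/dev/zero of="$DISK" bs=1M count=100 oflag=direct,dsync   # clear remaining metadata
# remove any ceph-volume LVs left behind by an earlier run
ls /dev/mapper/ceph-* 2>/dev/null | xargs -r -I% dmsetup remove %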