Bug #50526

OSD massive creation: OSDs not created

Added by Juan Miguel Olmo Martínez about 3 years ago. Updated almost 3 years ago.

Status:
Resolved
Priority:
Urgent
Category:
-
Target version:
-
% Done:
100%

Source:
Tags:
Backport:
pacific
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

OSDs are not created when the drive group used to launch OSD creation affects a large number of OSDs (75 in my case).

Symptoms:
- The OSD service does not show the right number of OSDs:

# ceph orch ls
osd.defaultDG      0/3  -          -    f22-h21-000-6048r.rdu2.scalelab.redhat.com;f22-h25-000-6048r.rdu2.scalelab.redhat.com;f22-h29-000-6048r.rdu2.scalelab.redhat.com

- A large number of OSDs are created, but none come up:

# ceph -s
    osd: 73 osds: 0 up, 54 in (since 62m)

- On all the hosts where OSDs must be created, a file lock is held permanently by the "cephadm ceph-volume lvm list" process:

root@f22-h21-000-6048r:/var/log
# lslocks
COMMAND            PID  TYPE SIZE MODE  M START END PATH
...
python3         363847 FLOCK   0B WRITE 0     0   0 /run/cephadm/2c67b4d8-a439-11eb-b919-bc97e17cee60.lock
...
# ps -ef | grep 363847
root      363847  361214  0 15:13 ?        00:00:00 /usr/bin/python3 /var/lib/ceph/2c67b4d8-a439-11eb-b919-bc97e17cee60/cephadm.6268970e1745c66ce4f3d1de4aa246ccd1c5684345596e8d04a3ed72ad870349 --image registry.redhat.io/rhceph-beta/rhceph-5-rhel8@sha256:24c617082680ef85c43c6e2c4fe462c69805d2f38df83e51f968cec6b1c097a2 ceph-volume --fsid 2c67b4d8-a439-11eb-b919-bc97e17cee60 -- lvm list --format json
root      364262  353802  0 15:17 pts/2    00:00:00 grep --color=auto 363847
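The "permanent" lock shown by lslocks is an advisory flock on the per-FSID file under /run/cephadm/. The contention that keeps every subsequent cephadm call waiting can be reproduced with a minimal sketch (the lock path and the two file handles here are illustrative, not cephadm's actual code):

```python
# Minimal sketch of advisory flock contention, as seen on the
# /run/cephadm/<fsid>.lock file above. Two independent opens of the
# same file get distinct open file descriptions, so the second
# exclusive lock attempt conflicts with the first.
import fcntl
import os
import tempfile

lock_path = os.path.join(tempfile.gettempdir(), "demo-fsid.lock")

# First holder (here standing in for the stuck "ceph-volume lvm list"
# call) takes an exclusive lock and never releases it.
holder = open(lock_path, "w")
fcntl.flock(holder, fcntl.LOCK_EX)

# A second caller trying the same lock non-blockingly fails with
# EWOULDBLOCK while the holder keeps its descriptor open.
contender = open(lock_path, "w")
try:
    fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
    acquired = True
except BlockingIOError:
    acquired = False

print(acquired)  # → False: the lock is still held

# Only once the holder releases (or its process exits) can the
# contender proceed.
fcntl.flock(holder, fcntl.LOCK_UN)
fcntl.flock(contender, fcntl.LOCK_EX | fcntl.LOCK_NB)
```

This is why killing or waiting out the stuck ceph-volume process unblocks the remaining OSD creation steps on the host.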

- On all the hosts where OSDs must be created, the OSD systemd services are not created:

# systemctl list-units ceph*@osd*
0 loaded units listed. Pass --all to see loaded but inactive units, too.
To show all installed unit files use 'systemctl list-unit-files'.

- On all the hosts where OSDs must be created, the LVM infrastructure needed for the OSDs does seem to have been created:

# lsblk
NAME                                                                                                  MAJ:MIN RM   SIZE RO TYPE  MOUNTPOINT
...
sdc                                                                                                     8:32   0   1.8T  0 disk  
`-ceph--229b758f--ccf1--4014--84e3--a526b5d5cefc-osd--block--b80cbfa5--610b--47c8--b805--83471b2a0c64 253:94   0   1.8T  0 lvm   
sdd                                                                                                     8:48   0   1.8T  0 disk  
`-ceph--557721b5--61d2--4dcd--8452--30e68e442a0a-osd--block--0d703fbf--81fd--44cc--a789--42a6cf8606a8 253:96   0   1.8T  0 lvm   
sde                                                                                                     8:64   0   1.8T  0 disk  
`-ceph--b28c6069--5ca8--45d8--b854--b09106e2fd75-osd--block--53b82fbe--761e--4e6a--9b7d--258a0a4196e0 253:98   0   1.8T  0 lvm   
sdf                                                                                                     8:80   0   1.8T  0 disk  
`-ceph--a16adf39--5f21--4531--9837--1394cad47a80-osd--block--1f31b9fd--5165--43a6--8aee--c519da4b63b3 253:100  0   1.8T  0 lvm   
sdg                                                                                                     8:96   0   1.8T  0 disk  
...  

nvme0n1                                                                                               259:0    0 745.2G  0 disk  
|-ceph--4f9cbc13--1649--4f1a--94d7--772de7da6646-osd--db--1e85d960--2fdd--42d3--9f56--c62b7a8b085a    253:95   0  62.1G  0 lvm   
....

Related issues (0 open, 2 closed)

Related to Orchestrator - Feature #48292: cephadm: allow more than 60 OSDs per host (Resolved, Sebastian Wagner)

Related to Orchestrator - Bug #47873: /usr/lib/sysctl.d/90-ceph-osd.conf getting installed in container, rendering it ineffective (Resolved, Michael Fritch)
