Bug #57907

ceph-volume complains about "Insufficient space (<5GB)" on 1.75TB device

Added by Björn Lässig 3 months ago. Updated 3 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On a one-week-old, working 17.2.5 cluster, I tried to add another host with 2 SSDs and 4 HDDs.
None of them is shown as available.
ceph-volume rejects these devices:

root@marvin09:/# ceph-volume inventory /dev/sda

====== Device report /dev/sda ======

     path                      /dev/sda
     ceph device               False
     lsm data                  {}
     available                 False
     rejected reasons          Insufficient space (<5GB)
     device id                 INTEL_SSDSC2KB019T8_PHYF908201QP1P9DGN
root@marvin09:/# lsblk /dev/sda 
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda    8:0    1  1.8T  0 disk 

After some digging into the code, I found that all my devices
are reported as removable because "Hot Plug" was enabled at the BIOS level.
It took me some days to figure this out.
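
The RM column in the lsblk output above corresponds to the kernel's per-device "removable" flag, which Linux exposes as /sys/block/<name>/removable. A minimal sketch for checking that flag directly, assuming a standard sysfs layout; the helper name is_removable and the fake sysfs tree used in the demo are illustrative and not part of ceph-volume:

```python
import tempfile
from pathlib import Path


def is_removable(device_name, sysfs_root="/sys/block"):
    """Return True/False for the kernel's removable flag, or None if unknown.

    Reads <sysfs_root>/<device_name>/removable, which holds "1" for
    removable devices and "0" otherwise.
    """
    flag_file = Path(sysfs_root) / device_name / "removable"
    try:
        return flag_file.read_text().strip() == "1"
    except OSError:
        # Device does not exist or the flag file is not readable.
        return None


# Demo against a fake sysfs tree so the sketch runs on any machine.
with tempfile.TemporaryDirectory() as fake_sysfs:
    (Path(fake_sysfs) / "sda").mkdir()
    (Path(fake_sysfs) / "sda" / "removable").write_text("1\n")
    hotplug_disk = is_removable("sda", fake_sysfs)
    missing_disk = is_removable("sdz", fake_sysfs)
```

On a real host, is_removable("sda") with the default sysfs_root reads the live flag and needs no special privileges.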

disable_hotplug.png (136 KB) Björn Lässig, 10/20/2022 04:08 PM

c_v_i_1.txt - with hot plug disabled: ceph-volume inventory --format json (10 KB) Björn Lässig, 10/28/2022 01:05 PM

History

#1 Updated by Björn Lässig 3 months ago

The problem is in util/device.py, line 582.

The call int(self.sys_api.get('size', 0)) always returns 0 if sys_api is an empty dictionary.
But why is self.sys_api empty? It is populated in Device.__init__:

self.sys_api = sys_info.devices.get(self.path, {})

self.path is /dev/sda (in my example)

sys_info is imported:

from ceph_volume import sys_info

__init__.py says:

sys_info = namedtuple('sys_info', ['devices'])
sys_info.devices = dict()

so the data comes from:
sys_info.devices = disk.get_devices()

and disk.get_devices() skips devices for many reasons without reporting any of them, which is wrong.
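
The chain described above can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the actual ceph-volume code: scan_devices is a hypothetical stand-in for disk.get_devices(), the removable filter is inferred from the behaviour reported in this ticket, the 5 GB threshold is taken from the rejection message, and the device sizes are made-up example values.

```python
FIVE_GB = 5 * 1024 ** 3  # assumed threshold, taken from the "<5GB" message


def scan_devices(all_devices):
    """Hypothetical stand-in for disk.get_devices().

    Silently drops removable devices (assumption inferred from this report).
    """
    return {
        path: info
        for path, info in all_devices.items()
        if not info.get("removable")
    }


def rejected_reasons(path, known_devices):
    """Mimic the size check: an unknown device falls back to {} and size 0."""
    sys_api = known_devices.get(path, {})
    size = int(sys_api.get("size", 0))
    reasons = []
    if size < FIVE_GB:
        reasons.append("Insufficient space (<5GB)")
    return reasons


# Illustrative values only: a large SSD flagged removable because of BIOS
# hot plug, and an NVMe device that is not flagged.
all_devices = {
    "/dev/sda": {"size": 1_920_000_000_000, "removable": 1},
    "/dev/nvme0n1": {"size": 1_000_000_000_000, "removable": 0},
}
known = scan_devices(all_devices)

sda_reasons = rejected_reasons("/dev/sda", known)
nvme_reasons = rejected_reasons("/dev/nvme0n1", known)
```

In this sketch /dev/sda never reaches the dictionary, so its size is read as 0 and it is rejected with the misleading space message instead of the real reason.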

#2 Updated by Björn Lässig 3 months ago

I added a screenshot of the workaround: disabling Hot Plug in the BIOS.

#3 Updated by Guillaume Abrioux 3 months ago

  • Severity changed from 2 - major to 3 - minor

#4 Updated by Björn Lässig 3 months ago

Actually, this is a major bug for me, as I have to reboot the whole host to replace one OSD.

#5 Updated by Guillaume Abrioux 3 months ago

Can you share the output of `ceph-volume inventory --format json`?

#6 Updated by Björn Lässig 3 months ago

Guillaume Abrioux wrote:

Can you share the output of `ceph-volume inventory --format json`?

With hotplug disabled:

[root@marvin09:~] 4s # cephadm shell -- ceph-volume inventory --format json  | jq .   2>&1 > c_v_i_1.txt
Inferring fsid 3ea35780-4ba7-11ed-97fd-f367ef0b52b0
Inferring config /var/lib/ceph/3ea35780-4ba7-11ed-97fd-f367ef0b52b0/config/ceph.conf
Using ceph image with id 'cc65afd6173a' and tag '<none>' created on 2022-10-17 23:41:41 +0000 UTC
quay.io/ceph/ceph@sha256:0560b16bec6e84345f29fb6693cd2430884e6efff16a95d5bdd0bb06d7661c45

With hotplug enabled, no disks other than /dev/nvme* are listed.
In the meantime, I have added the host to the cluster with Hot Plug disabled.

#7 Updated by Guillaume Abrioux 3 months ago

I would need it with hotplug enabled.

Anyway, I tried to reproduce it:

[root@57907-1 /]# lsblk /dev/vdk
NAME MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vdk  253:160  0  1.8T  0 disk
[root@57907-1 /]# ceph-volume inventory /dev/vdk

====== Device report /dev/vdk ======

     path                      /dev/vdk
     ceph device               False
     lsm data                  {}
     available                 False
     rejected reasons          removable
     device id
     removable                 1
     ro                        0
     vendor                    0x1af4
     model
     sas address
     rotational                1
     scheduler mode            none
     human readable size       1.76 TB
[root@57907-1 /]# ceph --version
ceph version 17.2.5 (98318ae89f1a893a6ded3a640405cdbb33e08757) quincy (stable)
[root@57907-1 /]#

In my environment, it doesn't seem to reproduce. Either I'm missing something or there is some difference between our environments.
Still investigating.
