Bug #48106

ceph-volume lvm batch doesn't work anymore with --auto and full SSD/NVMe devices

Added by Dimitri Savineau 6 months ago. Updated 6 months ago.

Status:
Resolved
Priority:
Normal
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
octopus,nautilus
Regression:
Yes
Severity:
2 - major
Reviewed:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Since the lvm batch refactor [1], it is no longer possible to run:

# ceph-volume --cluster ceph lvm batch --bluestore --yes /dev/nvme0n1 /dev/nvme1n1 /dev/nvme2n1 /dev/nvme3n1 /dev/nvme4n1 /dev/nvme5n1 /dev/nvme6n1 /dev/nvme7n1 /dev/nvme8n1 /dev/nvme9n1 --report --format=json
--> DEPRECATION NOTICE
--> You are using the legacy automatic disk sorting behavior
--> The Pacific release will change the default to --no-auto
--> passed data devices: 0 physical, 0 LVM
--> relative data size: 1.0
--> All data devices are unavailable
[]

In 14.2.12 this creates 10 OSDs (one on each NVMe device); in 14.2.13 it does nothing.

The only way to work around the issue is to use the --no-auto parameter, but the --auto parameter (which is the default) should be backward compatible and not break existing workflows.
According to the documentation [2], the old workflow should still be possible with --auto.

For example, assuming bluestore is used and --no-auto is not passed, the deprecated behavior would deploy the following, depending on the devices passed:

Devices are all spinning HDDs: 1 OSD is created per device

Devices are all SSDs: 2 OSDs are created per device

Devices are a mix of HDDs and SSDs: data is placed on the spinning device, the block.db is created on the SSD, as large as possible.

1/ All-HDD devices: currently works as expected
2/ All-SSD/NVMe devices: broken
3/ Mixed HDD and SSD/NVMe devices: currently works as expected
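The legacy sorting rules described above can be sketched roughly as follows (a minimal illustration, not ceph-volume's actual code; the `plan_batch` helper and its return format are invented for this example):

```python
def plan_batch(devices):
    """Sketch of the legacy --auto sorting rules for bluestore.

    `devices` is a list of (name, rotational) tuples, where
    rotational=True means a spinning HDD. Returns a mapping of
    data device -> number of OSDs, plus the list of db devices.
    """
    hdds = [name for name, rotational in devices if rotational]
    ssds = [name for name, rotational in devices if not rotational]
    if not ssds:
        # All spinning HDDs: one OSD is created per device
        return {d: 1 for d in hdds}, []
    if not hdds:
        # All SSDs/NVMe: two OSDs are created per device
        return {d: 2 for d in ssds}, []
    # Mixed: data on the spinning devices, block.db on the SSDs
    return {d: 1 for d in hdds}, ssds
```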

Scenario 2/ is broken because ceph-volume splits the input into rotational and non-rotational devices, and then assigns the non-rotational devices to the bluestore db (or filestore journal) role. This should only happen in scenario 3/, not in 2/.

As a result, all NVMe devices are reassigned as bluestore db devices and no data devices remain as input.
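The faulty filtering can be sketched like this (a simplified illustration of the reported cause and the expected guard, not the actual ceph-volume source; both function names are invented):

```python
def split_devices_buggy(devices):
    """devices: list of (name, rotational) tuples."""
    data = [name for name, rotational in devices if rotational]
    db = [name for name, rotational in devices if not rotational]
    # Bug: with an all-NVMe input, `data` ends up empty and every
    # device is handed over as a bluestore db device.
    return data, db

def split_devices_fixed(devices):
    data = [name for name, rotational in devices if rotational]
    db = [name for name, rotational in devices if not rotational]
    # Expected behavior: only treat non-rotational devices as db
    # devices when the input actually mixes HDDs and SSDs
    # (scenario 3/); otherwise every device stays a data device
    # (scenarios 1/ and 2/).
    if not data:
        return db, []
    return data, db
```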

[1] https://github.com/ceph/ceph/pull/37522
[2] https://docs.ceph.com/en/latest/ceph-volume/lvm/batch/#automatic-sorting-of-disks


Related issues

Copied to ceph-volume - Backport #48184: octopus: ceph-volume lvm batch doesn't work anymore with --auto and full SSD/NVMe devices Resolved
Copied to ceph-volume - Backport #48185: nautilus: ceph-volume lvm batch doesn't work anymore with --auto and full SSD/NVMe devices Resolved

History

#1 Updated by Jan Fajerski 6 months ago

  • Status changed from New to Fix Under Review
  • Backport set to octopus,nautilus
  • Pull request ID set to 37942

#2 Updated by Jan Fajerski 6 months ago

  • Status changed from Fix Under Review to Pending Backport

#3 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #48184: octopus: ceph-volume lvm batch doesn't work anymore with --auto and full SSD/NVMe devices added

#4 Updated by Nathan Cutler 6 months ago

  • Copied to Backport #48185: nautilus: ceph-volume lvm batch doesn't work anymore with --auto and full SSD/NVMe devices added

#5 Updated by Nathan Cutler 6 months ago

  • Assignee set to Dimitri Savineau

#6 Updated by Nathan Cutler 6 months ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
