Bug #23001 (open)

ceph-volume should destroy vgs and lvs on OSD creation failure

Added by David Galloway about 6 years ago. Updated over 4 years ago.

Status: New
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

root@reesi001:~# ceph-volume lvm prepare --bluestore --data /dev/sda --journal /dev/journals/lvol0
Running command: ceph-authtool --gen-print-key
Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 43d2cd4c-af19-4e0f-bb2e-816cb4c5bcf4
Running command: vgcreate --force --yes ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a /dev/sda
 stdout: Volume group "ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a" successfully created
Running command: lvcreate --yes -l 100%FREE -n osd-block-43d2cd4c-af19-4e0f-bb2e-816cb4c5bcf4 ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a
 stdout: Logical volume "osd-block-43d2cd4c-af19-4e0f-bb2e-816cb4c5bcf4" created.
Running command: ceph-authtool --gen-print-key
--> Was unable to complete a new OSD, will rollback changes
--> OSD will be fully purged from the cluster, because the ID was generated
Running command: ceph osd purge osd.95 --yes-i-really-mean-it
 stderr: purged osd.95
-->  RuntimeError: "ceph" user is not available in the current system

root@reesi001:~# lvdisplay /dev/ceph*
  --- Logical volume ---
  LV Path                /dev/ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a/osd-block-43d2cd4c-af19-4e0f-bb2e-816cb4c5bcf4
  LV Name                osd-block-43d2cd4c-af19-4e0f-bb2e-816cb4c5bcf4
  VG Name                ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a
  LV UUID                PWZkYb-odTe-mhOt-fCp9-1qvc-fL1n-dsfn5Y
  LV Write Access        read/write
  LV Creation host, time reesi001, 2018-02-14 11:21:36 -0500
  LV Status              available
  # open                 0
  LV Size                3.64 TiB
  Current LE             953861
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     256
  Block device           252:12

If creating a new OSD fails and there's no data on the drive yet, wouldn't it make sense to remove the logical volume and volume group that were created, so the device(s) can be reused?
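
For illustration, here is a minimal sketch of what such a rollback could look like, assuming the failed run created both the VG and the LV itself and nothing else uses them. The function name and the hard-coded values from the output above are purely illustrative; this is not ceph-volume's actual rollback code, just plain calls to the standard LVM tools.

import subprocess

def rollback_lvm(vg_name, lv_name, device):
    """Best-effort cleanup of LVM artifacts left behind by a failed prepare."""
    # Remove the logical volume that was created for the OSD block device.
    subprocess.check_call(['lvremove', '--force', '%s/%s' % (vg_name, lv_name)])
    # Remove the volume group; assumed safe here because this run created it.
    subprocess.check_call(['vgremove', '--force', vg_name])
    # Wipe the PV label so the device no longer shows up as LVM2_member.
    subprocess.check_call(['pvremove', device])

# Values taken from the failed run above, used only as an example.
rollback_lvm(
    'ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a',
    'osd-block-43d2cd4c-af19-4e0f-bb2e-816cb4c5bcf4',
    '/dev/sda',
)

The order matters: the LV has to go before its VG, and the PV label last, so the device ends up clean for the next vgcreate.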

Actions #1

Updated by David Galloway about 6 years ago

I get the following when attempting to reuse a device that already has an LV and VG:

[2018-02-14 11:19:39,536][ceph_volume.process][INFO  ] Running command: ceph-authtool --gen-print-key
[2018-02-14 11:19:39,564][ceph_volume.process][INFO  ] stdout AQCbYYRaMyt+IRAAqHLEjNhqfUwL2nqZUOOcPA==
[2018-02-14 11:19:39,564][ceph_volume.process][INFO  ] Running command: ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new 123a232f-c500-424b-8886-0b5152440fe7
[2018-02-14 11:19:39,850][ceph_volume.process][INFO  ] stdout 95
[2018-02-14 11:19:39,851][ceph_volume.process][INFO  ] Running command: lsblk --nodeps -P -o NAME,KNAME,MAJ:MIN,FSTYPE,MOUNTPOINT,LABEL,UUID,RO,RM,MODEL,SIZE,STATE,OWNER,GROUP,MODE,ALIGNMENT,PHY-SEC,LOG-SEC,ROTA,SCHED,TYPE,DISC-ALN,DISC-GRAN,DISC-MAX,DISC-ZERO,PKNAME,PARTLABEL /dev/sda
[2018-02-14 11:19:39,856][ceph_volume.process][INFO  ] stdout NAME="sda" KNAME="sda" MAJ:MIN="8:0" FSTYPE="LVM2_member" MOUNTPOINT="" LABEL="" UUID="2veNK8-aBMq-g6HU-B184-ShOc-BDsj-AVHCzV" RO="0" RM="0" MODEL="ST4000NM0025    " SIZE="3.7T" STATE="running" OWNER="root" GROUP="disk" MODE="brw-rw----" ALIGNMENT="0" PHY-SEC="512" LOG-SEC="512" ROTA="1" SCHED="deadline" TYPE="disk" DISC-ALN="0" DISC-GRAN="0B" DISC-MAX="0B" DISC-ZERO="0" PKNAME="" PARTLABEL="" 
[2018-02-14 11:19:39,856][ceph_volume.process][INFO  ] Running command: lsblk --nodeps -P -o NAME,KNAME,MAJ:MIN,FSTYPE,MOUNTPOINT,LABEL,UUID,RO,RM,MODEL,SIZE,STATE,OWNER,GROUP,MODE,ALIGNMENT,PHY-SEC,LOG-SEC,ROTA,SCHED,TYPE,DISC-ALN,DISC-GRAN,DISC-MAX,DISC-ZERO,PKNAME,PARTLABEL /dev/sda
[2018-02-14 11:19:39,862][ceph_volume.process][INFO  ] stdout NAME="sda" KNAME="sda" MAJ:MIN="8:0" FSTYPE="LVM2_member" MOUNTPOINT="" LABEL="" UUID="2veNK8-aBMq-g6HU-B184-ShOc-BDsj-AVHCzV" RO="0" RM="0" MODEL="ST4000NM0025    " SIZE="3.7T" STATE="running" OWNER="root" GROUP="disk" MODE="brw-rw----" ALIGNMENT="0" PHY-SEC="512" LOG-SEC="512" ROTA="1" SCHED="deadline" TYPE="disk" DISC-ALN="0" DISC-GRAN="0B" DISC-MAX="0B" DISC-ZERO="0" PKNAME="" PARTLABEL="" 
[2018-02-14 11:19:39,863][ceph_volume.process][INFO  ] Running command: vgs --noheadings --separator=";" -o vg_name,pv_count,lv_count,snap_count,vg_attr,vg_size,vg_free
[2018-02-14 11:19:39,878][ceph_volume.process][INFO  ] stdout ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a";"1";"1";"0";"wz--n-";"3.64t";"0
[2018-02-14 11:19:39,879][ceph_volume.process][INFO  ] stdout journals";"1";"12";"0";"wz--n-";"372.60g";"616.00m
[2018-02-14 11:19:39,879][ceph_volume.process][INFO  ] stdout osd";"1";"0";"0";"wz--n-";"365.15g";"365.15g
[2018-02-14 11:19:39,879][ceph_volume.process][INFO  ] Running command: vgcreate --force --yes ceph-6848c821-6673-41e3-a91e-dc5d61434728 /dev/sda
[2018-02-14 11:19:39,894][ceph_volume.process][INFO  ] stderr Physical volume '/dev/sda' is already in volume group 'ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a'
  Unable to add physical volume '/dev/sda' to volume group 'ceph-6848c821-6673-41e3-a91e-dc5d61434728'.
[2018-02-14 11:19:39,894][ceph_volume.devices.lvm.prepare][ERROR ] lvm prepare was unable to complete
[2018-02-14 11:19:39,895][ceph_volume.devices.lvm.prepare][INFO  ] will rollback OSD ID creation
[2018-02-14 11:19:39,895][ceph_volume.process][INFO  ] Running command: ceph osd purge osd.95 --yes-i-really-mean-it
[2018-02-14 11:19:40,320][ceph_volume.process][INFO  ] stderr purged osd.95
[2018-02-14 11:19:40,335][ceph_volume][ERROR ] exception caught by decorator
Traceback (most recent call last):
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/main.py", line 152, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/main.py", line 38, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/terminal.py", line 182, in dispatch
    instance.main()
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/prepare.py", line 365, in main
    self.safe_prepare(args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/prepare.py", line 216, in safe_prepare
    self.prepare(args)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/prepare.py", line 282, in prepare
    block_lv = self.prepare_device(args.data, 'block', cluster_fsid, osd_fsid)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/devices/lvm/prepare.py", line 196, in prepare_device
    api.create_vg(vg_name, arg)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/api/lvm.py", line 209, in create_vg
    name] + list(devices)
  File "/usr/lib/python2.7/dist-packages/ceph_volume/process.py", line 138, in run
    raise RuntimeError(msg)
RuntimeError: command returned non-zero exit status: 5
Actions #2

Updated by Alfredo Deza almost 6 years ago

  • Status changed from New to 12

We can't destroy a vg/lv automatically because it is entirely possible to have other OSDs living in lvs that come from the same vg.
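
Building on the sketch in the description, one hedged way to honour that constraint would be to remove only what the failed run itself created, and to drop the VG only once nothing else lives in it. The helpers below are illustrative, not part of ceph-volume:

import subprocess

def vg_is_empty(vg_name):
    """Return True when the volume group no longer contains any logical volumes."""
    out = subprocess.check_output(['lvs', '--noheadings', '-o', 'lv_name', vg_name])
    return out.strip() == b''

def conditional_rollback(vg_name, lv_name, created_vg_this_run):
    # Always remove the LV this run created for the OSD.
    subprocess.check_call(['lvremove', '--force', '%s/%s' % (vg_name, lv_name)])
    # Only remove the VG when this run created it and no other OSD LVs remain in it.
    if created_vg_this_run and vg_is_empty(vg_name):
        subprocess.check_call(['vgremove', '--force', vg_name])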

However, what ceph-volume should have done better in this situation is report back when LVM found this:

[2018-02-14 11:19:39,894][ceph_volume.process][INFO  ] stderr Physical volume '/dev/sda' is already in volume group 'ceph-28f7427e-5558-4ffd-ae1a-51ec3042759a'

A pre-check that catches this would be good.
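
A hedged sketch of such a pre-check, querying `pvs` directly rather than ceph-volume's own device inventory (the function name is made up for illustration):

import subprocess

def ensure_device_not_in_vg(device):
    """Fail early, before vgcreate, if the device already belongs to a volume group."""
    try:
        out = subprocess.check_output(
            ['pvs', '--noheadings', '-o', 'vg_name', device],
            stderr=subprocess.STDOUT)
    except subprocess.CalledProcessError:
        # pvs exits non-zero when the device is not an LVM PV at all,
        # which is exactly the state we want before creating a new VG.
        return
    vg_name = out.strip().decode()
    if vg_name:
        raise RuntimeError(
            '%s already belongs to volume group %s; remove or zap it before reuse'
            % (device, vg_name))

With that in place, ensure_device_not_in_vg('/dev/sda') would fail with the owning VG's name from the log above instead of letting vgcreate die with exit status 5.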

Actions #3

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New