Bug #46627
closed
ceph-volume tries to zap unrelated devices when passing --osd-id or --osd-fsid
Description
When deploying a set of collocated and non-collocated OSDs, ceph-volume fails when trying to zap the collocated one because it also tries to zap additional devices unrelated to this OSD.
- ceph-volume lvm list
====== osd.3 =======
[journal] /dev/journals/journal1
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type journal
vdo 0
devices /dev/sdc2
[data] /dev/test_group/data-lv2
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type data
vdo 0
devices /dev/sdb
====== osd.9 =======
[block] /dev/test_group/data-lv1
block device /dev/test_group/data-lv1
block uuid oo3hQQ-uiy2-3zME-TueI-skrf-5L6G-VVp5Ry
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
encrypted 0
osd fsid 58533adc-c8bb-412e-9ab7-229e8061a1fe
osd id 9
osdspec affinity
type block
vdo 0
devices /dev/sdb
- CEPH_VOLUME_DEBUG=1 ceph-volume --cluster ceph lvm zap --destroy --osd-fsid 58533adc-c8bb-412e-9ab7-229e8061a1fe
--> Zapping: /dev/journals/journal1
--> Unmounting /var/lib/ceph/osd/ceph-3
Running command: /bin/umount -v /var/lib/ceph/osd/ceph-3
stderr: umount: /var/lib/ceph/osd/ceph-3: target is busy.
Traceback (most recent call last):
File "/sbin/ceph-volume", line 11, in <module>
load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 39, in init
self.main(self.argv)
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
return f(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 150, in main
terminal.dispatch(self.mapper, subcommand_args)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 42, in main
terminal.dispatch(self.mapper, self.argv)
File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
instance.main()
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 393, in main
self.zap_osd()
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 299, in zap_osd
self.zap(devices)
File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
return func(*a, **kw)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 274, in zap
self.zap_lv(device)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 170, in zap_lv
self.unmount_lv(lv)
File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 159, in unmount_lv
system.unmount(lv_path)
File "/usr/lib/python3.6/site-packages/ceph_volume/util/system.py", line 198, in unmount
path,
File "/usr/lib/python3.6/site-packages/ceph_volume/process.py", line 153, in run
raise RuntimeError(msg)
RuntimeError: command returned non-zero exit status: 32
- ceph-volume lvm list
====== osd.3 =======
[journal] /dev/journals/journal1
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type journal
vdo 0
devices /dev/sdc2
[data] /dev/test_group/data-lv2
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type data
vdo 0
devices /dev/sdb
====== osd.9 =======
[block] /dev/test_group/data-lv1
block device /dev/test_group/data-lv1
block uuid oo3hQQ-uiy2-3zME-TueI-skrf-5L6G-VVp5Ry
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
encrypted 0
osd fsid 58533adc-c8bb-412e-9ab7-229e8061a1fe
osd id 9
osdspec affinity
type block
vdo 0
devices /dev/sdb
The output above shows that although I tried to zap the OSD with fsid '58533adc-c8bb-412e-9ab7-229e8061a1fe' (osd.9) by passing '--osd-fsid' to the `ceph-volume lvm zap` command, ceph-volume attempted to zap /dev/journals/journal1, which belongs to osd.3 (--> Zapping: /dev/journals/journal1).
This is a regression introduced by commit 2f5c10c12c37e6865ce54bb4940d3779353cba4f.
I think the `api.get_lvs()` call in `ensure_associated_lvs()` should be given the osd_id or osd_fsid so that devices unrelated to the requested OSD are excluded.
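To illustrate the kind of filtering suggested here, this is a minimal, self-contained sketch. It does not use the real ceph-volume API: the `LVS` records and the `find_associated_lvs()` helper are hypothetical stand-ins, populated from the `ceph-volume lvm list` output in this report.

```python
# Hypothetical, simplified records mirroring the `ceph-volume lvm list` output
# above. The real ceph-volume code reads these values from LVM tags instead.
LVS = [
    {"path": "/dev/journals/journal1", "type": "journal", "osd_id": "3",
     "osd_fsid": "e58f9555-c1ba-44a5-9407-0587d0f58598"},
    {"path": "/dev/test_group/data-lv2", "type": "data", "osd_id": "3",
     "osd_fsid": "e58f9555-c1ba-44a5-9407-0587d0f58598"},
    {"path": "/dev/test_group/data-lv1", "type": "block", "osd_id": "9",
     "osd_fsid": "58533adc-c8bb-412e-9ab7-229e8061a1fe"},
]


def find_associated_lvs(lvs, osd_id=None, osd_fsid=None):
    """Return the paths of the LVs that belong to the requested OSD only.

    Hypothetical helper: every criterion that is given must match, so an LV
    owned by another OSD is never selected for zapping.
    """
    if osd_id is None and osd_fsid is None:
        raise ValueError("either osd_id or osd_fsid is required")

    matches = []
    for lv in lvs:
        if osd_id is not None and lv["osd_id"] != str(osd_id):
            continue  # belongs to a different OSD id
        if osd_fsid is not None and lv["osd_fsid"] != osd_fsid:
            continue  # belongs to a different OSD fsid
        matches.append(lv["path"])
    return matches


if __name__ == "__main__":
    # Selecting osd.9 by fsid must not pick up the journal or data LV of osd.3.
    print(find_associated_lvs(LVS, osd_fsid="58533adc-c8bb-412e-9ab7-229e8061a1fe"))
```

With a filter like this, the zap request from this report would select only /dev/test_group/data-lv1 and would leave /dev/journals/journal1 untouched.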
Updated by Guillaume Abrioux over 3 years ago
- Status changed from New to Pending Backport
Updated by Guillaume Abrioux over 3 years ago
octopus backport : https://github.com/ceph/ceph/pull/35879
nautilus backport : https://github.com/ceph/ceph/pull/35878
mimic backport : https://github.com/ceph/ceph/pull/35900
Updated by Jan Fajerski over 3 years ago
- Status changed from Pending Backport to Resolved