Bug #46627

closed

ceph-volume tries to zap unrelated devices when passing --osd-id or --osd-fsid

Added by Guillaume Abrioux almost 4 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: Yes
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

When deploying a set of collocated and non-collocated OSDs, ceph-volume fails when trying to zap the collocated one because it also tries to zap additional devices unrelated to this OSD.

# ceph-volume lvm list

====== osd.3 =======

[journal]     /dev/journals/journal1
cephx lockbox secret      
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type journal
vdo 0
devices /dev/sdc2
[data]        /dev/test_group/data-lv2
cephx lockbox secret      
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type data
vdo 0
devices /dev/sdb

====== osd.9 =======

[block]       /dev/test_group/data-lv1
block device              /dev/test_group/data-lv1
block uuid oo3hQQ-uiy2-3zME-TueI-skrf-5L6G-VVp5Ry
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
encrypted 0
osd fsid 58533adc-c8bb-412e-9ab7-229e8061a1fe
osd id 9
osdspec affinity
type block
vdo 0
devices /dev/sdb

# CEPH_VOLUME_DEBUG=1 ceph-volume --cluster ceph lvm zap --destroy --osd-fsid 58533adc-c8bb-412e-9ab7-229e8061a1fe
    --> Zapping: /dev/journals/journal1
    --> Unmounting /var/lib/ceph/osd/ceph-3
    Running command: /bin/umount -v /var/lib/ceph/osd/ceph-3
    stderr: umount: /var/lib/ceph/osd/ceph-3: target is busy.
    Traceback (most recent call last):
      File "/sbin/ceph-volume", line 11, in <module>
        load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
      File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 39, in init
        self.main(self.argv)
      File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
        return f(*a, **kw)
      File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 150, in main
        terminal.dispatch(self.mapper, subcommand_args)
      File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
        instance.main()
      File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 42, in main
        terminal.dispatch(self.mapper, self.argv)
      File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
        instance.main()
      File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 393, in main
        self.zap_osd()
      File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
        return func(*a, **kw)
      File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 299, in zap_osd
        self.zap(devices)
      File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
        return func(*a, **kw)
      File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 274, in zap
        self.zap_lv(device)
      File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 170, in zap_lv
        self.unmount_lv(lv)
      File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/zap.py", line 159, in unmount_lv
        system.unmount(lv_path)
      File "/usr/lib/python3.6/site-packages/ceph_volume/util/system.py", line 198, in unmount
        path,
      File "/usr/lib/python3.6/site-packages/ceph_volume/process.py", line 153, in run
        raise RuntimeError(msg)
    RuntimeError: command returned non-zero exit status: 32

# ceph-volume lvm list

====== osd.3 =======

[journal]     /dev/journals/journal1
cephx lockbox secret      
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type journal
vdo 0
devices /dev/sdc2
[data]        /dev/test_group/data-lv2
cephx lockbox secret      
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
data device /dev/test_group/data-lv2
data uuid 2d5uOo-lKAn-VWv9-FYH7-qF5j-67mZ-WurByl
encrypted 0
journal device /dev/journals/journal1
journal uuid AEXZzV-WGrn-vEkP-zRL8-zBsT-VoH7-RC5N4Q
osd fsid e58f9555-c1ba-44a5-9407-0587d0f58598
osd id 3
osdspec affinity
type data
vdo 0
devices /dev/sdb

====== osd.9 =======

[block]       /dev/test_group/data-lv1
block device              /dev/test_group/data-lv1
block uuid oo3hQQ-uiy2-3zME-TueI-skrf-5L6G-VVp5Ry
cephx lockbox secret
cluster fsid 81ec0ac6-ffe7-4d2d-a152-9d2415c3de12
cluster name ceph
crush device class None
encrypted 0
osd fsid 58533adc-c8bb-412e-9ab7-229e8061a1fe
osd id 9
osdspec affinity
type block
vdo 0
devices /dev/sdb

As the output above shows, although I tried to zap the OSD '58533adc-c8bb-412e-9ab7-229e8061a1fe' (osd.9) by passing '--osd-fsid' to the `ceph-volume lvm zap` command, ceph-volume attempted to zap /dev/journals/journal1, which belongs to osd.3 (--> Zapping: /dev/journals/journal1). The zap then failed while unmounting /var/lib/ceph/osd/ceph-3 ("target is busy"), and the second `ceph-volume lvm list` shows that nothing was removed, including osd.9.

This is a regression introduced by commit 2f5c10c12c37e6865ce54bb4940d3779353cba4f.

I think that when `ensure_associated_lvs()` calls `api.get_lvs()`, it should pass the osd_id or osd_fsid information so that any unrelated devices are excluded.
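
The idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the ceph-volume code and not the change made in the pull request: `LVS`, `find_lvs()` and `associated_lvs()` are hypothetical stand-ins, and the records are a simplified model of the LVs listed above (path plus `ceph.*` tags). The point is that every per-type lookup carries the OSD constraint, so a journal that belongs to another OSD can never match.

```python
# Hypothetical, simplified model of the LVs from the listing above.
# This is NOT the ceph_volume.api.lvm API; names are illustrative only.
LVS = [
    {"lv_path": "/dev/journals/journal1",
     "tags": {"ceph.osd_id": "3", "ceph.type": "journal",
              "ceph.osd_fsid": "e58f9555-c1ba-44a5-9407-0587d0f58598"}},
    {"lv_path": "/dev/test_group/data-lv2",
     "tags": {"ceph.osd_id": "3", "ceph.type": "data",
              "ceph.osd_fsid": "e58f9555-c1ba-44a5-9407-0587d0f58598"}},
    {"lv_path": "/dev/test_group/data-lv1",
     "tags": {"ceph.osd_id": "9", "ceph.type": "block",
              "ceph.osd_fsid": "58533adc-c8bb-412e-9ab7-229e8061a1fe"}},
]


def find_lvs(lvs, tags):
    """Return the LVs whose tags contain every key/value pair in `tags`."""
    return [lv for lv in lvs
            if all(lv["tags"].get(key) == value for key, value in tags.items())]


def associated_lvs(lvs, osd_id=None, osd_fsid=None):
    """Return the paths of all LVs that belong to the requested OSD.

    The OSD id and/or fsid is part of every lookup, so LVs tagged for a
    different OSD are excluded regardless of their type.
    """
    scope = {}
    if osd_id is not None:
        scope["ceph.osd_id"] = str(osd_id)
    if osd_fsid is not None:
        scope["ceph.osd_fsid"] = osd_fsid
    if not scope:
        raise ValueError("either osd_id or osd_fsid is required")

    paths = []
    for lv_type in ("data", "block", "journal", "db", "wal"):
        for lv in find_lvs(lvs, dict(scope, **{"ceph.type": lv_type})):
            if lv["lv_path"] not in paths:
                paths.append(lv["lv_path"])
    return paths


if __name__ == "__main__":
    # A type-only lookup matches osd.3's journal, whichever OSD was requested.
    print([lv["lv_path"] for lv in find_lvs(LVS, {"ceph.type": "journal"})])
    # A scoped lookup for osd.9 returns only its own block LV.
    print(associated_lvs(LVS, osd_fsid="58533adc-c8bb-412e-9ab7-229e8061a1fe"))
```

With the scoped lookup, zapping osd.9 would only ever consider /dev/test_group/data-lv1, while a request for osd.3 would still pick up both its data LV and its journal.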

Actions #1

Updated by Guillaume Abrioux almost 4 years ago

  • Pull request ID set to 36219
Actions #2

Updated by Guillaume Abrioux over 3 years ago

  • Status changed from New to Pending Backport
Actions #4

Updated by Jan Fajerski over 3 years ago

  • Status changed from Pending Backport to Resolved