Bug #48271


ceph-volume lvm batch fails activating filestore dmcrypt osds

Added by Guillaume Abrioux over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
% Done: 0%
Backport: octopus, nautilus
Regression: No
Severity: 3 - minor
Pull request ID: 38147

Description

--> DEPRECATION NOTICE
--> --journal-size as integer is parsed as megabytes
--> A future release will parse integers as bytes
--> Add a "M" to explicitly pass a megabyte size
--> DEPRECATION NOTICE
--> You are using the legacy automatic disk sorting behavior
--> The Pacific release will change the default to --no-auto
--> passed data devices: 2 physical, 0 LVM
--> relative data size: 1.0
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph-authtool --gen-print-key
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new afb6a7a0-686d-44ba-a5d3-7e540c46a2e1
Running command: /sbin/vgcreate --force --yes ceph-328820d0-9d3b-4150-8a52-be424cc459b2 /dev/sdb
stdout: Physical volume "/dev/sdb" successfully created.
stdout: Volume group "ceph-328820d0-9d3b-4150-8a52-be424cc459b2" successfully created
Running command: /sbin/lvcreate --yes -l 1280 -n osd-journal-cb6fd517-434b-4eff-9b00-045e00b177e3 ceph-328820d0-9d3b-4150-8a52-be424cc459b2
stdout: Logical volume "osd-journal-cb6fd517-434b-4eff-9b00-045e00b177e3" created.
Running command: /sbin/lvcreate --yes -l 11519 -n osd-data-afb6a7a0-686d-44ba-a5d3-7e540c46a2e1 ceph-328820d0-9d3b-4150-8a52-be424cc459b2
stdout: Logical volume "osd-data-afb6a7a0-686d-44ba-a5d3-7e540c46a2e1" created.
Running command: /bin/ceph-authtool --gen-print-key
Running command: /sbin/cryptsetup --batch-mode --key-file - luksFormat /dev/ceph-328820d0-9d3b-4150-8a52-be424cc459b2/osd-data-afb6a7a0-686d-44ba-a5d3-7e540c46a2e1
Running command: /sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/ceph-328820d0-9d3b-4150-8a52-be424cc459b2/osd-data-afb6a7a0-686d-44ba-a5d3-7e540c46a2e1 guXUcp-LYmh-CnCe-Yg7b-dklC-w0ge-EBmdKB
Running command: /sbin/cryptsetup --batch-mode --key-file - luksFormat /dev/ceph-328820d0-9d3b-4150-8a52-be424cc459b2/osd-journal-cb6fd517-434b-4eff-9b00-045e00b177e3
Running command: /sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/ceph-328820d0-9d3b-4150-8a52-be424cc459b2/osd-journal-cb6fd517-434b-4eff-9b00-045e00b177e3 cb6fd517-434b-4eff-9b00-045e00b177e3
Running command: /sbin/mkfs -t xfs -f -i size=2048 /dev/mapper/guXUcp-LYmh-CnCe-Yg7b-dklC-w0ge-EBmdKB
stdout: meta-data=/dev/mapper/guXUcp-LYmh-CnCe-Yg7b-dklC-w0ge-EBmdKB isize=2048 agcount=4, agsize=2947840 blks
         =                       sectsz=512   attr=2, projid32bit=1
         =                       crc=1        finobt=1, sparse=1, rmapbt=0
         =                       reflink=1
data     =                       bsize=4096   blocks=11791360, imaxpct=25
         =                       sunit=0      swidth=0 blks
naming   =version 2              bsize=4096   ascii-ci=0, ftype=1
log      =internal log           bsize=4096   blocks=5757, version=2
         =                       sectsz=512   sunit=0 blks, lazy-count=1
realtime =none                   extsz=4096   blocks=0, rtextents=0
Running command: /bin/mount -t xfs -o noatime,largeio,inode64,swalloc /dev/mapper/guXUcp-LYmh-CnCe-Yg7b-dklC-w0ge-EBmdKB /var/lib/ceph/osd/ceph-0
Running command: /sbin/restorecon /var/lib/ceph/osd/ceph-0
Running command: /bin/chown -h ceph:ceph /dev/mapper/cb6fd517-434b-4eff-9b00-045e00b177e3
Running command: /bin/chown -R ceph:ceph /dev/dm-3
Running command: /bin/ln -s /dev/mapper/cb6fd517-434b-4eff-9b00-045e00b177e3 /var/lib/ceph/osd/ceph-0/journal
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring mon getmap -o /var/lib/ceph/osd/ceph-0/activate.monmap
stderr: got monmap epoch 1
Running command: /bin/chown -h ceph:ceph /var/lib/ceph/osd/ceph-0/journal
Running command: /bin/chown -R ceph:ceph /dev/dm-3
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/
Running command: /bin/ceph-osd --cluster ceph --osd-objectstore filestore --mkfs -i 0 --monmap /var/lib/ceph/osd/ceph-0/activate.monmap --keyfile - --osd-data /var/lib/ceph/osd/ceph-0/ --osd-journal /var/lib/ceph/osd/ceph-0/journal --osd-uuid afb6a7a0-686d-44ba-a5d3-7e540c46a2e1 --setuser ceph --setgroup ceph
stderr: 2020-11-18T00:12:28.113+0000 7f98b2f92ec0 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-0//keyring: (2) No such file or directory
stderr: 2020-11-18T00:12:28.115+0000 7f98b2f92ec0 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-0//keyring: (2) No such file or directory
stderr: 2020-11-18T00:12:28.115+0000 7f98b2f92ec0 -1 auth: unable to find a keyring on /var/lib/ceph/osd/ceph-0//keyring: (2) No such file or directory
stderr: 2020-11-18T00:12:28.213+0000 7f98b2f92ec0 -1 journal read_header error decoding journal header
stderr: 2020-11-18T00:12:28.297+0000 7f98b2f92ec0 -1 journal do_read_entry(4096): bad header magic
stderr: 2020-11-18T00:12:28.297+0000 7f98b2f92ec0 -1 journal do_read_entry(4096): bad header magic
Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-0/keyring --create-keyring --name osd.0 --add-key AQDQZrRf+39iCBAA6AqV/o7GahuhaE0wL9cVnQ==
stdout: creating /var/lib/ceph/osd/ceph-0/keyring
added entity osd.0 auth(key=AQDQZrRf+39iCBAA6AqV/o7GahuhaE0wL9cVnQ==)
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/keyring
Running command: /bin/ceph-authtool /var/lib/ceph/osd/ceph-0/lockbox.keyring --create-keyring --name client.osd-lockbox.afb6a7a0-686d-44ba-a5d3-7e540c46a2e1 --add-key AQDQZrRfsYxeCxAACzcXZ9gtYKzjFTk04MksXg==
stdout: creating /var/lib/ceph/osd/ceph-0/lockbox.keyring
added entity client.osd-lockbox.afb6a7a0-686d-44ba-a5d3-7e540c46a2e1 auth(key=AQDQZrRfsYxeCxAACzcXZ9gtYKzjFTk04MksXg==)
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0/lockbox.keyring
--> ceph-volume lvm prepare successful for: /dev/sdb
Running command: /bin/ceph --cluster ceph --name client.osd-lockbox.afb6a7a0-686d-44ba-a5d3-7e540c46a2e1 --keyring /var/lib/ceph/osd/ceph-0/lockbox.keyring config-key get dm-crypt/osd/afb6a7a0-686d-44ba-a5d3-7e540c46a2e1/luks
Running command: /sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/ceph-328820d0-9d3b-4150-8a52-be424cc459b2/osd-data-afb6a7a0-686d-44ba-a5d3-7e540c46a2e1 guXUcp-LYmh-CnCe-Yg7b-dklC-w0ge-EBmdKB
stderr: Device guXUcp-LYmh-CnCe-Yg7b-dklC-w0ge-EBmdKB already exists.
Running command: /sbin/cryptsetup --key-file - --allow-discards luksOpen /dev/ceph-328820d0-9d3b-4150-8a52-be424cc459b2/osd-journal-cb6fd517-434b-4eff-9b00-045e00b177e3 2vBGzI-CW1R-yha9-Cg9C-hUdZ-wIJE-elllRU
stderr: Cannot use device /dev/ceph-328820d0-9d3b-4150-8a52-be424cc459b2/osd-journal-cb6fd517-434b-4eff-9b00-045e00b177e3 which is in use (already mapped or mounted).
Running command: /bin/chown -R ceph:ceph /var/lib/ceph/osd/ceph-0
Running command: /bin/ln -snf /dev/mapper/2vBGzI-CW1R-yha9-Cg9C-hUdZ-wIJE-elllRU /var/lib/ceph/osd/ceph-0/journal
Running command: /bin/chown -R ceph:ceph /dev/mapper/2vBGzI-CW1R-yha9-Cg9C-hUdZ-wIJE-elllRU
stderr: /bin/chown: cannot access '/dev/mapper/2vBGzI-CW1R-yha9-Cg9C-hUdZ-wIJE-elllRU': No such file or directory
--> Was unable to complete a new OSD, will rollback changes
Running command: /bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd purge-new osd.0 --yes-i-really-mean-it
stderr: purged osd.0
Traceback (most recent call last):
  File "/sbin/ceph-volume", line 11, in <module>
    load_entry_point('ceph-volume==1.0.0', 'console_scripts', 'ceph-volume')()
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 40, in __init__
    self.main(self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 59, in newfunc
    return f(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/main.py", line 152, in main
    terminal.dispatch(self.mapper, subcommand_args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/main.py", line 42, in main
    terminal.dispatch(self.mapper, self.argv)
  File "/usr/lib/python3.6/site-packages/ceph_volume/terminal.py", line 194, in dispatch
    instance.main()
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 415, in main
    self._execute(plan)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/batch.py", line 434, in _execute
    c.create(argparse.Namespace(**args))
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/create.py", line 32, in create
    Activate([]).activate(args)
  File "/usr/lib/python3.6/site-packages/ceph_volume/decorators.py", line 16, in is_root
    return func(*a, **kw)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 296, in activate
    activate_filestore(lvs, args.no_systemd)
  File "/usr/lib/python3.6/site-packages/ceph_volume/devices/lvm/activate.py", line 85, in activate_filestore
    system.chown(osd_journal)
  File "/usr/lib/python3.6/site-packages/ceph_volume/util/system.py", line 123, in chown
    process.run(['chown', '-R', 'ceph:ceph', path])
  File "/usr/lib/python3.6/site-packages/ceph_volume/process.py", line 153, in run
    raise RuntimeError(msg)
RuntimeError: command returned non-zero exit status: 1
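
The traceback above shows where activation gives up: activate_filestore() resolves the journal to a /dev/mapper name that was never created (the earlier luksOpen failed because the journal LV was already mapped under another name), and the recursive chown on that missing path exits non-zero, which ceph-volume turns into a RuntimeError. A minimal sketch of that last step, using a simplified run()/chown() pair modelled on the traceback rather than the actual ceph_volume code:

import subprocess

def run(command):
    # Simplified stand-in for the process.run() seen at the end of the
    # traceback: raise RuntimeError when the command exits non-zero.
    result = subprocess.run(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    if result.returncode != 0:
        raise RuntimeError('command returned non-zero exit status: %s' % result.returncode)

def chown(path):
    # activate_filestore() chowns the journal path it resolved; if that
    # dm-crypt mapping was never created, chown fails and the whole
    # activation aborts.
    run(['chown', '-R', 'ceph:ceph', path])

# Reproduces the failure mode: the mapper name derived from the (wrong) tag
# does not exist, so this raises RuntimeError just like the log above.
chown('/dev/mapper/2vBGzI-CW1R-yha9-Cg9C-hUdZ-wIJE-elllRU')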


Related issues 2 (0 open, 2 closed)

Copied to ceph-volume - Backport #48302: nautilus: ceph-volume lvm batch fails activating filestore dmcrypt osds (Resolved, Jan Fajerski)
Copied to ceph-volume - Backport #48303: octopus: ceph-volume lvm batch fails activating filestore dmcrypt osds (Resolved, Jan Fajerski)
Actions #1

Updated by Jan Fajerski over 3 years ago

Ok, so the root cause of this bug is then something like this?

The ceph-osd --mkfs call with a journal fails for some reason, and then activate can't retrieve the journal uuid correctly and fails?

Actions #2

Updated by Guillaume Abrioux over 3 years ago

When the journal device is prepared, the uuid stored in `tags['ceph.journal_uuid']` currently refers to the uuid generated for the OSD.
`tags['ceph.journal_uuid']` should instead point to the lv_uuid of its corresponding LV; this is what is already done for the data device, so we should be consistent.

The variable name `uuid` used in the code is probably too confusing.
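
Put differently, the prepare step should record the journal LV's own lv_uuid in the tag, just as it already does for the data LV, so that activate resolves the same device it prepared. A rough illustration of that tagging difference, using a hypothetical LV stand-in and example values rather than the ceph_volume API (the actual change is in the pull request referenced below):

# Illustrative sketch only -- simplified stand-ins, not the ceph_volume API.
class LV:
    """Minimal stand-in for a ceph-volume logical volume object."""
    def __init__(self, lv_path, lv_uuid):
        self.lv_path = lv_path
        self.lv_uuid = lv_uuid   # uuid of the LV itself, as reported by LVM
        self.tags = {}

    def set_tags(self, tags):
        self.tags.update(tags)

def tag_journal(journal_lv, osd_fsid):
    # Buggy behaviour described above: the tag records the uuid generated
    # for the OSD rather than for the journal LV itself.
    # journal_lv.set_tags({'ceph.journal_uuid': osd_fsid})

    # Consistent behaviour (mirroring what is done for the data device):
    # record the journal LV's own lv_uuid so activate can find the same
    # device again when it re-opens the dm-crypt mapping.
    journal_lv.set_tags({'ceph.journal_uuid': journal_lv.lv_uuid})

journal = LV('/dev/vg-example/osd-journal-example', 'example-journal-lv-uuid')
tag_journal(journal, osd_fsid='example-osd-fsid')
assert journal.tags['ceph.journal_uuid'] == journal.lv_uuid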

Actions #3

Updated by Guillaume Abrioux over 3 years ago

  • Pull request ID set to 38147
Actions #4

Updated by Guillaume Abrioux over 3 years ago

  • Status changed from In Progress to Fix Under Review
Actions #5

Updated by Jan Fajerski over 3 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to octopus,nautilus
Actions #6

Updated by Jan Fajerski over 3 years ago

  • Copied to Backport #48302: nautilus: ceph-volume lvm batch fails activating filestore dmcrypt osds added
Actions #7

Updated by Jan Fajerski over 3 years ago

  • Copied to Backport #48303: octopus: ceph-volume lvm batch fails activating filestore dmcrypt osds added
Actions #8

Updated by Nathan Cutler over 3 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
