Bug #23918
"ceph-volume lvm prepare" errors with "no valid command found"
Description
Brand new install on Ubuntu 16.04:
~$ sudo ceph-volume lvm prepare --bluestore --data /dev/sdb --osd-id 8
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new b78ce743-644b-45bc-a01d-89abc167f1d8
 stderr: no valid command found; 10 closest matches:
 stderr: osd setmaxosd <int[0-]>
 stderr: osd pause
 stderr: osd crush rule rm <name>
 stderr: osd crush tree
 stderr: osd crush rule create-simple <name> <root> <type> {firstn|indep}
 stderr: osd crush rule create-erasure <name> {<profile>}
 stderr: osd crush get-tunable straw_calc_version
 stderr: osd crush show-tunables
 stderr: osd crush tunables legacy|argonaut|bobtail|firefly|hammer|jewel|optimal|default
 stderr: osd crush set-tunable straw_calc_version <int>
 stderr: Error EINVAL: invalid command
--> RuntimeError: Unable to create a new OSD id
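A quick way to check whether the CLI and the monitors agree on a release (a diagnostic sketch, not part of the original report; the mon name is taken from the status output further down):

ceph --version
sudo ceph tell mon.Ceph-C1 version

If the monitors report a pre-Luminous version, they will not recognize the "osd new" command that ceph-volume issues here.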
Also, ceph osd purge returns the same error:
~$ sudo ceph osd purge 0 --yes-i-really-mean-it
no valid command found; 10 closest matches:
osd setmaxosd <int[0-]>
osd pause
osd crush rule rm <name>
osd crush tree
osd crush rule create-simple <name> <root> <type> {firstn|indep}
osd crush rule create-erasure <name> {<profile>}
osd crush get-tunable straw_calc_version
osd crush show-tunables
osd crush tunables legacy|argonaut|bobtail|firefly|hammer|jewel|optimal|default
osd crush set-tunable straw_calc_version <int>
Error EINVAL: invalid command
The only thing that was at all off was that my initial install did not include "--release luminous", and I had to re-run the installer with that option.
I have not successfully created any osds yet:
~$ sudo ceph -s
    cluster 4ff86631-a50c-4a9c-b63a-2bf40cc60642
     health HEALTH_ERR
            clock skew detected on mon.Ceph-C2
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            64 pgs stuck unclean
            noout flag(s) set
            Monitor clock skew detected
     monmap e1: 3 mons at {Ceph-C1=10.26.12.119:6789/0,Ceph-C2=10.26.12.120:6789/0,Ceph-C3=10.26.12.121:6789/0}
            election epoch 4, quorum 0,1,2 Ceph-C1,Ceph-C2,Ceph-C3
     osdmap e6: 1 osds: 0 up, 0 in
            flags noout,sortbitwise,require_jewel_osds
      pgmap v7: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating
I started to create one manually, but those instructions were not for BlueStore, and I stopped after creating the ID (0); that is when I noticed the purge command errors as well.
I am also not sure why it is reporting clock skew. There isn't any...
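To rule skew out, comparing the node clocks directly is quick (a diagnostic sketch; assumes ntpd is the time daemon on these Ubuntu 16.04 nodes and that SSH between nodes works):

for h in Ceph-C1 Ceph-C2 Ceph-C3; do ssh "$h" date +%s.%N; done
ntpq -p    # on each mon: check the offset column

The monitors warn when their clocks differ by more than mon_clock_drift_allowed, which defaults to 0.05 seconds, so even modest drift trips this warning.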
Did I break something re-running the "ceph-deploy install --release luminous" command over the old install?
History
#1 Updated by Nathan Cutler almost 6 years ago
- Project changed from Ceph to ceph-volume
#2 Updated by Alfredo Deza almost 6 years ago
- Status changed from New to Need More Info
It is entirely possible that you are trying to use ceph-volume with a version of Ceph that doesn't support these options. Ensure that you only have packages for 12.2.5 and that nothing else is on that box except that version.
If you are unable to determine this, start fresh on a new box with 12.2.5 only. We test creating OSDs this way thoroughly, and we can't reproduce this issue at all.
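One way to spot stragglers quickly (a sketch; substitute the release you expect):

dpkg -l | grep -i ceph | grep -v 12.2.5

Anything this prints is a package on some other version.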
#3 Updated by Brian Woods almost 6 years ago
All nodes look like this:
$ dpkg -l | grep -i ceph
ii  ceph           12.2.5-1xenial   amd64  distributed storage and file system
ii  ceph-base      12.2.5-1xenial   amd64  common ceph daemon libraries and management tools
ii  ceph-common    12.2.5-1xenial   amd64  common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy    2.0.0            all    Ceph-deploy is an easy to use configuration tool
ii  ceph-mds       12.2.5-1xenial   amd64  metadata server for the ceph distributed file system
ii  ceph-mgr       12.2.5-1xenial   amd64  manager for the ceph distributed storage system
ii  ceph-mon       12.2.5-1xenial   amd64  monitor server for the ceph storage system
ii  ceph-osd       12.2.5-1xenial   amd64  OSD server for the ceph storage system
ii  libcephfs1     10.2.10-1xenial  amd64  Ceph distributed file system client library
ii  libcephfs2     12.2.5-1xenial   amd64  Ceph distributed file system client library
ii  python-cephfs  12.2.5-1xenial   amd64  Python 2 libraries for the Ceph libcephfs library
ii  python-rados   12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librados library
ii  python-rbd     12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librbd library
ii  python-rgw     12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librgw library
Should I do any other checks?
#4 Updated by Alfredo Deza almost 6 years ago
It looks like you have a mix of packages:
ii libcephfs1 10.2.10-1xenial amd64 Ceph distributed file system client library
I wonder how you ended up with 10.2.10 in there. You really need to clean up that machine and ensure that every package is the version you want to work with.
#5 Updated by Brian Woods almost 6 years ago
TL;DR: purging and re-installing worked.
cd cluster
ceph-deploy purge Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
ceph-deploy purgedata Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
ceph-deploy uninstall Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
ceph-deploy forgetkeys
rm ceph.*
$ dpkg -l | grep -i ceph
ii  ceph-deploy    2.0.0            all    Ceph-deploy is an easy to use configuration tool
ii  libcephfs1     10.2.10-1xenial  amd64  Ceph distributed file system client library
ii  libcephfs2     12.2.5-1xenial   amd64  Ceph distributed file system client library
ii  python-cephfs  12.2.5-1xenial   amd64  Python 2 libraries for the Ceph libcephfs library
ii  python-rados   12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librados library
ii  python-rbd     12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librbd library
ii  python-rgw     12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librgw library
Hmmm, so I ran this on each node (I left ceph-deploy):
sudo apt remove libcephfs1 libcephfs2 python-cephfs python-rados python-rbd python-rgw -y
$ dpkg -l | grep -i ceph
ii  ceph-deploy    2.0.0            all    Ceph-deploy is an easy to use configuration tool
Looking better. Then I confirmed the package source on each node:
$ cat /etc/apt/sources.list.d/ceph.list
deb https://download.ceph.com/debian-luminous/ xenial main
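If the sources line looks right but mixed versions keep coming back, apt's candidate version is worth checking too (a sketch):

apt-cache policy ceph libcephfs2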
Reinstall...
ceph-deploy new Ceph-C1 Ceph-C2 Ceph-C3
Added public_network to my ceph.conf
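For reference, the kind of line that means (the subnet is my assumption, inferred from the 10.26.12.x monitor addresses earlier in this ticket):

[global]
public_network = 10.26.12.0/24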
ceph-deploy install --release luminous Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
Now showing:
$ dpkg -l | grep -i ceph
ii  ceph           12.2.5-1xenial   amd64  distributed storage and file system
ii  ceph-base      12.2.5-1xenial   amd64  common ceph daemon libraries and management tools
ii  ceph-common    12.2.5-1xenial   amd64  common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy    2.0.0            all    Ceph-deploy is an easy to use configuration tool
ii  ceph-mds       12.2.5-1xenial   amd64  metadata server for the ceph distributed file system
ii  ceph-mgr       12.2.5-1xenial   amd64  manager for the ceph distributed storage system
ii  ceph-mon       12.2.5-1xenial   amd64  monitor server for the ceph storage system
ii  ceph-osd       12.2.5-1xenial   amd64  OSD server for the ceph storage system
ii  libcephfs2     12.2.5-1xenial   amd64  Ceph distributed file system client library
ii  python-cephfs  12.2.5-1xenial   amd64  Python 2 libraries for the Ceph libcephfs library
ii  python-rados   12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librados library
ii  python-rbd     12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librbd library
ii  python-rgw     12.2.5-1xenial   amd64  Python 2 libraries for the Ceph librgw library
ceph-deploy mon create-initial
ceph-deploy admin Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
$ sudo ceph -s
  cluster:
    id:     ca9b08d2-b413-4a74-8f7b-1118987e5ae5
    health: HEALTH_WARN
            clock skew detected on mon.Ceph-C2, mon.Ceph-C3
  services:
    mon: 3 daemons, quorum Ceph-C1,Ceph-C2,Ceph-C3
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in
  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:
Yay!
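One thing the status still shows is "mgr: no daemons active"; Luminous expects a manager daemon, and ceph-deploy 2.0 can create one (a sketch using a node name from this cluster):

ceph-deploy mgr create Ceph-C1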
Had to copy the keys out for some reason...
rsync ceph.bootstrap-osd.keyring root@Ceph-C1:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C2:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C3:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C4:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C5:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C6:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C7:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C8:/var/lib/ceph/bootstrap-osd/ceph.keyring
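As far as I can tell this is expected: ceph-deploy mon create-initial leaves the bootstrap keyrings in the local working directory, and ceph-deploy admin only pushes the admin keyring, so ceph.bootstrap-osd.keyring has to be distributed by hand when running ceph-volume directly. A loop does the same as the commands above:

for h in Ceph-C{1..8}; do
    rsync ceph.bootstrap-osd.keyring "root@$h:/var/lib/ceph/bootstrap-osd/ceph.keyring"
done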
Then on each node:
sudo ceph-volume lvm create --bluestore --data /dev/sdb
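A quick sanity check after each create (a sketch):

sudo ceph osd tree
sudo ceph -s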
And I have a happy cluster!
I did not manually install those packages, so something in the ceph-deploy process did...
Also, is there a reason that ceph-deploy doesn't default to the latest stable?
Anyhow, thanks!
#6 Updated by Alfredo Deza almost 6 years ago
- Status changed from Need More Info to Closed