Bug #23918

"ceph-volume lvm prepare" errors with "no valid command found"

Added by Brian Woods over 4 years ago. Updated over 4 years ago.

Status: Closed
Priority: Normal
Assignee: -
Target version: -
% Done: 0%
Source: Community (user)
Tags: ceph-volume
Backport: -
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

Brand new install on Ubuntu 16.04:

~$ sudo ceph-volume lvm prepare --bluestore --data /dev/sdb --osd-id 8
Running command: /usr/bin/ceph-authtool --gen-print-key
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring osd tree -f json
Running command: /usr/bin/ceph --cluster ceph --name client.bootstrap-osd --keyring /var/lib/ceph/bootstrap-osd/ceph.keyring -i - osd new b78ce743-644b-45bc-a01d-89abc167f1d8
 stderr: no valid command found; 10 closest matches:
 stderr: osd setmaxosd <int[0-]>
 stderr: osd pause
 stderr: osd crush rule rm <name>
 stderr: osd crush tree
 stderr: osd crush rule create-simple <name> <root> <type> {firstn|indep}
 stderr: osd crush rule create-erasure <name> {<profile>}
 stderr: osd crush get-tunable straw_calc_version
 stderr: osd crush show-tunables
 stderr: osd crush tunables legacy|argonaut|bobtail|firefly|hammer|jewel|optimal|default
 stderr: osd crush set-tunable straw_calc_version <int>
 stderr: Error EINVAL: invalid command
-->  RuntimeError: Unable to create a new OSD id

Also, "ceph osd purge" returns the same error:

~$ sudo ceph osd purge 0 --yes-i-really-mean-it
no valid command found; 10 closest matches:
osd setmaxosd <int[0-]>
osd pause
osd crush rule rm <name>
osd crush tree
osd crush rule create-simple <name> <root> <type> {firstn|indep}
osd crush rule create-erasure <name> {<profile>}
osd crush get-tunable straw_calc_version
osd crush show-tunables
osd crush tunables legacy|argonaut|bobtail|firefly|hammer|jewel|optimal|default
osd crush set-tunable straw_calc_version <int>
Error EINVAL: invalid command

The only thing that was at all off: my initial install did not include "--release luminous", and I had to re-run the installer with that option.

I have not successfully created any OSDs yet:

~$ sudo ceph -s
    cluster 4ff86631-a50c-4a9c-b63a-2bf40cc60642
     health HEALTH_ERR
            clock skew detected on mon.Ceph-C2
            64 pgs are stuck inactive for more than 300 seconds
            64 pgs stuck inactive
            64 pgs stuck unclean
            noout flag(s) set
            Monitor clock skew detected 
     monmap e1: 3 mons at {Ceph-C1=10.26.12.119:6789/0,Ceph-C2=10.26.12.120:6789/0,Ceph-C3=10.26.12.121:6789/0}
            election epoch 4, quorum 0,1,2 Ceph-C1,Ceph-C2,Ceph-C3
     osdmap e6: 1 osds: 0 up, 0 in
            flags noout,sortbitwise,require_jewel_osds
      pgmap v7: 64 pgs, 1 pools, 0 bytes data, 0 objects
            0 kB used, 0 kB / 0 kB avail
                  64 creating

I started to create one manually, but those instructions were not for BlueStore, and I stopped after creating the ID (0); that is when I noticed the purge command errors as well.

I am also not sure why it is reporting clock skew. There isn't any...

Did I break something re-running the "ceph-deploy install --release luminous" command over the old install?
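For context: both "osd new" and "osd purge" were introduced in Luminous, so a "no valid command found" reply suggests the monitor answering the request predates that release, even if newer packages are installed locally. A minimal sketch of the check, assuming this report's hostnames (`same_major` is a hypothetical helper, not a ceph command):

```shell
# The rejection comes from the monitor, so compare the locally installed
# client against the daemon the cluster is actually running, e.g.:
#
#   ceph --version                  # version of the installed binaries
#   ceph tell mon.Ceph-C1 version   # version of the running monitor
#
# same_major is a hypothetical helper: it checks whether two version
# strings share a major release.
same_major() {
    [ "${1%%.*}" = "${2%%.*}" ]
}

# Luminous-only commands such as "osd new" fail against an older mon:
same_major "12.2.5" "10.2.10" || echo "mon predates the client release"
```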

History

#1 Updated by Nathan Cutler over 4 years ago

  • Project changed from Ceph to ceph-volume

#2 Updated by Alfredo Deza over 4 years ago

  • Status changed from New to Need More Info

It is entirely possible that you are trying to use ceph-volume with a version of Ceph that doesn't support these options. Ensure that you only have packages for 12.2.5 and that there is nothing else on that box except that version.

If you are unable to determine this, start fresh on a new box with 12.2.5 only. We test creating OSDs this way thoroughly, and we can't reproduce this issue at all.

#3 Updated by Brian Woods over 4 years ago

All nodes look like this:

$ dpkg -l | grep -i ceph
ii  ceph                                 12.2.5-1xenial                             amd64        distributed storage and file system
ii  ceph-base                            12.2.5-1xenial                             amd64        common ceph daemon libraries and management tools
ii  ceph-common                          12.2.5-1xenial                             amd64        common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy                          2.0.0                                      all          Ceph-deploy is an easy to use configuration tool
ii  ceph-mds                             12.2.5-1xenial                             amd64        metadata server for the ceph distributed file system
ii  ceph-mgr                             12.2.5-1xenial                             amd64        manager for the ceph distributed storage system
ii  ceph-mon                             12.2.5-1xenial                             amd64        monitor server for the ceph storage system
ii  ceph-osd                             12.2.5-1xenial                             amd64        OSD server for the ceph storage system
ii  libcephfs1                           10.2.10-1xenial                            amd64        Ceph distributed file system client library
ii  libcephfs2                           12.2.5-1xenial                             amd64        Ceph distributed file system client library
ii  python-cephfs                        12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph libcephfs library
ii  python-rados                         12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librados library
ii  python-rbd                           12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librbd library
ii  python-rgw                           12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librgw library

Should I do any other checks?

#4 Updated by Alfredo Deza over 4 years ago

This looks like you have a mix of packages:

ii  libcephfs1                           10.2.10-1xenial                            amd64        Ceph distributed file system client library

I wonder how you ended up with 10.2.10 in there. You need to clean up that machine and ensure that all packages are the version you want to work with.
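A quick way to spot such strays is to reduce the `dpkg -l` output to the distinct version strings; more than one line means a mixed install. A sketch, with `distinct_ceph_versions` as a hypothetical helper name:

```shell
# Hypothetical filter: read `dpkg -l` output on stdin and print the
# distinct version strings of installed Ceph-related packages.
distinct_ceph_versions() {
    awk '$1 == "ii" && $2 ~ /ceph|rados|rbd|rgw/ {print $3}' | sort -u
}

# On a real node (guarded so the snippet is harmless elsewhere):
if command -v dpkg >/dev/null; then
    dpkg -l | distinct_ceph_versions   # one line per version; >1 means mixed
fi
```

Against the listing above, this would print both 10.2.10-1xenial and 12.2.5-1xenial, flagging the stray libcephfs1 package.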

#5 Updated by Brian Woods over 4 years ago

TL;DR: a purge and re-install worked.

cd cluster
ceph-deploy purge Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
ceph-deploy purgedata Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
ceph-deploy uninstall Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
ceph-deploy forgetkeys
rm ceph.*
$ dpkg -l | grep -i ceph
ii  ceph-deploy                          2.0.0                                      all          Ceph-deploy is an easy to use configuration tool
ii  libcephfs1                           10.2.10-1xenial                            amd64        Ceph distributed file system client library
ii  libcephfs2                           12.2.5-1xenial                             amd64        Ceph distributed file system client library
ii  python-cephfs                        12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph libcephfs library
ii  python-rados                         12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librados library
ii  python-rbd                           12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librbd library
ii  python-rgw                           12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librgw library

Hmmm, so I ran this on each node (I left ceph-deploy):

sudo apt remove libcephfs1 libcephfs2 python-cephfs python-rados python-rbd python-rgw -y
$ dpkg -l | grep -i ceph
ii  ceph-deploy                          2.0.0                                      all          Ceph-deploy is an easy to use configuration tool

Looking better. Then I confirmed the package source on each node:

$ cat /etc/apt/sources.list.d/ceph.list 
deb https://download.ceph.com/debian-luminous/ xenial main

Reinstall...

ceph-deploy new Ceph-C1 Ceph-C2 Ceph-C3

Added public_network to my ceph.conf

ceph-deploy install --release luminous Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8

Now showing:

$ dpkg -l | grep -i ceph
ii  ceph                                 12.2.5-1xenial                             amd64        distributed storage and file system
ii  ceph-base                            12.2.5-1xenial                             amd64        common ceph daemon libraries and management tools
ii  ceph-common                          12.2.5-1xenial                             amd64        common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy                          2.0.0                                      all          Ceph-deploy is an easy to use configuration tool
ii  ceph-mds                             12.2.5-1xenial                             amd64        metadata server for the ceph distributed file system
ii  ceph-mgr                             12.2.5-1xenial                             amd64        manager for the ceph distributed storage system
ii  ceph-mon                             12.2.5-1xenial                             amd64        monitor server for the ceph storage system
ii  ceph-osd                             12.2.5-1xenial                             amd64        OSD server for the ceph storage system
ii  libcephfs2                           12.2.5-1xenial                             amd64        Ceph distributed file system client library
ii  python-cephfs                        12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph libcephfs library
ii  python-rados                         12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librados library
ii  python-rbd                           12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librbd library
ii  python-rgw                           12.2.5-1xenial                             amd64        Python 2 libraries for the Ceph librgw library

ceph-deploy mon create-initial
ceph-deploy admin Ceph-C1 Ceph-C2 Ceph-C3 Ceph-C4 Ceph-C5 Ceph-C6 Ceph-C7 Ceph-C8
$ sudo ceph -s
  cluster:
    id:     ca9b08d2-b413-4a74-8f7b-1118987e5ae5
    health: HEALTH_WARN
            clock skew detected on mon.Ceph-C2, mon.Ceph-C3

  services:
    mon: 3 daemons, quorum Ceph-C1,Ceph-C2,Ceph-C3
    mgr: no daemons active
    osd: 0 osds: 0 up, 0 in

  data:
    pools:   0 pools, 0 pgs
    objects: 0 objects, 0 bytes
    usage:   0 kB used, 0 kB / 0 kB avail
    pgs:     

Yay!

Had to copy the keys out for some reason...

rsync ceph.bootstrap-osd.keyring root@Ceph-C1:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C2:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C3:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C4:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C5:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C6:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C7:/var/lib/ceph/bootstrap-osd/ceph.keyring
rsync ceph.bootstrap-osd.keyring root@Ceph-C8:/var/lib/ceph/bootstrap-osd/ceph.keyring
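The eight copies above can be collapsed into a loop (assuming bash brace expansion and the same paths); the echo is left in so the commands are printed for review rather than run here:

```shell
# Same copies as above, generated in a loop; drop the echo to run them.
for host in Ceph-C{1..8}; do
    echo rsync ceph.bootstrap-osd.keyring "root@${host}:/var/lib/ceph/bootstrap-osd/ceph.keyring"
done
```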

Then on each node:

sudo ceph-volume lvm create --bluestore --data /dev/sdb

And I have a happy cluster!

I did not manually install those packages, so something in the ceph-deploy process did...

Also, is there a reason that ceph-deploy doesn't default to the latest stable?

Anyhow, thanks!

#6 Updated by Alfredo Deza over 4 years ago

  • Status changed from Need More Info to Closed
