Bug #16477
ceph cli: Rados object in state configuring race
Status:
Closed
% Done:
0%
Source:
other
Tags:
Backport:
jewel
Regression:
No
Severity:
2 - major
Reviewed:
Affected Versions:
ceph-qa-suite:
ceph-deploy
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
On a clean install of the Jewel 10.2.2 packages on Ubuntu 16.04, I'm running into the same problem as described in #16379.
$ sudo dpkg -l | grep ceph
ii  ceph           10.2.2-1xenial  amd64  distributed storage and file system
ii  ceph-base      10.2.2-1xenial  amd64  common ceph daemon libraries and management tools
ii  ceph-common    10.2.2-1xenial  amd64  common utilities to mount and interact with a ceph storage cluster
ii  ceph-deploy    1.5.34          all    Ceph-deploy is an easy to use configuration tool
ii  ceph-mds       10.2.2-1xenial  amd64  metadata server for the ceph distributed file system
ii  ceph-mon       10.2.2-1xenial  amd64  monitor server for the ceph storage system
ii  ceph-osd       10.2.2-1xenial  amd64  OSD server for the ceph storage system
ii  libcephfs1     10.2.2-1xenial  amd64  Ceph distributed file system client library
ii  python-cephfs  10.2.2-1xenial  amd64  Python libraries for the Ceph libcephfs library
Running ceph-deploy mon create-initial results in:
[2016-06-24 16:03:45,628][ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephir/.cephdeploy.conf
[2016-06-24 16:03:45,629][ceph_deploy.cli][INFO ] Invoked (1.5.34): /usr/bin/ceph-deploy mon create-initial
[2016-06-24 16:03:45,629][ceph_deploy.cli][INFO ] ceph-deploy options:
[2016-06-24 16:03:45,629][ceph_deploy.cli][INFO ]  username                      : None
[2016-06-24 16:03:45,629][ceph_deploy.cli][INFO ]  verbose                       : False
[2016-06-24 16:03:45,629][ceph_deploy.cli][INFO ]  overwrite_conf                : False
[2016-06-24 16:03:45,630][ceph_deploy.cli][INFO ]  subcommand                    : create-initial
[2016-06-24 16:03:45,630][ceph_deploy.cli][INFO ]  quiet                         : False
[2016-06-24 16:03:45,630][ceph_deploy.cli][INFO ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7fe6fd4decf8>
[2016-06-24 16:03:45,630][ceph_deploy.cli][INFO ]  cluster                       : ceph
[2016-06-24 16:03:45,630][ceph_deploy.cli][INFO ]  func                          : <function mon at 0x7fe6fd4bf140>
[2016-06-24 16:03:45,630][ceph_deploy.cli][INFO ]  ceph_conf                     : None
[2016-06-24 16:03:45,631][ceph_deploy.cli][INFO ]  keyrings                      : None
[2016-06-24 16:03:45,631][ceph_deploy.cli][INFO ]  default_release               : False
[2016-06-24 16:03:45,632][ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-mon-01
[2016-06-24 16:03:45,632][ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon-01 ...
[2016-06-24 16:03:46,037][ceph-mon-01][DEBUG ] connection detected need for sudo
[2016-06-24 16:03:46,404][ceph-mon-01][DEBUG ] connected to host: ceph-mon-01
[2016-06-24 16:03:46,405][ceph-mon-01][DEBUG ] detect platform information from remote host
[2016-06-24 16:03:46,467][ceph-mon-01][DEBUG ] detect machine type
[2016-06-24 16:03:46,474][ceph-mon-01][DEBUG ] find the location of an executable
[2016-06-24 16:03:46,475][ceph_deploy.mon][INFO ] distro info: Ubuntu 16.04 xenial
[2016-06-24 16:03:46,476][ceph-mon-01][DEBUG ] determining if provided host has same hostname in remote
[2016-06-24 16:03:46,476][ceph-mon-01][DEBUG ] get remote short hostname
[2016-06-24 16:03:46,477][ceph-mon-01][DEBUG ] deploying mon to ceph-mon-01
[2016-06-24 16:03:46,478][ceph-mon-01][DEBUG ] get remote short hostname
[2016-06-24 16:03:46,479][ceph-mon-01][DEBUG ] remote hostname: ceph-mon-01
[2016-06-24 16:03:46,482][ceph-mon-01][DEBUG ] write cluster configuration to /etc/ceph/{cluster}.conf
[2016-06-24 16:03:46,485][ceph-mon-01][DEBUG ] create the mon path if it does not exist
[2016-06-24 16:03:46,487][ceph-mon-01][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon-01/done
[2016-06-24 16:03:46,488][ceph-mon-01][DEBUG ] create a done file to avoid re-doing the mon deployment
[2016-06-24 16:03:46,489][ceph-mon-01][DEBUG ] create the init path if it does not exist
[2016-06-24 16:03:46,493][ceph-mon-01][INFO ] Running command: sudo systemctl enable ceph.target
[2016-06-24 16:03:46,617][ceph-mon-01][INFO ] Running command: sudo systemctl enable ceph-mon@ceph-mon-01
[2016-06-24 16:03:46,788][ceph-mon-01][INFO ] Running command: sudo systemctl start ceph-mon@ceph-mon-01
[2016-06-24 16:03:48,814][ceph-mon-01][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon-01.asok mon_status
[2016-06-24 16:03:48,980][ceph-mon-01][DEBUG ] ********************************************************************************
[2016-06-24 16:03:48,980][ceph-mon-01][DEBUG ] status for monitor: mon.ceph-mon-01
[2016-06-24 16:03:48,981][ceph-mon-01][DEBUG ] {
[2016-06-24 16:03:48,981][ceph-mon-01][DEBUG ]   "election_epoch": 3,
[2016-06-24 16:03:48,981][ceph-mon-01][DEBUG ]   "extra_probe_peers": [
[2016-06-24 16:03:48,981][ceph-mon-01][DEBUG ]     "172.29.50.231:6789/0"
[2016-06-24 16:03:48,981][ceph-mon-01][DEBUG ]   ],
[2016-06-24 16:03:48,981][ceph-mon-01][DEBUG ]   "monmap": {
[2016-06-24 16:03:48,982][ceph-mon-01][DEBUG ]     "created": "2016-06-24 16:02:37.367266",
[2016-06-24 16:03:48,982][ceph-mon-01][DEBUG ]     "epoch": 1,
[2016-06-24 16:03:48,982][ceph-mon-01][DEBUG ]     "fsid": "76849e1f-1002-4add-ab2e-a8da7d163ed0",
[2016-06-24 16:03:48,982][ceph-mon-01][DEBUG ]     "modified": "2016-06-24 16:02:37.367266",
[2016-06-24 16:03:48,982][ceph-mon-01][DEBUG ]     "mons": [
[2016-06-24 16:03:48,982][ceph-mon-01][DEBUG ]       {
[2016-06-24 16:03:48,983][ceph-mon-01][DEBUG ]         "addr": "172.28.50.231:6789/0",
[2016-06-24 16:03:48,983][ceph-mon-01][DEBUG ]         "name": "ceph-mon-01",
[2016-06-24 16:03:48,983][ceph-mon-01][DEBUG ]         "rank": 0
[2016-06-24 16:03:48,983][ceph-mon-01][DEBUG ]       }
[2016-06-24 16:03:48,983][ceph-mon-01][DEBUG ]     ]
[2016-06-24 16:03:48,983][ceph-mon-01][DEBUG ]   },
[2016-06-24 16:03:48,984][ceph-mon-01][DEBUG ]   "name": "ceph-mon-01",
[2016-06-24 16:03:48,984][ceph-mon-01][DEBUG ]   "outside_quorum": [],
[2016-06-24 16:03:48,984][ceph-mon-01][DEBUG ]   "quorum": [
[2016-06-24 16:03:48,984][ceph-mon-01][DEBUG ]     0
[2016-06-24 16:03:48,984][ceph-mon-01][DEBUG ]   ],
[2016-06-24 16:03:48,984][ceph-mon-01][DEBUG ]   "rank": 0,
[2016-06-24 16:03:48,985][ceph-mon-01][DEBUG ]   "state": "leader",
[2016-06-24 16:03:48,985][ceph-mon-01][DEBUG ]   "sync_provider": []
[2016-06-24 16:03:48,985][ceph-mon-01][DEBUG ] }
[2016-06-24 16:03:48,985][ceph-mon-01][DEBUG ] ********************************************************************************
[2016-06-24 16:03:48,985][ceph-mon-01][INFO ] monitor: mon.ceph-mon-01 is running
[2016-06-24 16:03:48,988][ceph-mon-01][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon-01.asok mon_status
[2016-06-24 16:03:49,154][ceph_deploy.mon][INFO ] processing monitor mon.ceph-mon-01
[2016-06-24 16:03:49,557][ceph-mon-01][DEBUG ] connection detected need for sudo
[2016-06-24 16:03:49,932][ceph-mon-01][DEBUG ] connected to host: ceph-mon-01
[2016-06-24 16:03:49,933][ceph-mon-01][DEBUG ] detect platform information from remote host
[2016-06-24 16:03:49,997][ceph-mon-01][DEBUG ] detect machine type
[2016-06-24 16:03:50,004][ceph-mon-01][DEBUG ] find the location of an executable
[2016-06-24 16:03:50,009][ceph-mon-01][INFO ] Running command: sudo ceph --cluster=ceph --admin-daemon /var/run/ceph/ceph-mon.ceph-mon-01.asok mon_status
[2016-06-24 16:03:50,176][ceph_deploy.mon][INFO ] mon.ceph-mon-01 monitor has reached quorum!
[2016-06-24 16:03:50,176][ceph_deploy.mon][INFO ] all initial monitors are running and have formed quorum
[2016-06-24 16:03:50,176][ceph_deploy.mon][INFO ] Running gatherkeys...
[2016-06-24 16:03:50,178][ceph_deploy.gatherkeys][INFO ] Storing keys in temp directory /tmp/tmpkoz_QK
[2016-06-24 16:03:50,533][ceph-mon-01][DEBUG ] connection detected need for sudo
[2016-06-24 16:03:50,888][ceph-mon-01][DEBUG ] connected to host: ceph-mon-01
[2016-06-24 16:03:50,889][ceph-mon-01][DEBUG ] detect platform information from remote host
[2016-06-24 16:03:50,949][ceph-mon-01][DEBUG ] detect machine type
[2016-06-24 16:03:50,955][ceph-mon-01][DEBUG ] get remote short hostname
[2016-06-24 16:03:50,957][ceph-mon-01][DEBUG ] fetch remote file
[2016-06-24 16:03:50,960][ceph-mon-01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --admin-daemon=/var/run/ceph/ceph-mon.ceph-mon-01.asok mon_status
[2016-06-24 16:03:51,129][ceph-mon-01][INFO ] Running command: sudo /usr/bin/ceph --connect-timeout=25 --cluster=ceph --name mon. --keyring=/var/lib/ceph/mon/ceph-ceph-mon-01/keyring auth get-or-create client.admin osd allow * mds allow * mon allow *
[2016-06-24 16:04:16,286][ceph-mon-01][ERROR ] "ceph auth get-or-create for keytype admin returned 1
[2016-06-24 16:04:16,286][ceph-mon-01][DEBUG ] 2016-06-24 16:03:51.257412 7f8293d22700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f8298059be0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8298059a20).fault
[2016-06-24 16:04:16,286][ceph-mon-01][DEBUG ] 2016-06-24 16:03:54.257802 7f8293c21700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f8288000cc0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8288002000).fault
[2016-06-24 16:04:16,286][ceph-mon-01][DEBUG ] 2016-06-24 16:03:57.259150 7f8293d22700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f82880052c0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f82880065a0).fault
[2016-06-24 16:04:16,286][ceph-mon-01][DEBUG ] 2016-06-24 16:04:00.258997 7f8293c21700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f8288000cc0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f82880024d0).fault
[2016-06-24 16:04:16,287][ceph-mon-01][DEBUG ] 2016-06-24 16:04:03.259256 7f8293d22700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f82880052c0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8288002ff0).fault
[2016-06-24 16:04:16,287][ceph-mon-01][DEBUG ] 2016-06-24 16:04:06.259529 7f8293c21700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f8288000cc0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8288003610).fault
[2016-06-24 16:04:16,287][ceph-mon-01][DEBUG ] 2016-06-24 16:04:09.260115 7f8293d22700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f82880052c0 sd=6 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8288004340).fault
[2016-06-24 16:04:16,287][ceph-mon-01][DEBUG ] 2016-06-24 16:04:12.260518 7f8293c21700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f8288000cc0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8288008fd0).fault
[2016-06-24 16:04:16,287][ceph-mon-01][DEBUG ] 2016-06-24 16:04:15.260927 7f8293d22700  0 -- :/2431738263 >> 172.29.50.231:6789/0 pipe(0x7f82880052c0 sd=5 :0 s=1 pgs=0 cs=0 l=1 c=0x7f8288009bb0).fault
[2016-06-24 16:04:16,287][ceph-mon-01][DEBUG ] Traceback (most recent call last):
[2016-06-24 16:04:16,287][ceph-mon-01][DEBUG ]   File "/usr/bin/ceph", line 948, in <module>
[2016-06-24 16:04:16,288][ceph-mon-01][DEBUG ]     retval = main()
[2016-06-24 16:04:16,288][ceph-mon-01][DEBUG ]   File "/usr/bin/ceph", line 852, in main
[2016-06-24 16:04:16,288][ceph-mon-01][DEBUG ]     prefix='get_command_descriptions')
[2016-06-24 16:04:16,288][ceph-mon-01][DEBUG ]   File "/usr/lib/python2.7/dist-packages/ceph_argparse.py", line 1291, in json_command
[2016-06-24 16:04:16,288][ceph-mon-01][DEBUG ]     raise RuntimeError('"{0}": exception {1}'.format(argdict, e))
[2016-06-24 16:04:16,288][ceph-mon-01][DEBUG ] RuntimeError: "None": exception "['{"prefix": "get_command_descriptions"}']": exception You cannot perform that operation on a Rados object in state configuring.
[2016-06-24 16:04:16,290][ceph_deploy.gatherkeys][ERROR ] Failed to connect to host:ceph-mon-01
[2016-06-24 16:04:16,290][ceph_deploy.gatherkeys][INFO ] Destroy temp directory /tmp/tmpkoz_QK
[2016-06-24 16:04:16,290][ceph_deploy][ERROR ] RuntimeError: Failed to connect any mon
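For context on the error message itself: the ceph CLI talks to the cluster through a Rados handle that moves through connection states, and issuing a command while the handle is still "configuring" raises an exception rather than blocking until the connection completes. The following is a minimal, self-contained sketch of that kind of state gate; the class and method names are hypothetical, not the actual librados binding code.

```python
# Illustration of a connection-state gate that produces a
# "Rados object in state configuring" style error. All names here
# are hypothetical stand-ins for the real Python rados binding.

class RadosStateError(Exception):
    pass

class RadosHandle:
    def __init__(self):
        # A freshly created handle has been configured but not connected.
        self.state = "configuring"

    def connect(self):
        # The real client negotiates with the monitors here, which takes
        # time; this sketch just flips the state flag.
        self.state = "connected"

    def _require_state(self, *allowed):
        if self.state not in allowed:
            raise RadosStateError(
                "You cannot perform that operation on a Rados object "
                "in state %s." % self.state)

    def mon_command(self, cmd):
        # Commands are only legal on a connected handle.
        self._require_state("connected")
        return 0, "ok"

h = RadosHandle()
try:
    # Calling before connect() finishes reproduces the error in the log.
    h.mon_command('{"prefix": "get_command_descriptions"}')
except RadosStateError as e:
    print(e)

h.connect()
print(h.mon_command('{"prefix": "get_command_descriptions"}'))
```

The race in this ticket is along those lines: the CLI issues get_command_descriptions before the handle has left the configuring state, so the command fails even though the monitor is healthy.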
Updated by Loïc Dachary over 7 years ago
- File log.gz log.gz added
- Subject changed from Fresh install of Jewel 10.2.2 fails on Ubuntu 1604 due to #16379 to ceph cli: Rados object in state configuring race
- Status changed from New to 12
I think this is a race condition. It happened during a test as well and is apparently rare.
/home/jenkins-build/build/workspace/ceph-pull-requests/qa/workunits/ceph-helpers.sh:217: test_kill_daemon: ceph --connect-timeout 60 status
ceph-mon: mon.noname-a 127.0.0.1:7109/0 is local, renaming to mon.a
ceph-mon: set fsid to 4678a81b-ece6-4d52-a5e1-5bbc64007ea4
Traceback (most recent call last):
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/ceph", line 949, in <module>
    retval = main()
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/build/bin/ceph", line 853, in main
    prefix='get_command_descriptions')
  File "/home/jenkins-build/build/workspace/ceph-pull-requests/src/pybind/ceph_argparse.py", line 1312, in json_command
    raise RuntimeError('"{0}": exception {1}'.format(argdict, e))
RuntimeError: "None": exception "['{"prefix": "get_command_descriptions"}']": exception You cannot perform that operation on a Rados object in state configuring.
/home/jenkins-build/build/workspace/ceph-pull-requests/qa/workunits/ceph-helpers.sh:220: test_kill_daemon: teardown testdir/ceph-helpers
/home/jenkins-build/build/workspace/ceph-pull-requests/qa/workunits/ceph-helpers.sh:118: teardown: local dir=testdir/ceph-helpers
/home/jenkins-build/build/workspace/ceph-pull-requests/qa/workunits/ceph-helpers.sh:119: teardown: kill_daemons testdir/ceph-helpers KILL
//home/jenkins-build/build/workspace/ceph-pull-requests/qa/workunits/ceph-helpers.sh:252: kill_daemons: shopt -q -o xtrace
When it happens, the ceph command appears to exit with a success status even though it should report an error, but that's a detail.
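Since the failure is a rare, transient race, one way a caller can cope is to retry the command a few times before giving up. Below is a sketch of such a retry wrapper; the helper name and the flaky example function are hypothetical and not part of ceph-deploy or the ceph CLI.

```python
import time

def retry_transient(fn, attempts=5, delay=1.0, transient=(RuntimeError,)):
    """Call fn(), retrying when it raises an exception we treat as
    transient (e.g. the 'Rados object in state configuring' error).
    Re-raises after the last attempt."""
    for i in range(attempts):
        try:
            return fn()
        except transient:
            if i == attempts - 1:
                raise
            time.sleep(delay)

# Example: a call that fails twice with a transient error, then succeeds.
calls = {"n": 0}

def flaky_mon_command():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError(
            "You cannot perform that operation on a Rados object "
            "in state configuring.")
    return "ok"

print(retry_transient(flaky_mon_command, attempts=5, delay=0.0))  # prints "ok"
```

A wrapper like this only masks the symptom, of course; the fix referenced in this ticket addresses the race in the client itself.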
Updated by Loïc Dachary over 7 years ago
- Status changed from 12 to Fix Under Review
Updated by Kefu Chai over 7 years ago
- Status changed from Fix Under Review to Pending Backport
- Assignee set to Loïc Dachary
Updated by Nathan Cutler over 7 years ago
- Copied to Backport #17385: jewel: ceph cli: Rados object in state configuring race added
Updated by Nathan Cutler over 6 years ago
- Status changed from Pending Backport to Resolved