Support #12930

Teuthology Error [I started worker_start plana 1 in one window, then ran teuthology-suite in another window; the worker picked up the suite's jobs and returned an error without executing them across the targets]

Added by sivaram ravipati over 8 years ago. Updated over 7 years ago.

Status: Closed
Priority: Urgent
Assignee: -
Category: -
% Done: 0%
Tags:
Reviewed:
Affected Versions:

Description

Prerequisites:

1) Installed Ubuntu servers with username ubuntu.

2) Created the following nodes:

   a. one ceph-teuthology node
   b. three ceph target nodes
   c. one paddles/pulpito node

    Host Name        IP (Static)    UserName
1)  teuthology-node  192.168.2.103  ubuntu
2)  target1          192.168.2.101  ubuntu
3)  target2          192.168.2.102  ubuntu
4)  target3          192.168.2.104  ubuntu
5)  paddles1         192.168.2.105  ubuntu

3) Edited /etc/hosts (In ceph-teuthology node)

127.0.0.1 localhost
127.0.1.1 teuthology-node
192.168.2.103 teuthology-node
192.168.2.101 target1.front.sepia.ceph.com target1
192.168.2.102 target2.front.sepia.ceph.com target2
192.168.2.104 target3.front.sepia.ceph.com target3
192.168.2.106 paddles1.front.sepia.ceph.com paddles1

target1, target2, target3 are names of the target nodes.
Used front.sepia.ceph.com as domain name
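
The node table in step 2 and the /etc/hosts entries in step 3 can be cross-checked mechanically. The sketch below is not part of teuthology: parse_hosts and find_mismatches are hypothetical helper names, and the inventory and hosts text are copied from this report. With that data it flags paddles1, which the table lists as 192.168.2.105 while /etc/hosts maps it to 192.168.2.106.

```python
# Hypothetical helpers (not part of teuthology) that compare a node
# inventory against /etc/hosts-style text. Data is copied from this report.

INVENTORY = {
    "teuthology-node": "192.168.2.103",
    "target1": "192.168.2.101",
    "target2": "192.168.2.102",
    "target3": "192.168.2.104",
    "paddles1": "192.168.2.105",
}

HOSTS_TEXT = """\
127.0.0.1 localhost
127.0.1.1 teuthology-node
192.168.2.103 teuthology-node
192.168.2.101 target1.front.sepia.ceph.com target1
192.168.2.102 target2.front.sepia.ceph.com target2
192.168.2.104 target3.front.sepia.ceph.com target3
192.168.2.106 paddles1.front.sepia.ceph.com paddles1
"""


def parse_hosts(text):
    """Map each hostname or alias to the list of addresses it appears under."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments and blank lines
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            table.setdefault(name, []).append(address)
    return table


def find_mismatches(inventory, hosts):
    """Return {name: (expected_ip, ips_in_hosts)} where the expected IP is missing."""
    problems = {}
    for name, expected in inventory.items():
        found = hosts.get(name, [])
        if expected not in found:
            problems[name] = (expected, found)
    return problems


for name, (expected, found) in find_mismatches(INVENTORY, parse_hosts(HOSTS_TEXT)).items():
    print(f"{name}: table says {expected}, /etc/hosts says {found}")
```

Which of the two paddles1 addresses is the typo cannot be decided from the report alone; the later curl and config edits all use 192.168.2.106.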

4) Made sure that the paddles and pulpito services are running on their respective ports

(virtualenv)pulpito@paddles1:~/pulpito$ sudo supervisorctl status pulpito
pulpito RUNNING pid 4589, uptime 4:27:01

(virtualenv)pulpito@paddles1:~/pulpito$ sudo supervisorctl status paddles
paddles RUNNING pid 4157, uptime 0:04:51

5) Configured teuthology.yaml on teuthology-node under /etc/

lab_domain: front.sepia.ceph.com
lock_server: http://paddles1.front.sepia.ceph.com:8080
results_server: http://paddles1.front.sepia.ceph.com:8080
results_ui_server: http://paddles1.front.sepia.ceph.com:8081
queue_host: localhost
queue_port: 11300
results_email:
archive_base: /home/teuthworker/archive

teuthworker@teuthology-node:~$ ls
archive bin src

teuthworker@teuthology-node:~$ cd bin

teuthworker@teuthology-node:~/bin$ ls
worker_start

teuthworker@teuthology-node:~/bin$ chmod +x worker_start

teuthworker@teuthology-node:~$ vi ~/.teuthology.yaml

lab_domain: front.sepia.ceph.com
lock_server: http://paddles1.front.sepia.ceph.com:8080
results_server: http://paddles1.front.sepia.ceph.com:8080
results_ui_server: http://paddles1.front.sepia.ceph.com:8081
queue_host: localhost
queue_port: 11300
results_email:
archive_base: /home/teuthworker/archive

------>>>>Checking that the paddles machine is reachable from the teuthology node, to get the targets info indirectly.

$ curl -X GET http://192.168.2.106:8080/nodes/

[]
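
The empty list above suggests that paddles had no nodes registered at that point. A minimal sketch of the same check in Python follows. fetch_nodes and count_nodes are hypothetical helpers, and the machine_type field name is an assumption about the shape of a paddles node record that this report does not confirm.

```python
import json
from urllib.request import urlopen


def fetch_nodes(base_url, timeout=5):
    """GET <base_url>/nodes/ and return the decoded JSON body."""
    with urlopen(base_url.rstrip("/") + "/nodes/", timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def count_nodes(nodes, machine_type):
    """Count node records whose (assumed) 'machine_type' field matches."""
    return sum(
        1
        for node in nodes
        if isinstance(node, dict) and node.get("machine_type") == machine_type
    )


# The body returned by the curl call in this report was "[]":
nodes = json.loads("[]")
print(f"{len(nodes)} nodes registered, {count_nodes(nodes, 'plana')} of type plana")
```

Against a live server the list would come from fetch_nodes("http://192.168.2.106:8080") instead of the literal string.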

Running the Test
-----------------

----------->>>>In the first window, started the teuthology-worker from teuthworker user of teuthology-node

teuthworker@teuthology-node:~/bin$ worker_start plana 1
plana 1
Starting 1 workers for plana
teuthworker@teuthology-node:~/bin$ 2015-08-24 12:45:30,485.485 INFO:teuthology.repo_utils:Fetching from upstream into /home/teuthworker/src/teuthology_master
2015-08-24 12:45:32,243.243 INFO:teuthology.repo_utils:Resetting repo at /home/teuthworker/src/teuthology_master to branch master
2015-08-24 12:45:32,282.282 INFO:teuthology.repo_utils:Bootstrapping /home/teuthworker/src/teuthology_master
2015-08-24 12:45:38,207.207 INFO:teuthology.repo_utils:Bootstrap exited with status 0
2015-08-24 12:45:38,208.208 INFO:teuthology.repo_utils:Cloning https://github.com/ceph/ceph-qa-suite.git master from upstream
2015-08-24 12:45:46,749.749 INFO:teuthology.repo_utils:Resetting repo at /home/teuthworker/src/ceph-qa-suite_master to branch master

---->>>>>>In second window, started running teuthology-suite from teuthology-node

ubuntu@teuthology-node:~/teuthology$ ./virtualenv/bin/teuthology-suite -s teuthology -c master -e -f basic -t cuttlefish -m plana -k distro -p 1000

2015-08-24 14:37:52,706.706 INFO:teuthology.suite:kernel sha1: distro
2015-08-24 14:37:53,558.558 INFO:teuthology.suite:ceph sha1: 8b6208377f65c6ad5df0d5d3d1ac93fa351aff39
2015-08-24 14:37:54,379.379 INFO:teuthology.suite:ceph version: v9.0.3-1359.g8b62083
2015-08-24 14:37:55,451.451 INFO:teuthology.suite:teuthology branch: cuttlefish
2015-08-24 14:37:56,474.474 INFO:teuthology.suite:ceph-qa-suite branch: master
2015-08-24 14:37:56,476.476 INFO:teuthology.repo_utils:Fetching from upstream into /home/ubuntu/src/ceph-qa-suite_master
2015-08-24 14:37:58,244.244 INFO:teuthology.repo_utils:Resetting repo at /home/ubuntu/src/ceph-qa-suite_master to branch master
2015-08-24 14:37:58,300.300 WARNING:teuthology.suite:No machines found with machine_type plana!
2015-08-24 14:37:58,305.305 INFO:teuthology.suite:Suite teuthology in /home/ubuntu/src/ceph-qa-suite_master/suites/teuthology generated 25 jobs (n
ed out.
Job scheduled with name ubuntu-2015-08-24_14:37:52-teuthology-master-distro-basic-plana and ID 68
2015-08-24 14:38:26,919.919 INFO:teuthology.suite:Test results viewable at http://192.168.2.106:8081/ubuntu-2015-08-24_14:37:52-teuthology-master-distro-basic-plana/

---->>>>Switched back to the first window (teuthworker user on teuthology-node) to observe how teuthology-worker picks up the jobs from the suite and executes them across the targets

teuthworker@teuthology-node:~$ worker_start plana 1
plana 1
Starting 1 workers for plana
teuthworker@teuthology-node:~$ 2015-08-20 23:47:13,344.344 INFO:teuthology.repo_utils:Fetching from upstream into /home/teuthworker/src/teuthology_master
2015-08-20 23:47:15,024.024 INFO:teuthology.repo_utils:Resetting repo at /home/teuthworker/src/teuthology_master to branch master
2015-08-20 23:47:15,032.032 INFO:teuthology.repo_utils:Bootstrapping /home/teuthworker/src/teuthology_master
INFO:teuthology.run_tasks:Running task internal.lock_machines...
INFO:teuthology.task.internal:Locking machines...
ERROR:teuthology.run_tasks:Saw exception from tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/run_tasks.py", line 27, in run_tasks
    manager.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/task/internal.py", line 56, in lock_machines
    machines = lock.list_locks(ctx)
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/lock.py", line 58, in list_locks
    success, content, _ = ls.send_request('GET', ls._lock_url(ctx))
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/lockstatus.py", line 12, in send_request
    resp, content = http.request(url, method=method, body=body, headers=headers)
  File "/home/teuthworker/src/teuthology_cuttlefish/virtualenv/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1608, in request
    (response, content) = self._request(conn, authority, uri, request_uri, method, body, headers, redirections, cachekey)
  File "/home/teuthworker/src/teuthology_cuttlefish/virtualenv/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1350, in _request
    (response, content) = self._conn_request(conn, request_uri, method, body, headers)
  File "/home/teuthworker/src/teuthology_cuttlefish/virtualenv/local/lib/python2.7/site-packages/httplib2/__init__.py", line 1278, in _conn_request
    raise ServerNotFoundError("Unable to find the server at %s" % conn.host)

ServerNotFoundError: Unable to find the server at paddlessiva.front.sepia.ceph.com

DEBUG:teuthology.run_tasks:Exception was not quenched, exiting: ServerNotFoundError: Unable to find the server at paddlessiva.front.sepia.ceph.com
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_cuttlefish/virtualenv/bin/teuthology", line 9, in <module>
    load_entry_point('teuthology==0.0.1', 'console_scripts', 'teuthology')()
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/run.py", line 186, in main
    nuke(ctx, log, ctx.lock)
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/nuke.py", line 376, in nuke
    for target, hostkey in ctx.config['targets'].iteritems():

KeyError: 'targets'
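
The KeyError looks like a follow-on failure rather than a separate problem: the cleanup path indexes ctx.config['targets'], but lock_machines had already failed, so the job config apparently never received a targets entry. A minimal sketch of that pattern follows, using a plain dict as a stand-in for the job config. The function names are hypothetical and this is not the teuthology code.

```python
def targets_strict(config):
    """Mirrors the failing access pattern: raises KeyError if 'targets' is absent."""
    return list(config["targets"].items())


def targets_tolerant(config):
    """Same lookup, but treats a missing 'targets' key as 'nothing to clean up'."""
    return list(config.get("targets", {}).items())


# Stand-in for a job config after locking failed: no 'targets' key was added.
job_config = {"machine_type": "plana", "nuke-on-error": True}

try:
    targets_strict(job_config)
except KeyError as exc:
    print("strict lookup failed with KeyError:", exc)

print("tolerant lookup:", targets_tolerant(job_config))
```

Read this way, fixing the first error in each run should make the KeyError disappear as well.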

---------->>>>>>ISSUE1:-

--->>>>Teuthology-worker could not resolve the FQDN of the paddles server, even though both the IP address and the FQDN of the paddles server respond to ping from teuthology-node. It gave the following errors, as shown above.

1)ServerNotFoundError: Unable to find the server at paddlessiva.front.sepia.ceph.com

2)KeyError: 'targets'
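
One detail worth checking for ISSUE1: the hostname in the error, paddlessiva.front.sepia.ceph.com, is not the paddles1.front.sepia.ceph.com host configured in the teuthology.yaml shown earlier. The sketch below extracts the lock_server host from a config and compares it with the host named in the error. lock_server_host is a hypothetical helper that parses only flat "key: value" lines, not full YAML.

```python
from urllib.parse import urlparse

# Copied from the teuthology.yaml in this report (first three keys only).
CONFIG_TEXT = """\
lab_domain: front.sepia.ceph.com
lock_server: http://paddles1.front.sepia.ceph.com:8080
results_server: http://paddles1.front.sepia.ceph.com:8080
"""


def lock_server_host(config_text):
    """Return the hostname of the lock_server URL, or None if the key is absent."""
    for line in config_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "lock_server":
            return urlparse(value.strip()).hostname
    return None


configured = lock_server_host(CONFIG_TEXT)
reported = "paddlessiva.front.sepia.ceph.com"  # host named in the ServerNotFoundError

print("configured lock_server host:", configured)
print("same as host in error:", configured == reported)
```

If the two differ, the worker may have been reading a different configuration than the one that was edited. That is a guess; the report does not show which file the worker actually loaded.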

---------->>>>>>I then replaced the FQDN of the paddles server with its IP address in the following places.---->>>>

1)In Teuthology-Node

ubuntu@teuthology-node:/etc$ sudo vi teuthology.yaml
(or)
ubuntu@teuthology-node:/etc$ sudo vi ~/.teuthology.yaml
#lab_domain: front.sepia.ceph.com
lock_server: http://192.168.2.106:8080
results_server: http://192.168.2.106:8080
results_ui_server: http://192.168.2.106:8081
queue_host: localhost
queue_port: 11300
results_email:
archive_base: /home/teuthworker/archive

teuthworker@teuthology-node:~$ sudo vi ~/.teuthology.yaml
#lab_domain: front.sepia.ceph.com
lock_server: http://192.168.2.106:8080
results_server: http://192.168.2.106:8080
results_ui_server: http://192.168.2.106:8081
queue_host: localhost
queue_port: 11300
results_email:
archive_base: /home/teuthworker/archive

2)In Paddles/pulpito machine

paddles@paddles1:~/paddles$ vi config.py
server = {
'port': '8080',
'host': '192.168.2.106'
}
address = 'http://192.168.2.106:8080'

pulpito@paddles1:~/pulpito$ vi prod.py

server = {
'port': 8081,
'host': '192.168.2.106'
}
paddles_address = 'http://192.168.2.106:8080'

-->>----------->>>>In the first window, started the teuthology-worker from teuthworker user of teuthology-node

teuthworker@teuthology-node:~/bin$ worker_start plana 1
plana 1
Starting 1 workers for plana
teuthworker@teuthology-node:~/bin$ 2015-08-24 12:45:30,485.485 INFO:teuthology.repo_utils:Fetching from upstream into /home/teuthworker/src/teuthology_master
2015-08-24 12:45:32,243.243 INFO:teuthology.repo_utils:Resetting repo at /home/teuthworker/src/teuthology_master to branch master
2015-08-24 12:45:32,282.282 INFO:teuthology.repo_utils:Bootstrapping /home/teuthworker/src/teuthology_master
2015-08-24 12:45:38,207.207 INFO:teuthology.repo_utils:Bootstrap exited with status 0
2015-08-24 12:45:38,208.208 INFO:teuthology.repo_utils:Cloning https://github.com/ceph/ceph-qa-suite.git master from upstream
2015-08-24 12:45:46,749.749 INFO:teuthology.repo_utils:Resetting repo at /home/teuthworker/src/ceph-qa-suite_master to branch master

--->>>In second window, started running teuthology-suite from teuthology-node

ubuntu@teuthology-node:~/teuthology$ ./virtualenv/bin/teuthology-suite -s hadoop -c master -e -f basic -t cuttlefish -m plana -k distro -p 1000

2015-08-24 12:48:33,167.167 INFO:teuthology.suite:kernel sha1: distro
2015-08-24 12:48:35,173.173 INFO:teuthology.suite:ceph sha1: 8b6208377f65c6ad5df0d5d3d1ac93fa351aff39
2015-08-24 12:48:36,462.462 INFO:teuthology.suite:ceph version: v9.0.3-1359.g8b62083
2015-08-24 12:48:37,817.817 INFO:teuthology.suite:teuthology branch: cuttlefish
2015-08-24 12:48:38,811.811 INFO:teuthology.suite:ceph-qa-suite branch: master
2015-08-24 12:48:38,813.813 INFO:teuthology.repo_utils:Fetching from upstream into /home/ubuntu/src/ceph-qa-suite_master
2015-08-24 12:48:42,698.698 INFO:teuthology.repo_utils:Resetting repo at /home/ubuntu/src/ceph-qa-suite_master to branch master
2015-08-24 12:48:42,752.752 WARNING:teuthology.suite:No machines found with machine_type plana!
2015-08-24 12:48:42,753.753 INFO:teuthology.suite:Suite hadoop in /home/ubuntu/src/ceph-qa-suite_master/suites/hadoop generated 3 jobs (not yet filtered)
2015-08-24 12:48:42,767.767 INFO:teuthology.suite:Scheduling hadoop/basic/{clusters/fixed-3.yaml tasks/repl.yaml}
Job scheduled with name ubuntu-2015-08-24_12:48:33-hadoop-master-distro-basic-plana and ID 39
2015-08-24 12:48:43,849.849 INFO:teuthology.suite:Scheduling hadoop/basic/{clusters/fixed-3.yaml tasks/terasort.yaml}
Job scheduled with name ubuntu-2015-08-24_12:48:33-hadoop-master-distro-basic-plana and ID 40
2015-08-24 12:48:44,916.916 INFO:teuthology.suite:Scheduling hadoop/basic/{clusters/fixed-3.yaml tasks/wordcount.yaml}
Job scheduled with name ubuntu-2015-08-24_12:48:33-hadoop-master-distro-basic-plana and ID 41
2015-08-24 12:48:46,005.005 INFO:teuthology.suite:Suite hadoop in /home/ubuntu/src/ceph-qa-suite_master/suites/hadoop scheduled 3 jobs.
2015-08-24 12:48:46,006.006 INFO:teuthology.suite:Suite hadoop in /home/ubuntu/src/ceph-qa-suite_master/suites/hadoop -- 0 jobs were filtered out.
Job scheduled with name ubuntu-2015-08-24_12:48:33-hadoop-master-distro-basic-plana and ID 42

2015-08-24 12:48:47,081.081 INFO:teuthology.suite:Test results viewable at http://192.168.2.106:8081/ubuntu-2015-08-24_12:48:33-hadoop-master-distro-basic-plana/

---->>>>Switched back to the first window (teuthworker user on teuthology-node) to observe how teuthology-worker picks up the jobs from the suite and executes them across the targets

teuthworker@teuthology-node:~/bin$ worker_start plana 1
plana 1
Starting 1 workers for plana
teuthworker@teuthology-node:~/bin$ 2015-08-24 12:45:30,485.485 INFO:teuthology.repo_utils:Fetching from upstream into /home/teuthworker/src/teuthology_master
2015-08-24 12:45:32,243.243 INFO:teuthology.repo_utils:Resetting repo at /home/teuthworker/src/teuthology_master to branch master
2015-08-24 12:45:32,282.282 INFO:teuthology.repo_utils:Bootstrapping /home/teuthworker/src/teuthology_master
2015-08-24 12:45:38,207.207 INFO:teuthology.repo_utils:Bootstrap exited with status 0
2015-08-24 12:45:38,208.208 INFO:teuthology.repo_utils:Cloning https://github.com/ceph/ceph-qa-suite.git master from upstream
2015-08-24 12:45:46,749.749 INFO:teuthology.repo_utils:Resetting repo at /home/teuthworker/src/ceph-qa-suite_master to branch master

2015-08-24 12:48:44,840.840 INFO:teuthology.worker:Reserved job 40
2015-08-24 12:48:44,840.840 INFO:teuthology.worker:Config is: branch: master
description: hadoop/basic/{clusters/fixed-3.yaml tasks/terasort.yaml}
email:
kernel: {kdb: true, sha1: distro}
last_in_suite: false
machine_type: plana
name: ubuntu-2015-08-24_12:48:33-hadoop-master-distro-basic-plana
nuke-on-error: true
overrides:
  admin_socket: {branch: master}
  ceph:
    conf:
      mon: {debug mon: 20, debug ms: 1, debug paxos: 20}
      osd: {debug filestore: 20, debug journal: 20, debug ms: 1, debug osd: 20}
    log-whitelist: [slow request]
    sha1: 8b6208377f65c6ad5df0d5d3d1ac93fa351aff39
  ceph-deploy:
    branch: {dev: master}
    conf:
      client: {log file: /var/log/ceph/ceph-$name.$pid.log}
      mon: {debug mon: 1, debug ms: 20, debug paxos: 20, osd default pool size: 2}
  install:
    ceph: {sha1: 8b6208377f65c6ad5df0d5d3d1ac93fa351aff39}
  workunit: {sha1: 8b6208377f65c6ad5df0d5d3d1ac93fa351aff39}
owner: scheduled_ubuntu@teuthology-node
priority: 1000
roles:
- [mon.0, mds.0, osd.0, hadoop.master.0]
- [mon.1, osd.1, hadoop.slave.0]
- [mon.2, hadoop.slave.1, client.0]
sha1: 8b6208377f65c6ad5df0d5d3d1ac93fa351aff39
suite: hadoop
suite_branch: master
tasks:
- {ansible.cephlab: null}
- {clock.check: null}
- {ssh_keys: null}
- {install: null}
- {ceph: null}
- {hadoop: null}
- workunit:
    clients:
      client.0: [hadoop/terasort.sh]
    env: {NUM_RECORDS: '10000000'}
teuthology_branch: cuttlefish
tube: plana
verbose: false

worker_log: /home/teuthworker/archive/worker_logs/worker.plana.9539
INFO:teuthology.run_tasks:Running task internal.lock_machines...
INFO:teuthology.task.internal:Locking machines...
ERROR:teuthology.run_tasks:Saw exception from tasks
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/run_tasks.py", line 27, in run_tasks
    manager.__enter__()
  File "/usr/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/task/internal.py", line 64, in lock_machines
    num_up = len(filter(lambda machine: machine['up'] and machine['type'] == machine_type, machines))
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/task/internal.py", line 64, in <lambda>
    num_up = len(filter(lambda machine: machine['up'] and machine['type'] == machine_type, machines))

TypeError: string indices must be integers
DEBUG:teuthology.run_tasks:Exception was not quenched, exiting: TypeError: string indices must be integers
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_cuttlefish/virtualenv/bin/teuthology", line 9, in <module>
    load_entry_point('teuthology==0.0.1', 'console_scripts', 'teuthology')()
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/run.py", line 186, in main
    nuke(ctx, log, ctx.lock)
  File "/home/teuthworker/src/teuthology_cuttlefish/teuthology/nuke.py", line 376, in nuke
    for target, hostkey in ctx.config['targets'].iteritems():

KeyError: 'targets'
----->>>>>ISSUE2:-
1) This time the worker resolved the paddles server, but it failed with TypeError: string indices must be integers.
2)KeyError: 'targets'
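
The TypeError in ISSUE2 is what Python raises when a string is indexed with a string key, so at least one element of machines was apparently a string rather than a dict. The report does not show what the lock server returned, so the bad input below is invented purely to reproduce the error. count_up_naive and count_up are hypothetical helpers, not teuthology code.

```python
def count_up_naive(machines, machine_type):
    """Same shape as the failing line: assumes every element is a dict."""
    return len([m for m in machines if m["up"] and m["type"] == machine_type])


def count_up(machines, machine_type):
    """Validate the shape first so a bad response gives a readable error."""
    if not isinstance(machines, list) or not all(isinstance(m, dict) for m in machines):
        raise ValueError(f"expected a list of dicts, got {machines!r}")
    return len([m for m in machines if m.get("up") and m.get("type") == machine_type])


good = [{"up": True, "type": "plana"}, {"up": False, "type": "plana"}]
# Invented example of a non-list response; iterating a dict yields its string keys.
bad = {"message": "not a list of machines"}

print("machines up:", count_up(good, "plana"))

try:
    count_up_naive(bad, "plana")
except TypeError as exc:
    print("naive version raised TypeError:", exc)
```

Printing the raw body returned by the lock server URL would show which shape the worker actually received.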

History

#1 Updated by John Spray over 8 years ago

  • Project changed from CephFS to teuthology

Switching project to something more relevant.

#2 Updated by Ian Colle over 7 years ago

  • Status changed from New to Closed
  • Assignee deleted (sivaram ravipati)
