Tasks #6641

Make teuthology run on an existing cluster

Added by Zack Cerza over 10 years ago. Updated almost 10 years ago.

Status: Closed
Priority: Normal
% Done: 0%
#1 - Updated by Zack Cerza over 10 years ago

  • Target version set to v0.72 Emperor
#2 - Updated by Zack Cerza over 10 years ago

  • Assignee set to Zack Cerza
#3 - Updated by Tamilarasi muthamizhan over 10 years ago

  • Status changed from 12 to In Progress
#4 - Updated by Zack Cerza over 10 years ago

Traceback (most recent call last):
  File "/Users/zack/inkdev/teuthology/teuthology/run_tasks.py", line 31, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/Users/zack/inkdev/teuthology/teuthology/run_tasks.py", line 19, in run_one_task
    return fn(**kwargs)
  File "/Users/zack/inkdev/teuthology/teuthology/task/internal.py", line 229, in check_ceph_data
    raise RuntimeError('Stale /var/lib/ceph detected, aborting.')
RuntimeError: Stale /var/lib/ceph detected, aborting.
DEBUG:teuthology.run_tasks:Exception was not quenched, exiting: RuntimeError: Stale /var/lib/ceph detected, aborting.
INFO:teuthology.run:Summary data:
{failure_reason: 'Stale /var/lib/ceph detected, aborting.', owner: zack@zmba.local,
  success: false}

teuthology.run.main() unconditionally adds the 'internal.check_ceph_data' task to every job.
teuthology.task.internal.check_ceph_data() forces the test to bail if /var/lib/ceph/ exists.
It looks like this will require some (hopefully minimal) changes to teuthology after all.
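
For context, the flow described above reduces to roughly this (a simplified sketch; build_task_list is a hypothetical name for illustration, not teuthology's actual internals):

import os

def check_ceph_data():
    # Bail whenever leftover cluster state is present -- this is the
    # failure shown in the traceback above.
    if os.path.exists('/var/lib/ceph'):
        raise RuntimeError('Stale /var/lib/ceph detected, aborting.')

def build_task_list(job_tasks):
    # main() prepends internal tasks to every job, so check_ceph_data()
    # runs no matter what the job YAML specifies.
    return [check_ceph_data] + list(job_tasks)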

#5 - Updated by Zack Cerza over 10 years ago

I'm thinking of adding a use_existing_cluster: true option (or similar) that can be used in job yaml files. That way we can ignore an existing /var/lib/ceph and potentially take (or not take) any other actions we need to when using an existing cluster.
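
A minimal sketch of how such a flag could gate the check (hypothetical names; the real change would live in teuthology.task.internal):

import os

def check_ceph_data(job_config, data_dir='/var/lib/ceph'):
    # If the job YAML contains 'use_existing_cluster: true', leftover
    # cluster state is expected, so skip the staleness check entirely.
    if job_config.get('use_existing_cluster', False):
        return
    if os.path.exists(data_dir):
        raise RuntimeError('Stale %s detected, aborting.' % data_dir)

# Parsed from a job YAML fragment like 'use_existing_cluster: true':
check_ceph_data({'use_existing_cluster': True})  # passes; cluster is reused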

#7 - Updated by Zack Cerza over 10 years ago

I had to manually install ceph-test to get the ceph-coverage binary installed. Pretty much everything teuthology does tries to use that.

Now I just need to find a test that can be run on an existing cluster.
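
To illustrate why the missing binary is fatal: teuthology prefixes the remote commands it runs with wrapper binaries, so without ceph-coverage nearly every remote call fails. A hypothetical helper mirroring the prefix visible in the failed command in the next note (the real prefixing happens inside teuthology's remote-execution code):

def wrap_command(args, coverage_dir='/home/ubuntu/cephtest/archive/coverage'):
    # adjust-ulimits ceph-coverage <coverage-dir> <actual command...>
    return ['adjust-ulimits', 'ceph-coverage', coverage_dir] + list(args)

print(' '.join(wrap_command(['ceph', 'mds', 'set_max_mds', '1'])))
# adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph mds set_max_mds 1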

#8 - Updated by Zack Cerza over 10 years ago

Another roadblock:
teuthology.task.ceph.task() runs teuthology.task.ceph.cluster(), which creates the cluster. The ceph task can't simply be dropped, because it is also what copies scripts like adjust-ulimits to the nodes. I made this change to skip cluster():
https://github.com/ceph/teuthology/commit/d04f3a6ae09224c4bf2d03fb058807b5a6cbf666

But I am seeing this now:

INFO:teuthology.task.ceph:Starting mds daemons...
INFO:teuthology.task.ceph.mds.0:Restarting
INFO:teuthology.task.ceph.osd.1.out:[10.214.136.34]: starting osd.1 at :/0 osd_data /var/lib/ceph/osd/ceph-1 /var/lib/ceph/osd/ceph-1/journal
INFO:teuthology.task.ceph.osd.1.err:[10.214.136.34]: 2013-11-21 11:45:35.599476 7f9097f96780 -1 journal FileJournal::_open: disabling aio for non-block journal.  Use journal_force_aio to force use of aio anyway
INFO:teuthology.task.ceph.mds.0:Started
INFO:teuthology.task.ceph.mds.0.out:[10.214.136.32]: starting mds.0 at :/0
INFO:teuthology.task.ceph.mds.0.err:[10.214.136.32]: 2013-11-21 11:45:35.719808 7fe0aceb8780 -1 mds.-1.0 ERROR: failed to authenticate: (1) Operation not permitted
INFO:teuthology.task.ceph.mds.0.err:[10.214.136.32]: *** Caught signal (Segmentation fault) **
INFO:teuthology.task.ceph.mds.0.err:[10.214.136.32]:  in thread 7fe0a5e44700
INFO:teuthology.orchestra.run.err:[10.214.136.32]: 2013-11-21 11:45:35.911517 7f70fe9e6700  0 librados: client.admin authentication error (1) Operation not permitted
INFO:teuthology.orchestra.run.err:[10.214.136.32]: Error connecting to cluster: PermissionError
ERROR:teuthology.contextutil:Saw exception from nested tasks
Traceback (most recent call last):
  File "/Users/zack/inkdev/teuthology/teuthology/contextutil.py", line 25, in nested
    vars.append(enter())
  File "/usr/local/Cellar/python/2.7.5/Frameworks/Python.framework/Versions/2.7/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/Users/zack/inkdev/teuthology/teuthology/task/ceph.py", line 1098, in run_daemon
    'mds', 'set_max_mds', str(num_active)])
  File "/Users/zack/inkdev/teuthology/teuthology/orchestra/remote.py", line 47, in run
    r = self._runner(client=self.ssh, **kwargs)
  File "/Users/zack/inkdev/teuthology/teuthology/orchestra/run.py", line 271, in run
    r.exitstatus = _check_status(r.exitstatus)
  File "/Users/zack/inkdev/teuthology/teuthology/orchestra/run.py", line 267, in _check_status
    raise CommandFailedError(command=r.command, exitstatus=status, node=host)
CommandFailedError: Command failed on 10.214.136.32 with status 1: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph mds set_max_mds 1'
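
For reference, the "nested tasks" machinery behaves like ordinary chained context managers, which is why a failure while starting one daemon unwinds the whole chain. A self-contained sketch of the pattern (simplified; not the real contextutil code):

import contextlib

@contextlib.contextmanager
def run_daemon(kind, fail=False):
    print('Starting %s daemons...' % kind)
    if fail:
        # Stands in for the CommandFailedError raised by the
        # 'ceph mds set_max_mds 1' call above.
        raise RuntimeError('Command failed with status 1')
    try:
        yield
    finally:
        print('Stopping %s daemons...' % kind)

try:
    with run_daemon('osd'), run_daemon('mds', fail=True):
        pass
except RuntimeError as e:
    # The already-entered osd manager is cleaned up before this prints.
    print('Saw exception from nested tasks: %s' % e)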

#9 - Updated by Ian Colle over 10 years ago

  • Target version changed from v0.72 Emperor to v0.73
#10 - Updated by Ian Colle over 10 years ago

  • Target version changed from v0.73 to v0.74
#11 - Updated by Tamilarasi muthamizhan over 10 years ago

  • Target version changed from v0.74 to v0.75
#12 - Updated by Zack Cerza over 10 years ago

  • Target version changed from v0.75 to v0.76
#13 - Updated by Zack Cerza over 10 years ago

  • Target version deleted (v0.76)
#14 - Updated by Ian Colle about 10 years ago

  • Subject changed from Attempt to run teuthology on an existing cluster to Make teuthology run on an existing cluster
  • Target version set to sprint5
#15 - Updated by Zack Cerza almost 10 years ago

  • Status changed from In Progress to Closed

I'm going to close this because it was always too open-ended. I will continue to investigate and open bugs for issues I find.
