Bug #10470

ENOENT starting ceph_test_rados on upgrade tests

Added by Sage Weil over 9 years ago. Updated over 9 years ago.

Status: Resolved
Priority: Urgent
% Done: 0%
Source: Q/A
Severity: 3 - minor

Description

2015-01-06T16:20:57.316 INFO:teuthology.orchestra.run.vpm140:Running: 'wget -q -O- http://gitbuilder.ceph.com/ceph-deb-trusty-x86_64-basic/sha1/d5e2ca1620412a5df38716d3ee78617197345fae/version'
2015-01-06T16:20:57.361 INFO:tasks.rados.rados.0.vpm112.stdout:adding op weight read -> 100
2015-01-06T16:20:57.362 INFO:tasks.rados.rados.0.vpm112.stdout:adding op weight write -> 100
2015-01-06T16:20:57.362 INFO:tasks.rados.rados.0.vpm112.stdout:adding op weight delete -> 50
2015-01-06T16:20:57.362 INFO:tasks.rados.rados.0.vpm112.stdout:adding op weight snap_create -> 50
2015-01-06T16:20:57.362 INFO:tasks.rados.rados.0.vpm112.stdout:adding op weight snap_remove -> 50
2015-01-06T16:20:57.363 INFO:tasks.rados.rados.0.vpm112.stdout:adding op weight rollback -> 50
2015-01-06T16:20:57.363 INFO:tasks.rados.rados.0.vpm112.stdout:adding op weight copy_from -> 50
2015-01-06T16:20:57.383 INFO:tasks.rados.rados.0.vpm112.stdout:ceph version 0.80.7-161-ge0648e3 (e0648e3d30de504b096c4ae3bbe7d9c17652bdb5)
2015-01-06T16:20:57.383 INFO:tasks.rados.rados.0.vpm112.stdout:Configuration:
2015-01-06T16:20:57.383 INFO:tasks.rados.rados.0.vpm112.stdout: Number of operations: 4000
2015-01-06T16:20:57.383 INFO:tasks.rados.rados.0.vpm112.stdout: Number of objects: 500
2015-01-06T16:20:57.384 INFO:tasks.rados.rados.0.vpm112.stdout: Max in flight operations: 16
2015-01-06T16:20:57.384 INFO:tasks.rados.rados.0.vpm112.stdout: Object size (in bytes): 4000000
2015-01-06T16:20:57.384 INFO:tasks.rados.rados.0.vpm112.stdout: Write stride min: 400000
2015-01-06T16:20:57.384 INFO:tasks.rados.rados.0.vpm112.stdout: Write stride max: 800000
2015-01-06T16:20:57.401 INFO:tasks.rados:starting run 0 out of 1
2015-01-06T16:20:57.401 INFO:tasks.ceph.ceph_manager:creating pool_name unique_pool_0
2015-01-06T16:20:57.402 INFO:teuthology.orchestra.run.vpm160:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd pool create unique_pool_0 16 16 erasure teuthologyprofile'
2015-01-06T16:20:57.455 INFO:tasks.rados.rados.0.vpm112.stderr:Error initializing rados test context: (2) No such file or directory
2015-01-06T16:20:57.458 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/src/teuthology_master/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/src/teuthology_master/teuthology/task/parallel.py", line 56, in _run_spawned
    mgr.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_wip-10233/tasks/rados.py", line 190, in task
    running.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 331, in get
    raise self._exception
CommandFailedError: Command failed on vpm112 with status 1: 'CEPH_CLIENT_ID=0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_rados --pool-snaps --op read 100 --op write 100 --op delete 50 --max-ops 4000 --objects 500 --max-in-flight 16 --size 4000000 --min-stride-size 400000 --max-stride-size 800000 --max-seconds 0 --op snap_create 50 --op snap_remove 50 --op rollback 50 --op copy_from 50 --pool base'

ubuntu@teuthology:/a/teuthology-2015-01-06_16:00:15-upgrade:firefly-x-next-distro-basic-vps/688536
and a half dozen others in that run.
#1

Updated by Yuri Weinstein over 9 years ago

  • Assignee set to Yuri Weinstein

That could be related to a new test I added; I will double-check.

#2

Updated by Samuel Just over 9 years ago

- print: '**** done rados/load-gen-big.sh 2-workload parallel'
- exec:
    client.0:
      - ceph osd pool create base 4
      - ceph osd pool create cache 4
      - ceph osd tier add base cache
      - ceph osd tier cache-mode cache writeback
      - ceph osd tier set-overlay base cache
      - ceph osd pool set cache hit_set_type bloom
      - ceph osd pool set cache hit_set_count 8
      - ceph osd pool set cache hit_set_period 3600
      - ceph osd pool set cache target_max_objects 250
- rados:
    clients:
      - client.0
    objects: 500
    op_weights:
      copy_from: 50
      delete: 50
      evict: 50
      flush: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      try_flush: 50
      write: 100
    ops: 4000
    pool_snaps: true
    pools:
      - base
This part runs under the parallel task. The exec block must complete first (it creates the base pool that the rados workload targets), or the rados part will fail. Because the two entries run in parallel, the rados task sometimes starts before the exec sequence finishes, which produces the ENOENT above.
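One way to close the race, as a sketch only (it assumes teuthology's standard sequential meta-task and is not necessarily the fix that was actually applied): nest the exec and the rados workload under a single sequential entry inside the parallel list, so the pool setup always finishes before ceph_test_rados starts while the pair still runs alongside the other workloads.

- sequential:
  - exec:
      client.0:
        - ceph osd pool create base 4
        # ... remaining cache-tier setup commands as above ...
  - rados:
      clients:
        - client.0
      pools:
        - base
      # ... same op_weights, ops, objects, and pool_snaps settings as above ...

Alternatively, moving the pool-creation exec out of the parallel section entirely, so it runs before the parallel task starts, removes the ordering dependency altogether.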

#3

Updated by Yuri Weinstein over 9 years ago

I think it's fixed; I will double-check.

#5

Updated by Yuri Weinstein over 9 years ago

  • Project changed from Ceph to teuthology
  • Status changed from New to Resolved
  • Target version set to sprint22
