Bug #7584 (closed)

"[ FAILED ] LibRadosAio.OmapPP" in upgrade:dumpling-x-firefly---basic-plana

Added by Yuri Weinstein about 10 years ago. Updated about 10 years ago.

Status: Resolved
Priority: Urgent
% Done: 0%
Source: other
Severity: 2 - major

Description

Logs are in qa-proxy.ceph.com/teuthology/teuthology-2014-03-01_22:00:03-upgrade:dumpling-x-firefly---basic-plana/113447/

2014-03-01T23:24:47.554 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: test/librados/aio.cc:1191: Failure
2014-03-01T23:24:47.555 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: Value of: r
2014-03-01T23:24:47.555 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]:   Actual: -125
2014-03-01T23:24:47.555 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: Expected: 0
2014-03-01T23:24:47.556 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: [  FAILED  ] LibRadosAio.OmapPP (2665 ms)
2014-03-01T23:24:47.556 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: [----------] 29 tests from LibRadosAio (108642 ms total)
2014-03-01T23:24:47.556 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: 
2014-03-01T23:24:47.556 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: [----------] Global test environment tear-down
2014-03-01T23:24:47.557 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: [==========] 29 tests from 1 test case ran. (108642 ms total)
2014-03-01T23:24:47.557 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: [  PASSED  ] 28 tests.
2014-03-01T23:24:47.557 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: [  FAILED  ] 1 test, listed below:
2014-03-01T23:24:47.557 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: [  FAILED  ] LibRadosAio.OmapPP
2014-03-01T23:24:47.557 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]: 
2014-03-01T23:24:47.557 INFO:teuthology.task.workunit.client.0.out:[10.214.133.20]:  1 FAILED TEST
2014-03-01T23:24:47.558 INFO:teuthology.task.workunit:Stopping rados/test.sh on client.0...
2014-03-01T23:24:47.558 DEBUG:teuthology.orchestra.run:Running [10.214.133.20]: 'rm -rf -- /home/ubuntu/cephtest/workunits.list /home/ubuntu/cephtest/workunit.client.0'
2014-03-01T23:24:47.566 ERROR:teuthology.parallel:Exception in parallel execution
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/teuthology-firefly/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/teuthology-firefly/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/teuthology-firefly/teuthology/task/workunit.py", line 345, in _run_tests
    args=args,
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/remote.py", line 106, in run
    r = self._runner(client=self.ssh, **kwargs)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 328, in run
    r.exitstatus = _check_status(r.exitstatus)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 324, in _check_status
    raise CommandFailedError(command=r.command, exitstatus=status, node=host)
CommandFailedError: Command failed on 10.214.133.20 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=dumpling TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /home/ubuntu/cephtest/workunit.client.0/rados/test.sh'
2014-03-01T23:24:47.567 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 31, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 19, in run_one_task
    return fn(**kwargs)
  File "/home/teuthworker/teuthology-firefly/teuthology/task/workunit.py", line 96, in task
    all_spec = True
  File "/home/teuthworker/teuthology-firefly/teuthology/parallel.py", line 82, in __exit__
    for result in self:
  File "/home/teuthworker/teuthology-firefly/teuthology/parallel.py", line 101, in next
    resurrect_traceback(result)
  File "/home/teuthworker/teuthology-firefly/teuthology/parallel.py", line 19, in capture_traceback
    return func(*args, **kwargs)
  File "/home/teuthworker/teuthology-firefly/teuthology/task/workunit.py", line 345, in _run_tests
    args=args,
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/remote.py", line 106, in run
    r = self._runner(client=self.ssh, **kwargs)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 328, in run
    r.exitstatus = _check_status(r.exitstatus)
  File "/home/teuthworker/teuthology-firefly/teuthology/orchestra/run.py", line 324, in _check_status
    raise CommandFailedError(command=r.command, exitstatus=status, node=host)
CommandFailedError: Command failed on 10.214.133.20 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=dumpling TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /home/ubuntu/cephtest/workunit.client.0/rados/test.sh'
2014-03-01T23:24:47.599 ERROR:teuthology.run_tasks: Sentry event: http://sentry.ceph.com/inktank/teuthology/search?q=7c2ddda46c7d451b8fad04a3fe17e527
CommandFailedError: Command failed on 10.214.133.20 with status 1: 'mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1 CEPH_REF=dumpling TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage /home/ubuntu/cephtest/workunit.client.0/rados/test.sh'
2014-03-01T23:24:47.599 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-03-01T23:24:47.599 DEBUG:teuthology.run_tasks:Unwinding manager rados
2014-03-01T23:24:47.599 INFO:teuthology.task.rados:joining rados
2014-03-01T23:24:47.599 ERROR:teuthology.run_tasks:Manager failed: rados
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 84, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/teuthworker/teuthology-firefly/teuthology/task/rados.py", line 175, in task
    running.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandFailedError: Command failed on 10.214.133.20 with status 1: 'CEPH_CLIENT_ID=0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_rados --op read 45 --op write 45 --op delete 10 --op snap_create 0 --op snap_remove 0 --op rollback 0 --op setattr 0 --op rmattr 0 --op watch 0 --op append 0 --max-ops 4000 --objects 500 --max-in-flight 16 --size 4000000 --min-stride-size 400000 --max-stride-size 800000 --max-seconds 0 --pool unique_pool_0'
2014-03-01T23:24:47.600 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-03-01T23:24:47.600 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2014-03-01T23:24:47.600 INFO:teuthology.task.thrashosds:joining thrashosds
2014-03-01T23:24:47.600 ERROR:teuthology.run_tasks:Manager failed: thrashosds
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 84, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 35, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/teuthworker/teuthology-firefly/teuthology/task/thrashosds.py", line 172, in task
    thrash_proc.do_join()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/ceph_manager.py", line 153, in do_join
    self.thread.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandFailedError: Command failed on 10.214.133.22 with status 22: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd primary-affinity 3 1'
2014-03-01T23:24:47.601 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
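
For reference, the two numeric failure codes above map to Linux errno values. A minimal decoding sketch in Python, assuming Linux errno numbering (the per-code interpretations are inferences, not taken from the log):

import errno

# "Actual: -125" from the OmapPP assertion, and exit "status 22" from the
# failed 'ceph osd primary-affinity' command (likely EINVAL propagated by
# the ceph CLI), decoded by name.
for code in (125, 22):
    print("%d %s" % (code, errno.errorcode[code]))
# 125 ECANCELED  -- librados returned -ECANCELED to the test
# 22 EINVAL      -- 'osd primary-affinity' is new in firefly, so a
#                   still-dumpling mon would reject it as invalid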
archive_path: /var/lib/teuthworker/archive/teuthology-2014-03-01_22:00:03-upgrade:dumpling-x-firefly---basic-plana/113447
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/readwrite.yaml
  6-next-mon/monb.yaml 7-workload/rados_api_tests.yaml 8-next-mon/monc.yaml 9-workload/rbd-python.yaml
  distros/ubuntu_12.04.yaml}
email: null
job_id: '113447'
last_in_suite: false
machine_type: plana
name: teuthology-2014-03-01_22:00:03-upgrade:dumpling-x-firefly---basic-plana
nuke-on-error: true
os_type: ubuntu
os_version: '12.04'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
      osd:
        debug filestore: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: 32a4e903497bcf40a50e5e683c7bda17eccd11b3
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: 32a4e903497bcf40a50e5e683c7bda17eccd11b3
  s3tests:
    branch: master
  workunit:
    sha1: 32a4e903497bcf40a50e5e683c7bda17eccd11b3
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
targets:
  ubuntu@plana94.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDDtBFZH1vJ4VyJt3ReE2X11UmD6sLcvvSk5jWtbbmTKwHPECESeIH7se+2Hm2jNoqzgvGCI6IHTgqJIRXLNJL/Z2QAC6w5cHJhVgbqI4ksZ8Z3r0QEfuJzmO4eMQccfhjmuNRSm+UwNtUg2RgH0qXvHchxuvBSgyBpQ3MxsdBCwpRvUbg17YzGGe41H/S0ObmSQecWoPoYa5VDoFNR8aURu0fV9KBYJ3b57SL3fieF/IEYbcIT1lmml7bdfEUTWrh3HmnXq6/qMaaINPcuMYERNTbRmJg/by7c0DC3qViDXFXaQCif7mXvk0+S7I3pV3GwcqxPrmmuJJGFJ08MDxur
  ubuntu@plana95.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQD9jrEw6Cgt5RW/5K4cnLr5haf0lm9dUtTQB+01+/8GYtz1sbeHDJhlxD5cMGPD5PzfGGgfhWs62Ocs6jGPB8KiaK4jKihJrz1yhs1X7RoUgJe5X/Jm3SQOUQ8KAE2HPI3G29fk2SjSoukGRY94fF4RaRNOhW0mvntmYyIKELgLkBxzDcBYKbjZouc57vXbcJV16tds5eyCKfd9Xfos2v+NKah+IdcS0aR+XeyZHjfXthE8kJfawXDozC1mSB4XnfMZlLnpY/KcNVy3/BdaLckjLOBOvekokeJFCGGt1Q6tMKMRx9Mgth0N/qjU6S2yd9aERR8fn9frZiBi2EkV70hx
  ubuntu@plana96.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDLQgA4t8V9O8Lq/fLpceWAwfZZfY6x6V6S2YjP0a9j+Nvw4V8w7frXw6GAFDNaGBV8MG0mYAgRNhn21ILe+86X98NrDjE1gwfGEvRqnguO9Z8JurKicUdLs+ZjcB7pi78YAzY1SBVY7nWjAF9RU3b4SYUN40SZ2AOa/PgqhMCB6SMvatfC53rRH1BktCkBGmFbQrjfv77oEQmxRtQs3QWUyJ5o/L73G6sfdhf4SkfkBrKWF1u75dd6S14LtRkD9iHkDRxriznvepVxVwzoPFYbjR5jLMuaa9N5xIO1V3PYNkGBz2yCgpEj8xahnd1miYZ4qTMs7uoqXODL9Nm2dC+1
tasks:
- internal.lock_machines:
  - 3
  - plana
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 10
      read: 45
      write: 45
    ops: 4000
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test.sh
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    clients:
      client.0:
      - rbd/test_librbd_python.sh
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.plana.11189
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/readwrite.yaml
  6-next-mon/monb.yaml 7-workload/rados_api_tests.yaml 8-next-mon/monc.yaml 9-workload/rbd-python.yaml
  distros/ubuntu_12.04.yaml}
duration: 497.6519410610199
failure_reason: 'Command failed on 10.214.133.20 with status 1: ''mkdir -p -- /home/ubuntu/cephtest/mnt.0/client.0/tmp
  && cd -- /home/ubuntu/cephtest/mnt.0/client.0/tmp && CEPH_CLI_TEST_DUP_COMMAND=1
  CEPH_REF=dumpling TESTDIR="/home/ubuntu/cephtest" CEPH_ID="0" adjust-ulimits ceph-coverage
  /home/ubuntu/cephtest/archive/coverage /home/ubuntu/cephtest/workunit.client.0/rados/test.sh'''
flavor: basic
owner: scheduled_teuthology@teuthology
sentry_event: http://sentry.ceph.com/inktank/teuthology/search?q=7c2ddda46c7d451b8fad04a3fe17e527
success: false
#1

Updated by Yuri Weinstein about 10 years ago

  • Project changed from teuthology to 15
#2

Updated by Sage Weil about 10 years ago

Again, this is running a new workunit against an old cluster.

Or possibly it is because the new client utilities (ceph_test_rados_api_*) from ceph-test are installed while the running daemons are old.

Fix this by not upgrading the packages on the client node.
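
A minimal sketch of the shape of that fix, in the job's own task-list YAML (which suite fragment should carry the change is an assumption; the role names are taken from the roles section above):

- install.upgrade:
    # Upgrade packages only on the node holding osd.0 (the "first half").
    # Listing client.0 here -- or letting the upgrade touch every node --
    # would install the firefly ceph-test binaries (ceph_test_rados_api_*)
    # that rados/test.sh then runs against the still-dumpling daemons,
    # which is the mismatch described above.
    osd.0: null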

#3

Updated by Ian Colle about 10 years ago

  • Project changed from 15 to teuthology
  • Assignee set to Yuri Weinstein
#4

Updated by Ian Colle about 10 years ago

  • Priority changed from Normal to Urgent
#6

Updated by Yuri Weinstein about 10 years ago

  • Status changed from New to Resolved