Bug #8114

closed

"osd/RadosModel.h: 1055: FAILED assert" in upgrade:dumpling-x:stress-split-firefly-distro-basic-vps suite

Added by Yuri Weinstein about 10 years ago. Updated almost 10 years ago.

Status: Can't reproduce
Priority: Urgent
Category: -
Target version: -
% Done: 0%
Source: other
Tags:
Backport:
Regression:
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2014-04-13_22:35:20-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/189643/

2014-04-14T04:49:01.115 INFO:teuthology.task.rados.rados.0.out:[10.214.138.114]: Writing vpm053.front.sepia.ceph.com2948-280 from 785589 to 1185589 tid 2 ranges are [0~76,785589~400000]
2014-04-14T04:49:01.392 INFO:teuthology.task.rados.rados.0.out:[10.214.138.114]: 2356: oids not in use 486
2014-04-14T04:49:01.392 INFO:teuthology.task.rados.rados.0.out:[10.214.138.114]: Reading 383
2014-04-14T04:49:01.392 INFO:teuthology.task.rados.rados.0.out:[10.214.138.114]: 2357: oids not in use 485
2014-04-14T04:49:01.392 INFO:teuthology.task.rados.rados.0.out:[10.214.138.114]: Reading 449
2014-04-14T04:49:01.393 INFO:teuthology.task.rados.rados.0.out:[10.214.138.114]: Waiting on 16
2014-04-14T04:49:01.444 INFO:teuthology.task.rados.rados.0.out:[10.214.138.114]: incorrect buffer at pos 449324
2014-04-14T04:49:01.444 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]: Object 130 contents ObjNum: 1363 snap: 0 seqnum: 1363 prefix: vpm053.front.sepia.ceph.com2948-OID: 130 snap 0
2014-04-14T04:49:01.444 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  corrupt
2014-04-14T04:49:01.462 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]: ./test/osd/RadosModel.h: In function 'virtual void ReadOp::_finish(TestOp::CallbackInfo*)' thread 7f4cf5040700 time 2014-04-14 07:49:01.443491
2014-04-14T04:49:01.462 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]: ./test/osd/RadosModel.h: 1055: FAILED assert(0)
2014-04-14T04:49:01.462 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  ceph version 0.67.7-68-g06f27fc (06f27fc6446d47b853208357ec4277c5dc10d9fe)
2014-04-14T04:49:01.463 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  1: (ReadOp::_finish(TestOp::CallbackInfo*)+0x133f) [0x412f1f]
2014-04-14T04:49:01.463 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  2: (librados::C_AioComplete::finish(int)+0x18) [0x7f4cfd9049a8]
2014-04-14T04:49:01.463 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  3: (Context::complete(int)+0xa) [0x7f4cfd8e289a]
2014-04-14T04:49:01.463 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  4: (Finisher::finisher_thread_entry()+0x1d8) [0x7f4cfd971e38]
2014-04-14T04:49:01.463 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  5: (()+0x7851) [0x7f4cfd559851]
2014-04-14T04:49:01.464 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  6: (clone()+0x6d) [0x7f4cfcd8b90d]
2014-04-14T04:49:01.464 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]:  NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2014-04-14T04:49:01.464 INFO:teuthology.task.rados.rados.0.err:[10.214.138.114]: terminate called after throwing an instance of 'ceph::FailedAssertion'
2014-04-14T04:49:03.054 INFO:teuthology.orchestra.run.err:[10.214.139.91]: dumped all in format json
2014-04-14T07:27:03.051 DEBUG:teuthology.orchestra.run:Running [10.214.138.114]: 'rmdir /home/ubuntu/cephtest/apache'
2014-04-14T07:27:03.141 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-04-14T07:27:03.141 DEBUG:teuthology.run_tasks:Unwinding manager install.upgrade
2014-04-14T07:27:03.141 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-04-14T07:27:03.141 DEBUG:teuthology.run_tasks:Unwinding manager rados
2014-04-14T07:27:03.142 INFO:teuthology.task.rados:joining rados
2014-04-14T07:27:03.142 ERROR:teuthology.run_tasks:Manager failed: rados
Traceback (most recent call last):
  File "/home/teuthworker/teuthology-firefly/teuthology/run_tasks.py", line 92, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/teuthology-firefly/teuthology/task/rados.py", line 170, in task
    running.get()
  File "/usr/lib/python2.7/dist-packages/gevent/greenlet.py", line 308, in get
    raise self._exception
CommandCrashedError: Command crashed: 'CEPH_CLIENT_ID=0 adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph_test_rados --op read 45 --op write 45 --op delete 10 --max-ops 4000 --objects 500 --max-in-flight 16 --size 4000000 --min-stride-size 400000 --max-stride-size 800000 --max-seconds 0 --pool unique_pool_0'
2014-04-14T07:27:03.143 DEBUG:teuthology.run_tasks:Unwinding manager ceph.restart
2014-04-14T07:27:03.143 DEBUG:teuthology.run_tasks:Unwinding manager thrashosds
2014-04-14T07:27:03.143 INFO:teuthology.task.thrashosds:joining thrashosds
2014-04-14T07:27:03.637 DEBUG:teuthology.orchestra.run:Running [10.214.139.91]: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format=json'
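The "incorrect buffer at pos 449324" line is the consistency check in ReadOp::_finish tripping: ceph_test_rados fills each object with a deterministic pattern derived from a per-run prefix (here vpm053.front.sepia.ceph.com2948) and a per-write sequence number, then reads the object back, recomputes the expected bytes, and asserts on the first divergence. A minimal Python sketch of that idea (the function names and the exact fill pattern are hypothetical; the real layout lives in test/osd/RadosModel.h):

```python
def expected_bytes(prefix: str, seqnum: int, length: int) -> bytes:
    """Deterministic fill pattern for an object (hypothetical layout:
    the real tool encodes prefix, ObjNum, snap, and seqnum differently)."""
    unit = f"{prefix}-{seqnum};".encode()
    reps = length // len(unit) + 1
    return (unit * reps)[:length]

def first_corrupt_offset(prefix: str, seqnum: int, data: bytes):
    """Return the first offset where data diverges from the expected
    pattern, or None if the object contents verify cleanly."""
    want = expected_bytes(prefix, seqnum, len(data))
    for pos, (got, exp) in enumerate(zip(data, want)):
        if got != exp:
            return pos
    return None
```

In the failure above the recomputed offset (449324) falls inside the range read for object 130, so the test aborts with assert(0) rather than continuing on corrupt data.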
archive_path: /var/lib/teuthworker/archive/teuthology-2014-04-13_22:35:20-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps/189643
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/readwrite.yaml
  6-next-mon/monb.yaml 7-workload/rados_api_tests.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml
  rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/rhel_6.4.yaml}
email: null
job_id: '189643'
kernel: &id001
  kdb: true
  sha1: distro
last_in_suite: false
machine_type: vps
name: teuthology-2014-04-13_22:35:20-upgrade:dumpling-x:stress-split-firefly-distro-basic-vps
nuke-on-error: true
os_type: rhel
os_version: '6.4'
overrides:
  admin_socket:
    branch: firefly
  ceph:
    conf:
      mon:
        debug mon: 20
        debug ms: 1
        debug paxos: 20
        mon warn on legacy crush tunables: false
      osd:
        debug filestore: 20
        debug journal: 20
        debug ms: 1
        debug osd: 20
    log-whitelist:
    - slow request
    - wrongly marked me down
    - objects unfound and apparently lost
    - log bound mismatch
    sha1: d6c71b76241b6c5cd2ac5d812250d4bb044ac537
  ceph-deploy:
    branch:
      dev: firefly
    conf:
      client:
        log file: /var/log/ceph/ceph-$name.$pid.log
      mon:
        debug mon: 1
        debug ms: 20
        debug paxos: 20
        osd default pool size: 2
  install:
    ceph:
      sha1: d6c71b76241b6c5cd2ac5d812250d4bb044ac537
  s3tests:
    branch: master
  workunit:
    sha1: d6c71b76241b6c5cd2ac5d812250d4bb044ac537
owner: scheduled_teuthology@teuthology
roles:
- - mon.a
  - mon.b
  - mds.a
  - osd.0
  - osd.1
  - osd.2
- - osd.3
  - osd.4
  - osd.5
  - mon.c
- - client.0
targets:
  ubuntu@vpm042.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAuWN5Ijz5upGJiebI66ymbkzbE7WMb/2WVn/jeNqJYZ8MpKqsLzrlkLVY5HiAK6edyIMOhCEZIkHOBq63PDYyQo6gd1B/4ZrPYET5wbo6svpII7JSscKg1oZxNXTyN466nCYYzGqz7Utp7TMZ/AEKnBcaPs10o4OaCv3kaquPg8IigaKYeAxHEf7sgk1r2dGPcjF8/tV1L/kELpGpulPlBzaRq9WCU0EGkxeG3BEf0wxkLwxVJteJO3HUmeZKHEKVM189VL2AfdOtHt0D2gm5UcsVaWEwbLJJwrkqjq8S91miPjL3aMwhUmQNwKZTDXzk2I9UFQJLk/2X8U2eNjQyBw==
  ubuntu@vpm053.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEArSoAbpr6UULNS9e7hnKDZQKfKPcU1/6qIURZxWSjGqWrWytPoMfctfpwVKUTlTsThpXPv3ZKXJ1hqNOoKM2YBykix1WRWc7yQ+T+teWrjZ2oSKNnhpJk+rgn4bwPmq7xQ1sc2UwUvZtcfOeuCxLF6DzuJYgFdhbx3Eak1XCbT4DGkgTymmpn0ayuKeAzZ9Ha06LrR+bzUDzDm43980PoBEvQZ8iphKeY/VjbYlnZk49eRERt9XbSSs7BiINdc/knufMr9WICXPSbqqtDhCsl/mdraG2dWGcx18eC+HCDKgp/k+a8MmVibWdfWm0dnFMfKrlbRhbOVlhD//CvEXWjDQ==
  ubuntu@vpm060.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAABIwAAAQEAxpjDTNhFAbGQM8ZXCeolmWVJVUpLICNhjZ5oiaTRVUpQwXXkQo7zaSvsXjcE1dTFuEu04c/59T2hhZ70y5QsWVIVlR+JebsqddIYaHfqMuAjv7cKKWNhiCg30AXtAeB6plxT2kkxU3IPdvpPKm/RzeQcYGqYFyRyRjuubbab/qm2H7mtbQHSrwtc1cEJafZZyerVcXkyaQEm5Mw6n2UFv+XPaOFZe6QJirFEZ6FM55b1bEcEQoqQ9C23Uqxq9eIzmz4lDvkXcBWJU4J4OyZt5APZvOtWRq3Ri1S7d6G53jm8Vw+98s9iNVE5hzGGHTj8UMIoj3BgAc7PIDxam5DNbw==
tasks:
- internal.lock_machines:
  - 3
  - vps
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- internal.check_ceph_data: null
- internal.vm_setup: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.sudo: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock.check: null
- install:
    branch: dumpling
- ceph:
    fs: xfs
- install.upgrade:
    osd.0: null
- ceph.restart:
    daemons:
    - osd.0
    - osd.1
    - osd.2
- thrashosds:
    chance_pgnum_grow: 1
    chance_pgpnum_fix: 1
    thrash_primary_affinity: false
    timeout: 1200
- ceph.restart:
    daemons:
    - mon.a
    wait-for-healthy: false
    wait-for-osds-up: true
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 10
      read: 45
      write: 45
    ops: 4000
- ceph.restart:
    daemons:
    - mon.b
    wait-for-healthy: false
    wait-for-osds-up: true
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- install.upgrade:
    mon.c: null
- ceph.restart:
    daemons:
    - mon.c
    wait-for-healthy: false
    wait-for-osds-up: true
- ceph.wait_for_mon_quorum:
  - a
  - b
  - c
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rados/test-upgrade-firefly.sh
- workunit:
    branch: dumpling
    clients:
      client.0:
      - rbd/test_librbd_python.sh
- rgw:
    client.0:
      idle_timeout: 300
- swift:
    client.0:
      rgw_server: client.0
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 50
      read: 100
      rollback: 50
      snap_create: 50
      snap_remove: 50
      write: 100
    ops: 4000
teuthology_branch: firefly
verbose: true
worker_log: /var/lib/teuthworker/archive/worker_logs/worker.vps.17009
description: upgrade/dumpling-x/stress-split/{0-cluster/start.yaml 1-dumpling-install/dumpling.yaml
  2-partial-upgrade/firsthalf.yaml 3-thrash/default.yaml 4-mon/mona.yaml 5-workload/readwrite.yaml
  6-next-mon/monb.yaml 7-workload/rados_api_tests.yaml 8-next-mon/monc.yaml 9-workload/{rados_api_tests.yaml
  rbd-python.yaml rgw-s3tests.yaml snaps-many-objects.yaml} distros/rhel_6.4.yaml}
duration: 11753.726381063461
failure_reason: 'Command crashed: ''CEPH_CLIENT_ID=0 adjust-ulimits ceph-coverage
  /home/ubuntu/cephtest/archive/coverage ceph_test_rados --op read 45 --op write 45
  --op delete 10 --max-ops 4000 --objects 500 --max-in-flight 16 --size 4000000 --min-stride-size
  400000 --max-stride-size 800000 --max-seconds 0 --pool unique_pool_0'''
flavor: basic
owner: scheduled_teuthology@teuthology
success: false
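Both rados tasks in the YAML above drive ceph_test_rados through an op_weights map (for example read: 45, write: 45, delete: 10, matching the --op flags in the crashed command line). Assuming operations are drawn in proportion to their weights, the selection can be sketched as follows (the real logic is implemented in C++ inside ceph_test_rados; this is an illustration only):

```python
import random

def pick_op(op_weights, rng):
    """Pick one operation name with probability proportional to its
    weight. Sketch of proportional sampling, not the tool's actual code."""
    ops = sorted(op_weights)  # deterministic iteration order
    total = sum(op_weights[op] for op in ops)
    r = rng.uniform(0, total)
    upto = 0.0
    for op in ops:
        upto += op_weights[op]
        if r < upto:
            return op
    return ops[-1]  # guard against floating-point edge cases

# Example: the first rados task's mix from the job config above.
rng = random.Random(0)
weights = {"read": 45, "write": 45, "delete": 10}
sample = [pick_op(weights, rng) for _ in range(1000)]
```

With this mix, reads and writes each account for roughly 45% of the 4000 ops and deletes for about 10%, which is why the workload keeps roughly 500 objects in flux while exercising the read-verify path that caught the corruption.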

Related issues (1 total; 0 open, 1 closed)

Related to Ceph - Bug #8162: osd: dumpling advances last_backfill prematurely (Resolved, Samuel Just, 04/19/2014)

#1

Updated by Samuel Just about 10 years ago

  • Priority changed from High to Urgent
#2

Updated by Samuel Just about 10 years ago

  • Assignee set to Joao Eduardo Luis
#3

Updated by Joao Eduardo Luis almost 10 years ago

  • Status changed from New to In Progress
#4

Updated by Sage Weil almost 10 years ago

  • Status changed from In Progress to Can't reproduce