Project

General

Profile

Actions

Bug #2823

closed

osd: out of order ACKs

Added by Sage Weil almost 12 years ago. Updated almost 12 years ago.

Status:
Duplicate
Priority:
High
Assignee:
Category:
-
Target version:
-
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description


2012-07-22T22:22:59.279 INFO:teuthology.task.rados.rados.0.out:7: Writing initial 7
2012-07-22T22:22:59.279 INFO:teuthology.task.rados.rados.0.out:waiting_on = 4
2012-07-22T22:22:59.279 INFO:teuthology.task.rados.rados.0.out:Writing plana8231399-7 from 0 to 55 tid 1 ranges are [0~55,696621~575049,1903405~770015,3073655~649760]
2012-07-22T22:22:59.285 INFO:teuthology.task.rados.rados.0.out:finishing write tid 1 to plana8231399-7
2012-07-22T22:22:59.293 INFO:teuthology.task.rados.rados.0.out:finishing write tid 3 to plana8231399-6
2012-07-22T22:22:59.316 INFO:teuthology.task.rados.rados.0.out:Writing plana8231399-7 from 696621 to 1271670 tid 2 ranges are [0~55,696621~575049,1903405~770015,3073655~649760]
2012-07-22T22:22:59.319 INFO:teuthology.task.rados.rados.0.out:finishing write tid 1 to plana8231399-5
2012-07-22T22:22:59.319 INFO:teuthology.task.rados.rados.0.err:Error: finished tid 1 when last_acked_tid was 2
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7f81627fc700 time 2012-07-22 22:22:59.304828
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err:./test/osd/RadosModel.h: 822: FAILED assert(0)
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err: ceph version 0.48argonaut-496-g6eed767 (commit:6eed7678bfad69cda7ff405697ee3c1436cdb5e6)
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err: 1: (WriteOp::_finish(TestOp::CallbackInfo*)+0x2a1) [0x413451]
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err: 2: (write_callback(void*, void*)+0x21) [0x420f01]
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err: 3: (librados::C_AioComplete::finish(int)+0x1d) [0x7f816c76fb0d]
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err: 4: (Finisher::finisher_thread_entry()+0x218) [0x7f816c7c9a08]
2012-07-22T22:22:59.320 INFO:teuthology.task.rados.rados.0.err: 5: (()+0x7e9a) [0x7f816c433e9a]
2012-07-22T22:22:59.321 INFO:teuthology.task.rados.rados.0.err: 6: (clone()+0x6d) [0x7f816bc4b4bd]
2012-07-22T22:22:59.321 INFO:teuthology.task.rados.rados.0.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
2012-07-22T22:22:59.321 INFO:teuthology.task.rados.rados.0.err:terminate called after throwing an instance of 'ceph::FailedAssertion'
2012-07-22T22:22:59.436 ERROR:teuthology.run_tasks:Manager failed: <contextlib.GeneratorContextManager object at 0x1b27b90>
Traceback (most recent call last):
  File "/var/lib/teuthworker/teuthology/teuthology/run_tasks.py", line 45, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.6/contextlib.py", line 23, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/teuthology/teuthology/task/rados.py", line 97, in task
    run.wait(tests.itervalues())
  File "/var/lib/teuthworker/teuthology/teuthology/orchestra/run.py", line 272, in wait
    proc.exitstatus.get()
  File "/var/lib/teuthworker/teuthology/virtualenv/lib/python2.6/site-packages/gevent/event.py", line 223, in get
    raise self._exception
CommandCrashedError: Command crashed: 'CEPH_CLIENT_ID=0 CEPH_CONF=/tmp/cephtest/ceph.conf LD_LIBRARY_PATH=/tmp/cephtest/binary/usr/local/lib /tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/binary/usr/
local/bin/testrados --op read 45 --op write 45 --op delete 10 --op snap_create 0 --op snap_remove 0 --op rollback 0 --op setattr 0 --op rmattr 0 --op watch 0 --max-ops 4000 --objects 500 --max-in-flight 16 --size 4000000 --min-stride-size 400000 --max-stride-size 800000 --max
-seconds 0'

ubuntu@teuthology:/a/sage-2012-07-22_21:26:14-regression-wip-msgr-testing-basic/16249$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 77dca1ac33894de22b1740bb9cf6b8ef6429c700
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 200
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 6eed7678bfad69cda7ff405697ee3c1436cdb5e6
  workunit:
    sha1: 6eed7678bfad69cda7ff405697ee3c1436cdb5e6
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana69.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDE4lTHELBl2d+BXuaxMKtJX4r3GN2qSaxHTfewZxutqEC+rSNbD5Otiqwm16GtmCYklbYJL0yr6mizCZ0KzL3lVwNSczb/CxfvLTLegmBv9YzPRvoL1ymxdHggPIw47JTNRg6+UO9EConQpp7LOktT7fYf3AFx/BS9Ux6UeKyxifkb/1GUlzVT2rk1D39frBhI7AcPHovvbd/6uHvamrpUS3wSDWdz6BNRTQDaK307TJS9LPva6Cj6WAgmTZZesjWB4mOpWywrA9sva7GqSrnTMGKnIHPcsI6U57CALwHS1g/zIz73QuolZA2NkbhA7jjrfsUaWYJorE9dynIga9pf
  ubuntu@plana73.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDAZtaJPPhGrqcL1tRUHvgMMykh4hU2IdrX9cC2hYhotYVpzeejuA2TkmRdFhno54E6BM4dRIh3Wqv95YXpsToxOvzGp46lmRl3aNrLeWEEE5KxwgwTWuK72vMpZTuD/8XClj/faeFYuKvdiFFK0jGiX+j74HBOOrl3A3WdCi+/xFJZ/b5vZUJRFZxEkAkpy868iebiMgMHWDzs9A8GVb35MGf8nQGwFtPQlHEtgQNK51Wm7rj3qJM1tP1ZL2ghtrck9mdLBUzo6Fj1YTifewKt/PhteEZ1aP8lMpakOoRrJWawM27XQ7XJZDldZcI4REvF3suJLbiQHzdGArUTG3oz
  ubuntu@plana82.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDKbuUXtRWySmQ24tUH6g2Y3MU0N5+VTNSxHzX5QhVnLUY2djFFUHUhivJGutTfIOdIpw+jQP2fwQoryqC/oqSKyE2fwunP+M1ZEobQILQFthKeUKpDH6rC1gnaJI6CU3k0voVUTQpHUyKLApO5A4knXLfzmPJeE4lbi8Tds9Kx1TtID4+R8Z0UgJgfu8LVsMw8225hvuE0lOG+2n3Ms97uLh9MIxuELm4HAerp2etlVQVeE99udYeST+LMhrltxFRdKD8WUiuycIrzaT9D446ELfN+Vlugh6jSwbJycCzeqfEcPBEKNd4A9oPBThR7T0iHKRNGXHv7QhwpEgPUTT8f
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 500
    op_weights:
      delete: 10
      read: 45
      write: 45
    ops: 4000

ubuntu@teuthology:/a/sage-2012-07-22_21:26:14-regression-wip-msgr-testing-basic/16251 too

Related issues 2 (0 open2 closed)

Related to Ceph - Bug #2803: filer: probe crashResolvedSage Weil07/19/2012

Actions
Is duplicate of Ceph - Bug #2947: osd: out of order replyResolvedSage Weil08/14/2012

Actions
Actions

Also available in: Atom PDF