Bug #3631

closed

osdc/ObjectCacher.cc: 834: FAILED assert(ob->last_commit_tid < tid) during librbd_fsx

Added by Sage Weil over 11 years ago. Updated about 11 years ago.

Status:
Resolved
Priority:
High
Assignee:
Category:
OSD
Target version:
-
% Done:
0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Old symptom, presumably a new bug.

2012-12-16T04:22:17.548 INFO:teuthology.orchestra.run.err:osdc/ObjectCacher.cc: In function 'void ObjectCacher::bh_write_commit(int64_t, sobject_t, loff_t, uint64_t, tid_t, int)' thread 7f6b6ffff700 time 2012-12-16 04:21:45.199523
2012-12-16T04:22:17.553 INFO:teuthology.orchestra.run.err:osdc/ObjectCacher.cc: 834: FAILED assert(ob->last_commit_tid < tid)
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: ceph version 0.55-304-g4bf9078 (4bf9078286d58c2cd4e85cb8b31411220a377092)
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x7f6b82f1ed0d]
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: 2: (ObjectCacher::bh_write_commit(long, sobject_t, long, unsigned long, unsigned long, int)+0xda7) [0x7f6b83d58cef]
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: 3: (ObjectCacher::C_WriteCommit::finish(int)+0x6e) [0x7f6b83d6615c]
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: 4: (Context::complete(int)+0x2b) [0x7f6b83d17913]
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: 5: (librbd::C_Request::finish(int)+0x142) [0x7f6b83d504ec]
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: 6: (Context::complete(int)+0x2b) [0x7f6b83d17913]
2012-12-16T04:22:17.554 INFO:teuthology.orchestra.run.err: 7: (librbd::AioRequest::complete(int)+0x6f) [0x7f6b83d179a3]
2012-12-16T04:22:17.555 INFO:teuthology.orchestra.run.err: 8: (librbd::rados_req_cb(void*, void*)+0x34) [0x7f6b83d4632e]
2012-12-16T04:22:17.555 INFO:teuthology.orchestra.run.err: 9: (librados::C_AioSafe::finish(int)+0x4d) [0x7f6b82ea94b1]
2012-12-16T04:22:17.555 INFO:teuthology.orchestra.run.err: 10: (Finisher::finisher_thread_entry()+0x342) [0x7f6b82f1de7e]
2012-12-16T04:22:17.555 INFO:teuthology.orchestra.run.err: 11: (Finisher::FinisherThread::entry()+0x1c) [0x7f6b82e92704]
2012-12-16T04:22:17.555 INFO:teuthology.orchestra.run.err: 12: (Thread::_entry_func(void*)+0x23) [0x7f6b830bf979]
2012-12-16T04:22:17.556 INFO:teuthology.orchestra.run.err: 13: (()+0x7e9a) [0x7f6b81e93e9a]
2012-12-16T04:22:17.556 INFO:teuthology.orchestra.run.err: 14: (clone()+0x6d) [0x7f6b8219b4bd]
2012-12-16T04:22:17.556 INFO:teuthology.orchestra.run.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
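
The failed check, assert(ob->last_commit_tid < tid), reads as an ordering invariant: each write commit for an object is expected to carry a tid strictly greater than the last one committed for that object. The following is only a minimal Python sketch of that invariant, not the ObjectCacher code. The class and method names are hypothetical, and the starting value of 0 and the update after a successful check are assumptions made for illustration.

```python
class CommitOrderError(Exception):
    """Raised when a commit arrives with a tid that is not newer than the last one."""


class ObjectCommitTracker:
    """Hypothetical stand-in for the per-object state the assert inspects."""

    def __init__(self):
        # Assumption: no commit has been seen yet, so start below any real tid.
        self.last_commit_tid = 0

    def write_commit(self, tid):
        # Same comparison as the failed check: last_commit_tid < tid.
        if not self.last_commit_tid < tid:
            raise CommitOrderError(
                f"commit tid {tid} arrived after last_commit_tid {self.last_commit_tid}"
            )
        # Assumption: the newest committed tid is remembered for the next check.
        self.last_commit_tid = tid
```

Under these assumptions, committing tid 1 and then tid 4 succeeds, while a later commit with tid 1 raises CommitOrderError, which corresponds to the condition the assert rejects.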

The job was:
ubuntu@teuthology:/a/sage-2012-12-15_21:09:08-regression-next-testing-basic/16500$ cat orig.config.yaml 
kernel:
  kdb: true
  sha1: ec18aeecd4de479601363849d489668d8f12410c
nuke-on-error: true
overrides:
  ceph:
    conf:
      client:
        rbd cache: true
      global:
        ms inject socket failures: 5000
    fs: ext4
    log-whitelist:
    - slow request
    sha1: 4bf9078286d58c2cd4e85cb8b31411220a377092
  s3tests:
    branch: next
  workunit:
    sha1: 4bf9078286d58c2cd4e85cb8b31411220a377092
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
tasks:
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rbd_fsx:
    clients:
    - client.0
    ops: 2000


Related issues 1 (0 open, 1 closed)

Has duplicate Ceph - Bug #3444: Qemu librbd aio (Duplicate, 11/05/2012)

Actions #1

Updated by Ian Colle over 11 years ago

  • Assignee changed from Sage Weil to Samuel Just
Actions #2

Updated by Sage Weil over 11 years ago

  • Assignee changed from Samuel Just to Josh Durgin
Actions #3

Updated by Sage Weil over 11 years ago

  • Priority changed from Urgent to High
Actions #4

Updated by Tamilarasi muthamizhan over 11 years ago

ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28662

2012-12-27T20:59:22.673 INFO:teuthology.task.thrashosds.thrasher:Adding osd 1
2012-12-27T20:59:22.673 DEBUG:teuthology.orchestra.run:Running: 'LD_LIBRARY_PRELOAD=/tmp/cephtest/binary/usr/local/lib /tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage /tmp/cephtest/binary/usr/local/bin/ceph -k /tmp/cephtest/ceph.keyring -c /tmp/cephtest/ceph.conf --concise osd in 1'
2012-12-27T20:59:25.142 INFO:teuthology.task.rados.rados.0.out:finishing write tid 4 to plana4522453-42
2012-12-27T20:59:25.142 INFO:teuthology.task.rados.rados.0.out:finishing write tid 1 to plana4522453-42
2012-12-27T20:59:25.143 INFO:teuthology.task.rados.rados.0.err:Error: finished tid 1 when last_acked_tid was 4
2012-12-27T20:59:25.187 INFO:teuthology.task.rados.rados.0.err:./test/osd/RadosModel.h: In function 'virtual void WriteOp::_finish(TestOp::CallbackInfo*)' thread 7fc8b3fff700 time 2012-12-27 20:58:40.325822
2012-12-27T20:59:25.187 INFO:teuthology.task.rados.rados.0.err:./test/osd/RadosModel.h: 823: FAILED assert(0)
2012-12-27T20:59:25.187 INFO:teuthology.task.rados.rados.0.err: ceph version 0.55.1-362-gc0fe381 (c0fe3815567e7d89b0abc557eb72cb6a831540c8)
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x95) [0x7fc8c682f07d]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 2: (WriteOp::_finish(TestOp::CallbackInfo*)+0x143) [0x446adb]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 3: (TestOp::finish(TestOp::CallbackInfo*)+0x2e) [0x4665b8]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 4: (write_callback(void*, void*)+0x42) [0x466627]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 5: (librados::C_AioComplete::finish(int)+0x4d) [0x7fc8c67b9731]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 6: (Finisher::finisher_thread_entry()+0x342) [0x7fc8c682e1ee]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 7: (Finisher::FinisherThread::entry()+0x1c) [0x7fc8c67a2a74]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 8: (Thread::_entry_func(void*)+0x23) [0x7fc8c69cf91d]
2012-12-27T20:59:25.188 INFO:teuthology.task.rados.rados.0.err: 9: (()+0x7e9a) [0x7fc8c5e5be9a]
2012-12-27T20:59:25.189 INFO:teuthology.task.rados.rados.0.err: 10: (clone()+0x6d) [0x7fc8c56734bd]
2012-12-27T20:59:25.189 INFO:teuthology.task.rados.rados.0.err: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
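
The two "finishing write tid" lines above show tid 1 completing after tid 4 on the same object. A rough way to scan a teuthology log for this pattern is sketched below. The function name and the regular expression are hypothetical, and the sketch assumes tids are compared per object name, which may not match how the test harness actually scopes them.

```python
import re

# Matches lines such as "...rados.0.out:finishing write tid 4 to plana4522453-42".
FINISH_RE = re.compile(r"finishing write tid (\d+) to (\S+)")


def find_out_of_order_writes(lines):
    """Return (object, tid, previous_tid) for each write completion whose tid
    is not greater than the highest tid already seen for the same object."""
    highest_tid = {}
    violations = []
    for line in lines:
        match = FINISH_RE.search(line)
        if match is None:
            continue
        tid, obj = int(match.group(1)), match.group(2)
        prev = highest_tid.get(obj)
        if prev is not None and tid <= prev:
            violations.append((obj, tid, prev))
        else:
            # Only advance the high-water mark on an in-order completion.
            highest_tid[obj] = tid
    return violations


sample = [
    "2012-12-27T20:59:25.142 INFO:teuthology.task.rados.rados.0.out:finishing write tid 4 to plana4522453-42",
    "2012-12-27T20:59:25.142 INFO:teuthology.task.rados.rados.0.out:finishing write tid 1 to plana4522453-42",
]
print(find_out_of_order_writes(sample))  # [('plana4522453-42', 1, 4)]
```

Run against the two sample lines, which are copied from the log above, the function reports one violation for plana4522453-42, consistent with the "finished tid 1 when last_acked_tid was 4" error.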

ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28662$ cat summary.yaml 
ceph-sha1: c0fe3815567e7d89b0abc557eb72cb6a831540c8
client.0-kernel-sha1: ec18aeecd4de479601363849d489668d8f12410c
description: collection:rados-thrash clusters:6-osd-3-machine.yaml fs:xfs.yaml msgr-failures:few.yaml
  thrashers:default.yaml workloads:snaps-few-objects.yaml
duration: 1840.5642039775848
failure_reason: 'Command crashed: ''CEPH_CLIENT_ID=0 CEPH_CONF=/tmp/cephtest/ceph.conf
  LD_LIBRARY_PATH=/tmp/cephtest/binary/usr/local/lib /tmp/cephtest/enable-coredump
  /tmp/cephtest/binary/usr/local/bin/ceph-coverage /tmp/cephtest/archive/coverage
  /tmp/cephtest/binary/usr/local/bin/testrados --op read 100 --op write 100 --op delete
  50 --op snap_create 50 --op snap_remove 50 --op rollback 0 --op setattr 0 --op rmattr
  0 --op watch 0 --max-ops 4000 --objects 50 --max-in-flight 16 --size 4000000 --min-stride-size
  400000 --max-stride-size 800000 --max-seconds 0'''
flavor: basic
mds.a-kernel-sha1: ec18aeecd4de479601363849d489668d8f12410c
mon.a-kernel-sha1: ec18aeecd4de479601363849d489668d8f12410c
owner: scheduled_teuthology@teuthology
success: false
ubuntu@teuthology:/a/teuthology-2012-12-27_19:00:03-regression-next-testing-basic/28662$ cat config.yaml
kernel: &id001
  kdb: true
  sha1: ec18aeecd4de479601363849d489668d8f12410c
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        ms inject socket failures: 5000
    fs: xfs
    log-whitelist:
    - slow request
    sha1: c0fe3815567e7d89b0abc557eb72cb6a831540c8
  s3tests:
    branch: next
  workunit:
    sha1: c0fe3815567e7d89b0abc557eb72cb6a831540c8
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana21.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCWF1ovToj1jAHkUM4IKQ3d0Bn2AzA38jBEZxCcRKBqNq//804bDbt2VOa+5UvmDdcHevcwEHvcMqnAlTh5fb0e8qCadYuJhFFF+8yfAsujTtRpvGqkP5AIe90iNspLizCeHmS7Ej8USlWmFAyUVQ6tvSkiLFJcIDf64y+mkdVdnd0WQ2ArNV9mffq/OfHhmUSVXg9+OnxW1gQMWdMcjDu99VIQUbuZASeH0wwhGvlNHBWseHHUyYuNRbdoe2KF8v41gd3D+pBPCuHcA+QqKgJkUNt4Na+dayC8M5Q44ehJ+L8067385ZN9buTnJbZRIezXY5uw3ANJxWMogNK9m39t
  ubuntu@plana44.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDYE0eu9E8TQwtUy89Wldp54VbNBEoO9XQf77eXXzzmNwYUFRrNX0mZV/I8GqyRJuMrPG8V4aZBthBHTtnEmQ6RAS7fVdthi/hEgwnM9cAqY3KX9mR5xJnHBc/fa5KLrnSr3Wrztf42PpQNEN5Tk55K6wWUlZOTHU3vE0j3kF+YQ5FeBhQbghztHPKFR8bOmZJp9TpbXgbvEM2RWr9bYtro1KuQOgrairyVVNWdAuwZuxSQT4soyHoSkY9JmeXKsNRAOamxH9w57mDC3PXui7r6Fp8OCWSK+GmlLTtPaZtulSCcucaZtpVae7F4s9JNxaRl5RxuUtwMRfgAHGlL2BZv
  ubuntu@plana45.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDp3cwfZhOipCot6NiKX4cRMn4zx43QY0+5HdqzCQU2y7OrOJt3d0qvifnZPyeq8/d+aW2WL2OM8m4taz380JsP0SLmlpY8D0pGY/tN0pQDqIFd8EboMtKY6tR8unQrVzuczMqup/tkKSfdRp0zAeTiJ8qH7l9MaVcOw6WfRACb8f7APJE2gVRBrzPAdbqKzAphTRzZSz0cq722AX7XQDPT2dz7NoTp5Tk7xaQdDu2II+78B1H27IWdyYeonfy17yf9N+IA2Xzna/g5zu8apg7UvzyFmHunLyjr78dhPtR39201A0QJ5x5Qli9/UaB3LwiqnbCiGfx4xWFazdUFzxiD
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- rados:
    clients:
    - client.0
    objects: 50
    op_weights:
      delete: 50
      read: 100
      snap_create: 50
      snap_remove: 50
      snap_rollback: 50
      write: 100
    ops: 4000

Actions #5

Updated by Sage Weil about 11 years ago

  • Status changed from In Progress to Resolved