Bug #2956

closed

osd: FAILED assert(waiting_for_ondisk.begin()->first == repop->v)

Added by Tamilarasi muthamizhan over 11 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Urgent
Category: -
% Done: 0%
Source: Q/A

Description

Logs: ubuntu@teuthology:/a/teuthology-2012-08-15_19:00:16-regression-master-testing-gcov/1878

2012-08-15 21:35:20.030298 7ff87c176700 -1 osd/ReplicatedPG.cc: In function 'void ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)' thread 7ff87c176700 time 2012-08-15 21:35:19.853043
osd/ReplicatedPG.cc: 3547: FAILED assert(waiting_for_ondisk.begin()->first == repop->v)

 ceph version 0.50-182-g08b8bba (commit:08b8bba433e6471eb76b3ed8dd6b23fbbf796af3)
 1: (ReplicatedPG::eval_repop(ReplicatedPG::RepGather*)+0x397) [0x56f857]
 2: (ReplicatedPG::repop_ack(ReplicatedPG::RepGather*, int, int, int, eversion_t)+0x21c) [0x57192c]
 3: (ReplicatedPG::sub_op_modify_reply(std::tr1::shared_ptr<OpRequest>)+0x22a) [0x572a7a]
 4: (ReplicatedPG::do_sub_op_reply(std::tr1::shared_ptr<OpRequest>)+0x84) [0x5ca5a4]
 5: (PG::do_request(std::tr1::shared_ptr<OpRequest>)+0x404) [0x6f4794]
 6: (OSD::dequeue_op(PG*)+0x304) [0x611f94]
 7: (OSD::OpWQ::_process(PG*)+0x15) [0x678445]
 8: (ThreadPool::WorkQueue<PG>::_void_process(void*)+0x12) [0x66ea32]
 9: (ThreadPool::worker()+0x4db) [0x8f396b]
 10: (ThreadPool::WorkThread::entry()+0x15) [0x66ff65]
 11: (Thread::_entry_func(void*)+0x12) [0x8e6412]
 12: (()+0x7e9a) [0x7ff88c7ebe9a]
 13: (clone()+0x6d) [0x7ff88ada04bd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
ubuntu@teuthology:/a/teuthology-2012-08-15_19:00:16-regression-master-testing-gcov/1878$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
nuke-on-error: true
overrides:
  ceph:
    coverage: true
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: 08b8bba433e6471eb76b3ed8dd6b23fbbf796af3
  workunit:
    sha1: 08b8bba433e6471eb76b3ed8dd6b23fbbf796af3
roles:
- - mon.a
  - mon.c
  - osd.0
  - osd.1
  - osd.2
- - mon.b
  - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana25.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDuXajaQgHe9XnbLOzI8WWFYVz6+TnOiTzbkIJPGOZpzQEjnUtJraQIEt5ABSeovMjiEj+V4XvunfyuSmEd0H9giRSyjmCHTPGlpndfTeCdVtCBpNqf5GkUqHaEY1Hp57XPbya2rGlwtFm0NeIDYx6pfkejKnsTOUqwhgUb6950TRhjHQhMjFgyALSyfAm/4y6vGZfjm57+yyih6XgDkqWiiQ6Y/aJVR2n+iCzvqEzV7JSCU+Brn+k8IQLHho1fadYqc5PjYct5BaVlHcP6c+T8nJE/DvqGwZ4gQaVJcuWJiDfLOPPYo1g/0AFicxauLwVNJ6HFR9FjLLGtGU+2DcVN
  ubuntu@plana36.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC9Ru6XkJBGiUQK9AtlFt82TzpaWuKams26i0FcItt3hbniR1yxpWVHM3dQI5Gft3liumnOD+cPZiZJzGYyj2KDBCZ8G9V65YqCbzO+moJmv5wDWKg1pEIIW040aLrlsOPbZlEL7htT14MHTTstyTQCOLkrySCpexwYrA2wQBhsHc7pxL+XLa+WM1zTXSQe6QrS8iYxITGRibEMSjcXlOuLFnst42O6o4WQHd31WS9pbniBmso7KVgTFxmcN5rvEo1YAJJYwVxGfmorWrXan1ULY6CksasatbCuohmVNNZfsnE8KdyYsPYCbKIPp9NnmBL3Pp/oPqqyPsj36Wgj5e4/
  ubuntu@plana37.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQDrxOb9f5/SfItd83HOnLVyJRnfji0fbdvL+3T82akjV6J4s/nyR8Bu+rpXbyUwu2BRDoxK4pT2dBqw86meq1qbU5Q1ypWBSH41MYGd213fy0g8YibFiYVGmXFCSwtY8X2Pet9vtLDoYvtnsgNI8djy5GPkQyZFKSszJHznZvQU10NWfM6RfxxtsBKXC/aot4QXb3GIym2/EmeuTAAef6p98dd15P9l9HQkpwXZLwiDZ53IbU79CTINo5HTD/6+1XHUcjb1OUKzQMx1jU485gW6IlsR0G0jJKSv+YEu4zSxxva7gWt1AYxGo2jhNDffEGLsNurzXFf9yeYshCTAszLf
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds: null
- kclient: null
- workunit:
    clients:
      all:
      - suites/ffsb.sh
ubuntu@teuthology:/a/teuthology-2012-08-15_19:00:16-regression-master-testing-gcov/1878$ cat summary.yaml 
ceph-sha1: 08b8bba433e6471eb76b3ed8dd6b23fbbf796af3
client.0-kernel-sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
description: collection:kernel-thrash clusters:fixed-3.yaml fs:btrfs.yaml thrashers:default.yaml
  workloads:kclient_workunit_suites_ffsb.yaml
duration: 824.2622230052948
failure_reason: 'Command failed with status 1: ''/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage
  /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper term /tmp/cephtest/binary/usr/local/bin/ceph-osd
  -f -i 2 -c /tmp/cephtest/ceph.conf'''
flavor: gcov
mon.a-kernel-sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
mon.b-kernel-sha1: 1fe5e9932156f6122c3b1ff6ba7541c27c86718c
owner: scheduled_teuthology@teuthology
success: false
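
For triage, the daemon whose command failed can be read off the failure_reason above: the ceph-osd invocation carries `-i 2`. The sketch below pulls that id out of the string. It is not part of teuthology; `crashed_osd_id` is a hypothetical helper, and it assumes the id is always passed as `-i <id>` after the `ceph-osd` binary, as it is in this run.

```python
import re


def crashed_osd_id(failure_reason):
    """Return the id given to ceph-osd via -i, or None if no such command is found.

    Hypothetical helper for illustration only; assumes the '-i <id>' form
    seen in the failure_reason of this run.
    """
    # Collapse the YAML line folding so the command reads as a single line.
    command = " ".join(failure_reason.split())
    match = re.search(r"ceph-osd\b.*?\s-i\s+(\S+)", command)
    return match.group(1) if match else None


# failure_reason copied from the summary.yaml above.
reason = (
    "Command failed with status 1: '/tmp/cephtest/enable-coredump "
    "/tmp/cephtest/binary/usr/local/bin/ceph-coverage "
    "/tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper term "
    "/tmp/cephtest/binary/usr/local/bin/ceph-osd "
    "-f -i 2 -c /tmp/cephtest/ceph.conf'"
)
print(crashed_osd_id(reason))
```

The id only identifies which daemon's command exited non-zero; the assert itself still has to be confirmed from that OSD's log, as in the backtrace above.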

Related issues 1 (0 open, 1 closed)

Is duplicate of Ceph - Bug #3072: osd/ReplicatedPG.cc: 3548: FAILED assert(waiting_for_ondisk.begin()->first == repop->v) Resolved 09/04/2012

Actions #1

Updated by Sage Weil over 11 years ago

  • Priority changed from Normal to Urgent
ubuntu@teuthology:/a/sage-2012-08-20_09:17:16-rados-master-testing-next-basic$ cat 5116/summary.yaml 
ceph-sha1: cfe211af138db2d309a8691d8629c5c12926a6f1
client.0-kernel-sha1: dff193ce4b08151b6d01fc99491b571c61efd44d
description: collection:thrash clusters:6-osd-3-machine.yaml fs:btrfs.yaml msgr-failures:few.yaml
  thrashers:default.yaml workloads:radosbench.yaml
duration: 2956.7149050235748
failure_reason: 'Command failed with status 1: ''/tmp/cephtest/enable-coredump /tmp/cephtest/binary/usr/local/bin/ceph-coverage
  /tmp/cephtest/archive/coverage /tmp/cephtest/daemon-helper kill /tmp/cephtest/binary/usr/local/bin/ceph-osd
  -f -i 0 -c /tmp/cephtest/ceph.conf'''
flavor: basic
mds.a-kernel-sha1: dff193ce4b08151b6d01fc99491b571c61efd44d
mon.a-kernel-sha1: dff193ce4b08151b6d01fc99491b571c61efd44d
owner: scheduled_sage@metropolis
success: false
ubuntu@teuthology:/a/sage-2012-08-20_09:17:16-rados-master-testing-next-basic$ cd 5116
ubuntu@teuthology:/a/sage-2012-08-20_09:17:16-rados-master-testing-next-basic/5116$ cat config.yaml 
kernel: &id001
  kdb: true
  sha1: dff193ce4b08151b6d01fc99491b571c61efd44d
nuke-on-error: true
overrides:
  ceph:
    conf:
      global:
        debug ms: 20
        ms inject socket failures: 5000
    fs: btrfs
    log-whitelist:
    - slow request
    sha1: cfe211af138db2d309a8691d8629c5c12926a6f1
  workunit:
    sha1: cfe211af138db2d309a8691d8629c5c12926a6f1
roles:
- - mon.a
  - osd.0
  - osd.1
  - osd.2
- - mds.a
  - osd.3
  - osd.4
  - osd.5
- - client.0
targets:
  ubuntu@plana47.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCUMS+/Rfo92n0pY5cDrv+M9lss9i6+Zum4aa4aE54KsOcKkl+6yooZcZL8bllGLVL1W7BkaBOJ59dQwTVIo/UAgiKyA4J5IVwBPjwNNp4/mXzKJtKQPj0UrTCKsQrKasWPC+FVRzqJRK70cgC5D40znuopmfmENoPwCniOJALFCw3q8XLkcq1SH0jzDXJdsrnTVGxwRHYq9cF9J7fr6XZQXuAk7XO3jG1eqlF8xljmkvI0Ftux50TkOsDzpkscD5jHkxiFj/gkO2KR5GNbybdnxllHBAYuv2hoxrsW2oyIxbeforwZFV0DcDhRReRTx8BhXZ0o5erZgPgzS+ZbfWol
  ubuntu@plana75.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQC/sBKIbaWlkUEbStD0wYVUj2aEuiP8WB0B4h4oyzOJaWaKSTPAK2hzAxEDVOkG1JhpR2JrfXitDtA7MW48NvP77Ov/EvOnTHBeTE7mvWL0D2d4/YUoqhF+RLojHgFNOE0FsVEc/2rhARYX9/4VL5YQ1kaE4dKeRqLxn/eA6BoW5+NDbdQ1Bt6qWNSTXYC2qs09do6wUXHbB+KE1Obay4QTGf77QA+ueVnAnKmYym5c5kGMqb7DD+I/OZyUcOWTCQ4sDpo2nh0GpHATqAAWXeFMSpJ0sVQmR5ByTpKsoRV3QxmxlNHBJVDrBoGbw7O0z8AisuwOfqzrOO5M3Q+16Gen
  ubuntu@plana79.front.sepia.ceph.com: ssh-rsa AAAAB3NzaC1yc2EAAAADAQABAAABAQCx0nVMVy140vXGRPqjqx63mfytPlqmoN7YoJ3Si0ti1XtvJTftB9EdQGwqj/tsY95DeUNBtAQs5TBsiLr1E/JHlKt7EXwyWsJNB2ntvkPJOMxoounypjkVgfv91EWmERQGFsalDmIYjSuSCG28g5Vaz8il9D7fH/ykKZ38EQChhPXIpB2bieJOr2Xm6llde1q2rUEltV17EmiQvu9eUuxb9y9h057k6GSqpsTViPADlT7CG7W60bqWs8d7TvV4rvPhUy6oyUp1ar8116NMSFUiaTgVTidDiQ3xyZeguwJAbzh86MQdHVhSi89W/vjvoEP1opjZP3RArB4BoNwzz/Dh
tasks:
- internal.lock_machines: 3
- internal.save_config: null
- internal.check_lock: null
- internal.connect: null
- internal.check_conflict: null
- kernel: *id001
- internal.base: null
- internal.archive: null
- internal.coredump: null
- internal.syslog: null
- internal.timer: null
- chef: null
- clock: null
- ceph:
    log-whitelist:
    - wrongly marked me down
    - objects unfound and apparently lost
- thrashosds:
    timeout: 1200
- radosbench:
    clients:
    - client.0
    time: 1800
Actions #2

Updated by Sage Weil over 11 years ago

  • Assignee set to Sage Weil
Actions #3

Updated by Sage Weil over 11 years ago

  • Status changed from New to 7
Actions #4

Updated by Sage Weil over 11 years ago

  • Target version set to v0.51
Actions #5

Updated by Sage Weil over 11 years ago

  • Target version changed from v0.51 to 83
Actions #6

Updated by Sage Weil over 11 years ago

  • Status changed from 7 to Resolved
Actions #7

Updated by Sage Weil over 11 years ago

  • Target version changed from 83 to v0.52a