Bug #15345

closed

"RWLock.h: 124: FAILED assert(r == 0)" in rados-jewel-distro-basic-smithi

Added by Yuri Weinstein about 8 years ago. Updated over 7 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: hammer
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Run: http://pulpito.ceph.com/teuthology-2016-03-29_22:00:01-rados-jewel-distro-basic-smithi/
Job: 96371
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2016-03-29_22:00:01-rados-jewel-distro-basic-smithi/96371/teuthology.log

2016-03-31T11:45:23.115 INFO:teuthology.orchestra.run.smithi028.stderr:2016-03-31 18:45:23.116624 7f4bcee064c0  1 journal _open store_test_temp_journal fd 49: 419430400 bytes, block size 4096 bytes, directio = 1, aio = 0
2016-03-31T11:45:23.115 INFO:teuthology.orchestra.run.smithi028.stderr:2016-03-31 18:45:23.117291 7f4bcee064c0  1 filestore(store_test_temp_dir) upgrade
2016-03-31T11:45:50.800 INFO:teuthology.orchestra.run.smithi028.stderr:./common/RWLock.h: In function 'void RWLock::get_write(bool)' thread 7f4bc9fec700 time 2016-03-31 18:45:50.799485
2016-03-31T11:45:50.800 INFO:teuthology.orchestra.run.smithi028.stderr:./common/RWLock.h: 124: FAILED assert(r == 0)
2016-03-31T11:45:50.837 INFO:teuthology.orchestra.run.smithi028.stderr: ceph version 10.1.0-265-g88a183b (88a183bd85ed1cab287041d0fa40630e835d4ba0)
2016-03-31T11:45:50.838 INFO:teuthology.orchestra.run.smithi028.stderr: 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x85) [0x7f4bcf3d0ee5]
2016-03-31T11:45:50.838 INFO:teuthology.orchestra.run.smithi028.stderr: 2: (()+0x1e39a6) [0x7f4bcf0069a6]
2016-03-31T11:45:50.838 INFO:teuthology.orchestra.run.smithi028.stderr: 3: (()+0x32c8bb) [0x7f4bcf14f8bb]
2016-03-31T11:45:50.838 INFO:teuthology.orchestra.run.smithi028.stderr: 4: (FileStore::_split_collection(coll_t const&, unsigned int, unsigned int, coll_t, SequencerPosition const&)+0x584) [0x7f4bcf11dfa4]
2016-03-31T11:45:50.838 INFO:teuthology.orchestra.run.smithi028.stderr: 5: (FileStore::_do_transaction(ObjectStore::Transaction&, unsigned long, int, ThreadPool::TPHandle*)+0xf9b) [0x7f4bcf145cdb]
2016-03-31T11:45:50.838 INFO:teuthology.orchestra.run.smithi028.stderr: 6: (FileStore::_do_transactions(std::vector<ObjectStore::Transaction, std::allocator<ObjectStore::Transaction> >&, unsigned long, ThreadPool::TPHandle*)+0x3b) [0x7f4bcf14a3db]
2016-03-31T11:45:50.839 INFO:teuthology.orchestra.run.smithi028.stderr: 7: (FileStore::_do_op(FileStore::OpSequencer*, ThreadPool::TPHandle&)+0x2cd) [0x7f4bcf14a6dd]
2016-03-31T11:45:50.839 INFO:teuthology.orchestra.run.smithi028.stderr: 8: (ThreadPool::worker(ThreadPool::WorkThread*)+0xa7e) [0x7f4bcf3c223e]
2016-03-31T11:45:50.839 INFO:teuthology.orchestra.run.smithi028.stderr: 9: (ThreadPool::WorkThread::entry()+0x10) [0x7f4bcf3c3120]
2016-03-31T11:45:50.839 INFO:teuthology.orchestra.run.smithi028.stderr: 10: (()+0x7dc5) [0x7f4bcd1abdc5]
2016-03-31T11:45:50.839 INFO:teuthology.orchestra.run.smithi028.stderr: 11: (clone()+0x6d) [0x7f4bcc09228d]
2016-03-31T11:45:50.839 INFO:teuthology.orchestra.run.smithi028.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Related issues 1 (0 open, 1 closed)

Copied to Ceph - Backport #17957: hammer: "RWLock.h: 124: FAILED assert(r == 0)" in rados-jewel-distro-basic-smithi (Resolved, Nathan Cutler)
Actions #1

Updated by Samuel Just about 8 years ago

  • Priority changed from Normal to Urgent
Actions #2

Updated by Kefu Chai about 8 years ago

Crashed in ceph_test_objectstore:

#0  0x00007f4bcd1b2fcb in raise () from /lib64/libpthread.so.0
#1  0x00007f4bcf394aa5 in reraise_fatal (signum=6) at global/signal_handler.cc:71
#2  handle_fatal_signal (signum=6) at global/signal_handler.cc:132
#3  <signal handler called>
#4  0x00007f4bcbfd15f7 in raise () from /lib64/libc.so.6
#5  0x00007f4bcbfd2ce8 in abort () from /lib64/libc.so.6
#6  0x00007f4bcf3d10c7 in ceph::__ceph_assert_fail (assertion=assertion@entry=0x7f4bcf51278e "r == 0", file=file@entry=0x7f4bcf501f03 "./common/RWLock.h", line=line@entry=124,
    func=func@entry=0x7f4bcf510360 <_ZZN6RWLock9get_writeEbE19__PRETTY_FUNCTION__> "void RWLock::get_write(bool)") at common/assert.cc:78
#7  0x00007f4bcf0069a6 in RWLock::get_write (lockdep=<optimized out>, this=0x7f4bbc3aa528) at common/RWLock.h:124
#8  0x00007f4bcf14f8bb in RWLock::get_write (this=0x7f4bbc3aa528, lockdep=<optimized out>) at common/RWLock.h:126
#9  0x00007f4bcf11dfa4 in WLocker (lock=..., this=<synthetic pointer>) at common/RWLock.h:183
#10 FileStore::_split_collection (this=this@entry=0x7f4bec7f6550, cid=..., bits=bits@entry=12, rem=rem@entry=2048, dest=..., spos=...) at os/filestore/FileStore.cc:5415
#11 0x00007f4bcf145cdb in FileStore::_do_transaction (this=this@entry=0x7f4bec7f6550, t=..., op_seq=op_seq@entry=4, trans_num=trans_num@entry=0,
    handle=handle@entry=0x7f4bc9feaea0) at os/filestore/FileStore.cc:2800
#12 0x00007f4bcf14a3db in FileStore::_do_transactions (this=this@entry=0x7f4bec7f6550, tls=std::vector of length 1, capacity 1 = {...}, op_seq=4,
    handle=handle@entry=0x7f4bc9feaea0) at os/filestore/FileStore.cc:2109
#13 0x00007f4bcf14a6dd in FileStore::_do_op (this=0x7f4bec7f6550, osr=0x7f4bee07b090, handle=...) at os/filestore/FileStore.cc:1879
#14 0x00007f4bcf3c223e in ThreadPool::worker (this=0x7f4bec7f6ff0, wt=0x7f4befd94810) at common/WorkQueue.cc:128
#15 0x00007f4bcf3c3120 in ThreadPool::WorkThread::entry (this=<optimized out>) at common/WorkQueue.h:440
#16 0x00007f4bcd1abdc5 in start_thread () from /lib64/libpthread.so.0
#17 0x00007f4bcc09228d in clone () from /lib64/libc.so.6

and

(gdb) f 11
#11 0x00007f4bcf145cdb in FileStore::_do_transaction (this=this@entry=0x7f4bec7f6550, t=..., op_seq=op_seq@entry=4, trans_num=trans_num@entry=0,
    handle=handle@entry=0x7f4bc9feaea0) at os/filestore/FileStore.cc:2800
(gdb) p op->cid
$23 = 0
(gdb) p op->dest_cid
$24 = 0
(gdb) thr 10
[Switching to thread 10 (Thread 0x7f4bcee064c0 (LWP 30816))]
#5  0x00007f4bcf07f236 in colsplittest (store=0x7f4bec7f6550, num_objects=10000, common_suffix_size=11) at test/objectstore/store_test.cc:3114
3114        r = apply_transaction(store, &osr, std::move(t));
(gdb) f 5
#5  0x00007f4bcf07f236 in colsplittest (store=0x7f4bec7f6550, num_objects=10000, common_suffix_size=11) at test/objectstore/store_test.cc:3114
(gdb) p *(ObjectStore::Transaction::Op *)0x7f4bd9717410
$142 = {op = 20, cid = 1, oid = 0, off = 0, len = 0, dest_cid = 0, dest_oid = 0, dest_off = 0, hint_type = 0, expected_object_size = 0,
  expected_write_size = 0, split_bits = 12, split_rem = 0}
(gdb) p *(ObjectStore::Transaction::Op *)0x7f4bd9717458
$143 = {op = 36, cid = 0, oid = 0, off = 0, len = 0, dest_cid = 0, dest_oid = 0, dest_off = 0, hint_type = 0, expected_object_size = 0,
  expected_write_size = 0, split_bits = 12, split_rem = 2048}

So both cid and dest_cid are "0" in the split_collection2 op, but the index of "tid" in the second op should have been 1, as the first op shows.

There must be something wrong with Transaction::append().
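
The suspicion above can be illustrated with a toy model. This is only a sketch, not Ceph's actual implementation: `ToyTransaction`, `get_coll_id`, and the `remap` table are hypothetical names, and only the `op`, `cid`, and `dest_cid` fields and the opcode values 20 and 36 are taken from the gdb dump. It assumes each transaction stores collections in a local table and ops refer to them by index, so appending one transaction to another has to translate every index an op carries, including the destination index of a split op.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Toy model only: field names mirror the gdb output, not Ceph's real classes.
struct Op {
  int op;             // opcode (20 and 36 in the dump above)
  uint32_t cid;       // index into the owning transaction's collection table
  uint32_t dest_cid;  // second collection index, used by the split op
};

struct ToyTransaction {
  std::map<std::string, uint32_t> coll_index;  // collection name -> local index
  std::vector<Op> ops;

  // Return the local index for a collection, assigning the next free one.
  uint32_t get_coll_id(const std::string& name) {
    auto it = coll_index.find(name);
    if (it != coll_index.end()) return it->second;
    uint32_t id = static_cast<uint32_t>(coll_index.size());
    coll_index[name] = id;
    return id;
  }

  void create_collection(const std::string& c) {
    ops.push_back(Op{20, get_coll_id(c), 0});
  }

  void split_collection(const std::string& src, const std::string& dest) {
    ops.push_back(Op{36, get_coll_id(src), get_coll_id(dest)});
  }

  // Append another transaction, translating its local indices into ours.
  void append(const ToyTransaction& other) {
    std::map<uint32_t, uint32_t> remap;  // other's index -> our index
    for (const auto& kv : other.coll_index)
      remap[kv.second] = get_coll_id(kv.first);
    for (Op o : other.ops) {
      o.cid = remap.at(o.cid);
      if (o.op == 36)  // the split op also carries a destination index
        o.dest_cid = remap.at(o.dest_cid);  // skipping this leaves a stale index
      ops.push_back(o);
    }
  }
};
```

In this model, if `append()` translated `cid` but skipped `dest_cid`, a split op would keep pointing at whatever collection held that index in the source transaction, which is the kind of mismatch the dump suggests.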

Actions #3

Updated by Kefu Chai about 8 years ago

  • Subject changed from "RWLock.h: 124: FAILED assert(r == 0)" in rados-jewel-distro-basic-smithi to "RWLock.h: 124: FAILED assert(r == 0)" in ceph_test_objectstore
Actions #4

Updated by Kefu Chai about 8 years ago

  • Subject changed from "RWLock.h: 124: FAILED assert(r == 0)" in ceph_test_objectstore to "RWLock.h: 124: FAILED assert(r == 0)" in ceph_test_objectstore (ObjectStore/StoreTest.ColSplitTest1/1)
Actions #5

Updated by Sage Weil about 8 years ago

  • Subject changed from "RWLock.h: 124: FAILED assert(r == 0)" in ceph_test_objectstore (ObjectStore/StoreTest.ColSplitTest1/1) to "RWLock.h: 124: FAILED assert(r == 0)" in rados-jewel-distro-basic-smithi
  • Status changed from New to Resolved

fixed by e3dc7c772f563f97bc68ebc6dc6e0d408e7c11f3

Actions #8

Updated by Nathan Cutler over 7 years ago

  • Backport changed from jewel,hammer to hammer
Actions #9

Updated by Nathan Cutler over 7 years ago

  • Status changed from 12 to Pending Backport
Actions #10

Updated by Nathan Cutler over 7 years ago

  • Copied to Backport #17957: hammer: "RWLock.h: 124: FAILED assert(r == 0)" in rados-jewel-distro-basic-smithi added
Actions #11

Updated by Nathan Cutler over 7 years ago

  • Status changed from Pending Backport to Resolved