Bug #13969

TestLibRBD.ExclusiveLockTransition Journal.cc: 639: FAILED assert(m_events.empty())

Added by Loic Dachary over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Target version: -
Start date: 12/03/2015
Due date:
% Done: 0%
Source: other
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

[ RUN      ] TestLibRBD.ExclusiveLockTransition
using new format!
librbd/Journal.cc: In function 'void librbd::Journal::handle_lock_updated(librbd::ImageWatcher::LockUpdateState)' thread 7fd9a9538700 time 2015-12-03 03:40:44.447992
librbd/Journal.cc: 639: FAILED assert(m_events.empty())
 ceph version 9.2.0-1258-g11dde47 (11dde47ed633916d0eb6a4719ebc53cea952676b)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7fd9bbdfbf9b]
 2: (librbd::Journal::handle_lock_updated(librbd::ImageWatcher::LockUpdateState)+0x35e) [0x7fd9bbcddf2e]
 3: (librbd::ImageWatcher::notify_listeners_updated_lock(librbd::ImageWatcher::LockUpdateState)+0x7d) [0x7fd9bbcaa30d]
 4: (librbd::ImageWatcher::release_lock()+0x22b) [0x7fd9bbcaf12b]
 5: (librbd::ImageWatcher::notify_release_lock()+0x5d) [0x7fd9bbcaf60d]
 6: (FunctionContext::finish(int)+0x1a) [0x7fd9bbba40da]
 7: (Context::complete(int)+0x9) [0x7fd9bbb9e8e9]
 8: (Context::complete(int)+0x9) [0x7fd9bbb9e8e9]
 9: (Finisher::finisher_thread_entry()+0x206) [0x7fd9bbddcc36]
 10: (()+0x7df5) [0x7fd9babf6df5]
 11: (clone()+0x6d) [0x7fd9b9cf91ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'

Related issues

Related to rbd - Bug #13976: TestMockOperationSnapshotProtectRequest.Success include/xlist.h: 77: FAILED assert(_size == 0) Duplicate 12/03/2015

Associated revisions

Revision 91f01bdb (diff)
Added by Jason Dillaman over 3 years ago

librbd: partial revert of commit 9b0e359

Fixes: #13969
Signed-off-by: Jason Dillaman <>

Revision fde9f785 (diff)
Added by Jason Dillaman over 3 years ago

librbd: do not complete AIO callbacks within caller's thread context

Avoid rare, racy issues when individual requests associated with an AIO
completion can complete prior to marking the completion as ready-to-fire.
Pre-calculate the expected number of individual requests to avoid the
potential re-entrant callback.

Fixes: #13969
Signed-off-by: Jason Dillaman <>
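
The idea in the commit message above (fix the number of expected requests up front so that the completion never fires re-entrantly from the issuing thread) can be illustrated with a minimal sketch. This is not the librbd code: `SketchCompletion`, `set_request_count`, `complete_request` and the demo function are hypothetical names introduced only for illustration, and the sketch assumes the caller knows the request count before issuing anything.

```cpp
#include <atomic>
#include <cassert>
#include <functional>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for an AIO completion; NOT the librbd class.
class SketchCompletion {
public:
  explicit SketchCompletion(std::function<void()> callback)
    : m_callback(std::move(callback)) {}

  // Assumption: the caller knows the number of sub-requests before
  // issuing any of them, so the count is fixed up front.
  void set_request_count(int count) {
    assert(count > 0);
    m_pending.store(count);
  }

  // Invoked by each sub-request on its own (worker) thread.
  void complete_request() {
    // fetch_sub returns the previous value: 1 means this was the last one.
    if (m_pending.fetch_sub(1) == 1) {
      m_callback();
    }
  }

private:
  std::function<void()> m_callback;
  std::atomic<int> m_pending{0};
};

// Issues four sub-requests on worker threads and reports whether the
// callback fired exactly once and not on the calling thread.
bool demo_completion_fires_once_off_caller() {
  const std::thread::id caller = std::this_thread::get_id();
  std::thread::id fired_on;
  int fired = 0;
  SketchCompletion completion([&] {
    fired_on = std::this_thread::get_id();
    ++fired;
  });

  const int request_count = 4;
  completion.set_request_count(request_count);

  std::vector<std::thread> workers;
  for (int i = 0; i < request_count; ++i) {
    workers.emplace_back([&completion] { completion.complete_request(); });
  }
  for (auto& worker : workers) {
    worker.join();
  }
  return fired == 1 && fired_on != caller;
}
```

Because the count is set before any request is issued, there is no separate "finished adding requests" step in the caller that could find every request already complete and invoke the callback itself. The callback always runs on whichever worker finishes last.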

Revision 6cbf128b (diff)
Added by Jason Dillaman over 3 years ago

tests: wait for mocked requests to complete

Fixes: #13969
Signed-off-by: Jason Dillaman <>
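
The commit title above gives no detail, so the following is only a guess at the general pattern: a test that tears down its context while asynchronous requests are still registered can trip an assertion such as the `xlist` destructor check reported in this ticket, and one common remedy is to block until the outstanding count reaches zero. `RequestTracker` and its methods are hypothetical names, not the Ceph test helpers.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical tracker of in-flight requests; NOT a Ceph class.
class RequestTracker {
public:
  void start_request() {
    std::lock_guard<std::mutex> locker(m_lock);
    ++m_outstanding;
  }

  void finish_request() {
    std::lock_guard<std::mutex> locker(m_lock);
    if (--m_outstanding == 0) {
      m_cond.notify_all();
    }
  }

  // Blocks until every started request has finished.
  void wait_for_requests() {
    std::unique_lock<std::mutex> locker(m_lock);
    m_cond.wait(locker, [this] { return m_outstanding == 0; });
  }

  int outstanding() {
    std::lock_guard<std::mutex> locker(m_lock);
    return m_outstanding;
  }

private:
  std::mutex m_lock;
  std::condition_variable m_cond;
  int m_outstanding = 0;
};

// Starts three simulated requests, waits for them to drain, and reports
// whether nothing was left outstanding at the point of teardown.
bool demo_wait_before_teardown() {
  RequestTracker tracker;
  const int request_count = 3;
  std::vector<std::thread> workers;
  for (int i = 0; i < request_count; ++i) {
    tracker.start_request();  // registered before the worker runs
    workers.emplace_back([&tracker] {
      std::this_thread::sleep_for(std::chrono::milliseconds(10));
      tracker.finish_request();
    });
  }

  tracker.wait_for_requests();
  const bool drained = tracker.outstanding() == 0;

  for (auto& worker : workers) {
    worker.join();
  }
  return drained;
}
```

Each request is registered on the issuing thread before its worker starts, so `wait_for_requests` cannot return early while a request has yet to be counted.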

History

#1 Updated by Loic Dachary over 3 years ago

@Jason, this happened today during make check on an unrelated pull request. If it does not ring a bell, feel free to unassign and mark it "Need more info". I'll collect similar failures, if any, or close it in a month with "Can't reproduce".

#4 Updated by Jason Dillaman over 3 years ago

  • Status changed from New to Need Review

#6 Updated by Loic Dachary over 3 years ago

  • Status changed from Need Review to Resolved

#7 Updated by Loic Dachary over 3 years ago

  • Status changed from Resolved to Verified

Re-opening: the failure seems to have resurfaced since the fix, in variant forms.

http://jenkins.ceph.dachary.org/job/ceph/LABELS=centos-7&&x86_64/9828/console

[ RUN      ] DiffIterateTest/1.DiffIterateParentDiscard
using new format!
common/lockdep.cc: In function 'int lockdep_will_lock(const char*, int, bool)' thread 7f7faaa12700 time 2015-12-04 01:17:23.445460
common/lockdep.cc: 278: FAILED assert(0)
 ceph version 9.2.0-1313-g0ef97f6 (0ef97f662dda654023a940938deb24ba78714859)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f7fba2fde8b]
 2: (lockdep_will_lock(char const*, int, bool)+0xecc) [0x7f7fba37d30c]
 3: (librbd::CopyupRequest::send_object_map()+0x567) [0x7f7fba1a4c37]
 4: (librbd::CopyupRequest::should_complete(int)+0x6f8) [0x7f7fba1a55e8]
 5: (librbd::CopyupRequest::complete(int)+0x10) [0x7f7fba1a5690]
 6: (FunctionContext::finish(int)+0x1a) [0x7f7fba05d3ca]
 7: (Context::complete(int)+0x9) [0x7f7fba0571a9]
 8: (librbd::rbd_ctx_cb(void*, void*)+0x1e) [0x7f7fba1cb62e]
 9: (librbd::AioCompletion::complete(CephContext*)+0x15f) [0x7f7fba19167f]
 10: (librbd::AioCompletion::finish_adding_requests(CephContext*)+0x92) [0x7f7fba1922e2]
 11: (librbd::AioImageRead::send_request()+0x6b0) [0x7f7fba195bd0]
 12: (librbd::AioImageRequest::send()+0xc9) [0x7f7fba194559]
 13: (librbd::AioImageRequest::aio_read(librbd::ImageCtx*, librbd::AioCompletion*, std::vector<std::pair<unsigned long, unsigned long>, std::allocator<std::pair<unsigned long, unsigned long> > > const&, char*, ceph::buffer::list*, int)+0xd6) [0x7f7fba194976]
 14: (librbd::CopyupRequest::send()+0xa6) [0x7f7fba1a3406]
 15: (librbd::AbstractAioObjectWrite::send_copyup()+0x36b) [0x7f7fba19fccb]
 16: (librbd::AbstractAioObjectWrite::handle_write_guard()+0x1f0) [0x7f7fba19ff40]
 17: (librbd::AbstractAioObjectWrite::should_complete(int)+0x814) [0x7f7fba19f234]
 18: (librbd::AioObjectRequest::complete(int)+0x1a) [0x7f7fba19bf5a]
 19: (()+0x4696fc) [0x7f7fba2766fc]
 20: (FunctionContext::finish(int)+0x1a) [0x7f7fba05d3ca]
 21: (Context::complete(int)+0x9) [0x7f7fba0571a9]
 22: (Finisher::finisher_thread_entry()+0x206) [0x7f7fba2deb26]
 23: (()+0x7df5) [0x7f7fb9055df5]
 24: (clone()+0x6d) [0x7f7fb81581ad]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'
./test/run-rbd-unit-tests.sh: line 10: 13614 Aborted                 (core dumped) RBD_FEATURES=$i unittest_librbd

#8 Updated by Loic Dachary over 3 years ago

  • Related to Bug #13976: TestMockOperationSnapshotProtectRequest.Success include/xlist.h: 77: FAILED assert(_size == 0) added

#9 Updated by Loic Dachary over 3 years ago

http://jenkins.ceph.dachary.org/job/ceph/LABELS=centos-7&&x86_64/9832/console

[ RUN      ] TestMockOperationSnapshotRemoveRequest.RemoveChildParentError
./include/xlist.h: In function 'xlist<T>::~xlist() [with T = librbd::AsyncRequest<librbd::MockImageCtx>*]' thread 7f8bccebe6c0 time 2015-12-04 03:33:36.338007
./include/xlist.h: 77: FAILED assert(_size == 0)
 ceph version 9.2.0-1317-g2a0feae (2a0feae485f1e8323581643fab776530cb2233a9)
 1: (ceph::__ceph_assert_fail(char const*, char const*, int, char const*)+0x8b) [0x7f8bcd3bdd8b]
 2: (xlist<librbd::AsyncRequest<librbd::MockImageCtx>*>::~xlist()+0x3c) [0x7f8bcd15607c]
 3: (librbd::MockImageCtx::~MockImageCtx()+0x15d) [0x7f8bcd15c08d]
 4: (librbd::operation::TestMockOperationSnapshotRemoveRequest_RemoveChildParentError_Test::TestBody()+0x39c) [0x7f8bcd17736c]
 5: (void testing::internal::HandleSehExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x65) [0x7f8bcd393dde]
 6: (void testing::internal::HandleExceptionsInMethodIfSupported<testing::Test, void>(testing::Test*, void (testing::Test::*)(), char const*)+0x4b) [0x7f8bcd38ef8a]
 7: (testing::Test::Run()+0xd5) [0x7f8bcd3766cf]
 8: (testing::TestInfo::Run()+0x108) [0x7f8bcd376ec8]
 9: (testing::TestCase::Run()+0xf4) [0x7f8bcd37758c]
 10: (testing::internal::UnitTestImpl::RunAllTests()+0x298) [0x7f8bcd37e054]
 11: (bool testing::internal::HandleSehExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x65) [0x7f8bcd3951c4]
 12: (bool testing::internal::HandleExceptionsInMethodIfSupported<testing::internal::UnitTestImpl, bool>(testing::internal::UnitTestImpl*, bool (testing::internal::UnitTestImpl::*)(), char const*)+0x4b) [0x7f8bcd38fdd2]
 13: (testing::UnitTest::Run()+0xb4) [0x7f8bcd37cc24]
 14: (main()+0xee) [0x7f8bcd116cce]
 15: (__libc_start_main()+0xf5) [0x7f8bcb14aaf5]
 16: (()+0x248fdd) [0x7f8bcd11cfdd]
 NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.
terminate called after throwing an instance of 'ceph::FailedAssertion'

#10 Updated by Jason Dillaman over 3 years ago

  • Status changed from Verified to In Progress

#11 Updated by Jason Dillaman over 3 years ago

  • Status changed from In Progress to Need Review

#12 Updated by Loic Dachary over 3 years ago

  • Status changed from Need Review to Resolved

#13 Updated by Loic Dachary over 3 years ago

@Jason thanks for being so responsive :-)
