Bug #11537
librbd: crash when two clients try to write to an exclusive locked image
Status: Resolved
Priority: Urgent
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source: Development
Tags:
Backport: hammer
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Using two instances of rbd bench-write on the same image at the same time:
term1 $ rbd create --image-feature exclusive-lock -s 100 foo
term1 $ rbd bench-write foo
term2 $ rbd bench-write foo
Results in the first instance crashing:
#0 0x00007fadbc6dae2b in raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/pt-raise.c:41
#1 0x00000000008fac1b in reraise_fatal (signum=6) at global/signal_handler.cc:59
#2 0x00000000008faf74 in handle_fatal_signal (signum=6) at global/signal_handler.cc:109
#3 <signal handler called>
#4 0x00007fadbb125165 in *__GI_raise (sig=<value optimized out>) at ../nptl/sysdeps/unix/sysv/linux/raise.c:64
#5 0x00007fadbb127f70 in *__GI_abort () at abort.c:92
#6 0x00007fadbb9b8dc5 in __gnu_cxx::__verbose_terminate_handler() () from /usr/lib/libstdc++.so.6
#7 0x00007fadbb9b7166 in ?? () from /usr/lib/libstdc++.so.6
#8 0x00007fadbb9b7193 in std::terminate() () from /usr/lib/libstdc++.so.6
#9 0x00007fadbb9b728e in __cxa_throw () from /usr/lib/libstdc++.so.6
#10 0x00007fadbf71a99d in ceph::__ceph_assert_fail (assertion=0x7fadbfab06e3 "r == 0", file=0x7fadbfab06bf "./common/RWLock.h", line=71, func=0x7fadbfab0c10 "void RWLock::get_read() const") at common/assert.cc:77
#11 0x00007fadbf517e77 in RWLock::get_read (this=0x7fadb0001568) at common/RWLock.h:71
#12 0x00007fadbf517fa6 in RWLock::RLocker::RLocker(RWLock const&) () from /home/joshd/ceph/src/.libs/librbd.so.1
#13 0x00007fadbf5168d5 in librbd::AbstractWrite::send_pre (this=0x7fadb24c9a10) at librbd/AioRequest.cc:437
#14 0x00007fadbf516864 in librbd::AbstractWrite::send (this=0x7fadb24c9a10) at librbd/AioRequest.cc:426
#15 0x00007fadbf59a126 in librbd::LibrbdWriteback::write (this=0x7fadb0009440, oid=..., oloc=..., off=1900544, len=1040384, snapc=..., bl=..., mtime=..., trunc_size=0, trunc_seq=0, oncommit=0x7fadb16b7a10) at librbd/LibrbdWriteback.cc:168
#16 0x00007fadbfa4e9af in ObjectCacher::bh_write (this=0x7fadb000b140, bh=0x4b3a8f0) at osdc/ObjectCacher.cc:847
#17 0x00007fadbfa56bd0 in ObjectCacher::flush_set (this=0x7fadb000b140, oset=0x7fadb0009ad0, onfinish=0x7fadb0249600) at osdc/ObjectCacher.cc:1722
#18 0x00007fadbf5378d8 in librbd::ImageCtx::flush_cache_aio (this=0x7fadb00013d0, onfinish=0x7fadb0249600) at librbd/ImageCtx.cc:617
#19 0x00007fadbf537991 in librbd::ImageCtx::flush_cache (this=0x7fadb00013d0) at librbd/ImageCtx.cc:627
#20 0x00007fadbf586862 in librbd::_flush (ictx=0x7fadb00013d0) at librbd/internal.cc:3719
#21 0x00007fadbf54adbf in librbd::ImageWatcher::release_lock (this=0x7fadb000b6f0) at librbd/ImageWatcher.cc:382
#22 0x00007fadbf54cf12 in librbd::ImageWatcher::notify_release_lock (this=0x7fadb000b6f0) at librbd/ImageWatcher.cc:583
#23 0x00007fadbf561dcf in boost::_mfi::mf0<void, librbd::ImageWatcher>::operator()(librbd::ImageWatcher*) const () from /home/joshd/ceph/src/.libs/librbd.so.1
#24 0x00007fadbf55fbee in void boost::_bi::list1<boost::_bi::value<librbd::ImageWatcher*> >::operator()<boost::_mfi::mf0<void, librbd::ImageWatcher>, boost::_bi::list1<int&> >(boost::_bi::type<void>, boost::_mfi::mf0<void, librbd::ImageWatcher>&, boost::_bi::list1<int&>&, int) () from /home/joshd/ceph/src/.libs/librbd.so.1
#25 0x00007fadbf55d6e0 in void boost::_bi::bind_t<void, boost::_mfi::mf0<void, librbd::ImageWatcher>, boost::_bi::list1<boost::_bi::value<librbd::ImageWatcher*> > >::operator()<int>(int&) () from /home/joshd/ceph/src/.libs/librbd.so.1
#26 0x00007fadbf55ae45 in boost::detail::function::void_function_obj_invoker1<boost::_bi::bind_t<void, boost::_mfi::mf0<void, librbd::ImageWatcher>, boost::_bi::list1<boost::_bi::value<librbd::ImageWatcher*> > >, void, int>::invoke(boost::detail::function::function_buffer&, int) () from /home/joshd/ceph/src/.libs/librbd.so.1
#27 0x00007fadbf518df1 in boost::function1<void, int>::operator()(int) const () from /home/joshd/ceph/src/.libs/librbd.so.1
#28 0x00007fadbf5183f6 in FunctionContext::finish(int) () from /home/joshd/ceph/src/.libs/librbd.so.1
#29 0x00007fadbf5110e5 in Context::complete(int) () from /home/joshd/ceph/src/.libs/librbd.so.1
#30 0x00007fadbf564709 in librbd::TaskFinisher<librbd::ImageWatcher::Task>::complete(librbd::ImageWatcher::Task const&) () from /home/joshd/ceph/src/.libs/librbd.so.1
#31 0x00007fadbf564614 in librbd::TaskFinisher<librbd::ImageWatcher::Task>::C_Task::finish(int) () from /home/joshd/ceph/src/.libs/librbd.so.1
#32 0x00007fadbf5110e5 in Context::complete(int) () from /home/joshd/ceph/src/.libs/librbd.so.1
#33 0x00007fadbf71bc77 in Finisher::finisher_thread_entry (this=0x7fadb000b9d0) at common/Finisher.cc:59
#34 0x00007fadbf53b8b4 in Finisher::FinisherThread::entry() () from /home/joshd/ceph/src/.libs/librbd.so.1
#35 0x00007fadbf61bcca in Thread::entry_wrapper (this=0x7fadb000bae8) at common/Thread.cc:84
#36 0x00007fadbf61bc18 in Thread::_entry_func (arg=0x7fadb000bae8) at common/Thread.cc:66
#37 0x00007fadbc6d28ba in start_thread (arg=<value optimized out>) at pthread_create.c:300
#38 0x00007fadbb1c202d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#39 0x0000000000000000 in ?? ()
The crash may already be fixed by the locking cleanups. However, the second instance hangs and makes no progress instead of breaking the lock.
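For readers unfamiliar with the "r == 0" assertion in frame #10: it fires when the underlying lock call returns a nonzero code. One way that can happen, offered here purely as an assumption and not confirmed anywhere in this report, is a thread requesting a read lock on a rwlock it already holds for writing. The sketch below uses plain POSIX pthread_rwlock_t. The helper names (try_read, read_while_write_held, read_uncontended) are hypothetical and this is not Ceph's RWLock class. It deliberately uses the non-blocking pthread_rwlock_tryrdlock so it can never hang; POSIX allows the blocking pthread_rwlock_rdlock to either deadlock or return EDEADLK in the same situation.

```cpp
#include <pthread.h>

#include <cassert>
#include <cerrno>

// Hypothetical helper (not Ceph's RWLock): try to take a read lock without
// blocking and hand back the raw pthread return code (0 means success).
static int try_read(pthread_rwlock_t* lock) {
  int r = pthread_rwlock_tryrdlock(lock);
  if (r == 0) {
    pthread_rwlock_unlock(lock);  // drop the read lock we just acquired
  }
  return r;
}

// Return code seen when a thread asks for a read lock on a rwlock that it
// already holds for writing. POSIX specifies EBUSY for the try variant.
int read_while_write_held() {
  pthread_rwlock_t lock;
  int r = pthread_rwlock_init(&lock, nullptr);
  assert(r == 0);
  r = pthread_rwlock_wrlock(&lock);
  assert(r == 0);
  int result = try_read(&lock);
  pthread_rwlock_unlock(&lock);  // release the write lock
  pthread_rwlock_destroy(&lock);
  return result;
}

// Baseline: the same read attempt on a lock nobody holds succeeds with 0.
int read_uncontended() {
  pthread_rwlock_t lock;
  int r = pthread_rwlock_init(&lock, nullptr);
  assert(r == 0);
  int result = try_read(&lock);
  pthread_rwlock_destroy(&lock);
  return result;
}
```

A wrapper that asserts on the return code, as the get_read frame in the trace appears to do, would abort in the first case and pass in the second. Whether this is the actual path taken in release_lock is not established by the report.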
Updated by Jason Dillaman almost 9 years ago
- Status changed from New to In Progress
- Assignee set to Jason Dillaman
Updated by Jason Dillaman almost 9 years ago
- Status changed from In Progress to Fix Under Review
master PR: https://github.com/ceph/ceph/pull/4695
Updated by Jason Dillaman almost 9 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to hammer
A similar issue occurs in Hammer; it causes a deadlock instead of a crash, due to differences in locking that were addressed in PR4528.
Updated by Loïc Dachary over 8 years ago
- Status changed from Pending Backport to Resolved