Bug #9462
msgr deadlock: osd reply vs mark_down vs fault
0%
Description
Thread 160 (Thread 0x7f0ac2ffd700 (LWP 10067)): #0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162 #1 0x0000000000fb28b4 in Cond::Wait (this=0x7f0ab2c6a228, mutex=...) at ./common/Cond.h:55 #2 0x00000000015e2d11 in Pipe::stop_and_wait (this=0x7f0ab2c6a010) at msg/Pipe.cc:1414 #3 0x00000000014600dd in SimpleMessenger::mark_down (this=0x3b7a890, con=0x7f0ab2c4cdf0) at msg/SimpleMessenger.cc:643 #4 0x00000000015ef80a in PipeConnection::mark_down (this=0x7f0ab2c4cdf0) at msg/PipeConnection.cc:84 #5 0x00000000012c41c8 in Objecter::close_session (this=0x3b942f0, s=0x7f0ab2c689b0) at osdc/Objecter.cc:1288 #6 0x00000000012c050b in Objecter::handle_osd_map (this=0x3b942f0, m=0x7f0ab4a4aeb0) at osdc/Objecter.cc:792 #7 0x00000000012beac5 in Objecter::ms_dispatch (this=0x3b942f0, m=0x7f0ab4a4aeb0) at osdc/Objecter.cc:597 #8 0x00000000015b6afd in Messenger::ms_deliver_dispatch (this=0x3b7a890, m=0x7f0ab4a4aeb0) at msg/Messenger.h:532 #9 0x00000000015b6052 in DispatchQueue::entry (this=0x3b7aa48) at msg/DispatchQueue.cc:185 #10 0x00000000014616e0 in DispatchQueue::DispatchThread::entry (this=0x3b7abd0) at msg/DispatchQueue.h:104 #11 0x000000000146bcdb in Thread::entry_wrapper (this=0x3b7abd0) at common/Thread.cc:60 #12 0x000000000146bc58 in Thread::_entry_func (arg=0x3b7abd0) at common/Thread.cc:45 #13 0x00007f0acc097e9a in start_thread (arg=0x7f0ac2ffd700) at pthread_create.c:308 #14 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #15 0x0000000000000000 in ?? () Thread 154 (Thread 0x7f0abb7fe700 (LWP 10075)): #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132 #1 0x00007f0acc09a065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0 #2 0x00007f0acc099eba in __pthread_mutex_lock (mutex=0x3b7ad18) at pthread_mutex_lock.c:61 #3 0x0000000001426458 in Mutex::Lock (this=0x3b7ad08, no_lockdep=false) at common/Mutex.cc:91 #4 0x000000000145ff42 in SimpleMessenger::mark_down (this=0x3b7a890, con=0x7f09f1055dd0) at msg/SimpleMessenger.cc:636 #5 0x00000000015ef80a in PipeConnection::mark_down (this=0x7f09f1055dd0) at msg/PipeConnection.cc:84 #6 0x0000000001412664 in MonClient::_reopen_session (this=0x7fff8c55b720, rank=-1, name=...) at mon/MonClient.cc:604 #7 0x0000000000fe9174 in MonClient::_reopen_session (this=0x7fff8c55b720) at ./mon/MonClient.h:190 #8 0x0000000000fe9294 in MonClient::reopen_session (this=0x7fff8c55b720, cb=0x0) at ./mon/MonClient.h:315 #9 0x0000000000fe65f6 in Beacon::is_laggy (this=0x3b93150) at mds/Beacon.cc:216 #10 0x0000000000f9c6e4 in MDS::tick (this=0x3b93010) at mds/MDS.cc:724 #11 0x0000000000fb92b6 in MDS::C_MDS_Tick::finish (this=0x7f0a14157ec0, r=0) at mds/MDS.h:293 #12 0x0000000000fb270f in Context::complete (this=0x7f0a14157ec0, r=0) at ./include/Context.h:64 #13 0x000000000128d855 in MDSInternalContextBase::complete (this=0x7f0a14157ec0, r=0) at mds/MDSContext.cc:30 #14 0x000000000146f278 in SafeTimer::timer_thread (this=0x3b93080) at common/Timer.cc:105 #15 0x0000000001472416 in SafeTimerThread::entry (this=0x3b926a0) at common/Timer.cc:38 #16 0x000000000146bcdb in Thread::entry_wrapper (this=0x3b926a0) at common/Thread.cc:60 #17 0x000000000146bc58 in Thread::_entry_func (arg=0x3b926a0) at common/Thread.cc:45 #18 0x00007f0acc097e9a in start_thread (arg=0x7f0abb7fe700) at pthread_create.c:308 #19 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #20 0x0000000000000000 in ?? () Thread 148 (Thread 0x7f0ab8ff9700 (LWP 10088)): #0 pthread_rwlock_rdlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S:85 #1 0x0000000000fb3c01 in RWLock::get_read (this=0x3b943b0) at ./common/RWLock.h:68 #2 0x00000000012dcde2 in RWLock::RLocker::RLocker (this=0x7f0ab8ff83b0, lock=...) at ./common/RWLock.h:122 #3 0x00000000012c73cb in Objecter::op_submit (this=0x3b942f0, op=0x7f0a9c14d270) at osdc/Objecter.cc:1674 #4 0x00000000012b530d in Objecter::write_trunc (this=0x3b942f0, oid=..., oloc=..., off=1125110, len=952, snapc=..., bl=..., mtime=..., flags=0, trunc_size=0, trunc_seq=0, onack=0x0, oncommit=0x7f0a9c01dae0, objver=0x0, extra_ops=0x0) at ./osdc/Objecter.h:2027 #5 0x00000000012b5bbe in Objecter::sg_write_trunc (this=0x3b942f0, extents=..., snapc=..., bl=..., mtime=..., flags=0, trunc_size=0, trunc_seq=0, onack=0x0, oncommit=0x7f0a9c01dae0) at ./osdc/Objecter.h:2257 #6 0x00000000012b5f52 in Objecter::sg_write (this=0x3b942f0, extents=..., snapc=..., bl=..., mtime=..., flags=0, onack=0x0, oncommit=0x7f0a9c01dae0) at ./osdc/Objecter.h:2280 #7 0x00000000012b60d1 in Filer::write (this=0x7f0aa8002970, ino=..., layout=0x7f0aa8002939, snapc=..., offset=27113106166, len=952, bl=..., mtime=..., flags=0, onack=0x0, oncommit=0x7f0a9c01dae0) at ./osdc/Filer.h:146 #8 0x00000000012ae943 in Journaler::_do_flush (this=0x7f0aa8002830, amount=0) at osdc/Journaler.cc:610 #9 0x00000000012af766 in Journaler::_flush (this=0x7f0aa8002830, onsafe=0x0) at osdc/Journaler.cc:678 #10 0x00000000012af0e3 in Journaler::flush (this=0x7f0aa8002830, onsafe=0x0) at osdc/Journaler.cc:652 #11 0x000000000128fd14 in MDLog::_submit_thread (this=0x3b970f0) at mds/MDLog.cc:370 #12 0x0000000000fb99a6 in MDLog::SubmitThread::entry (this=0x3b972f8) at mds/MDLog.h:143 #13 0x000000000146bcdb in Thread::entry_wrapper (this=0x3b972f8) at common/Thread.cc:60 #14 0x000000000146bc58 in Thread::_entry_func (arg=0x3b972f8) at common/Thread.cc:45 #15 0x00007f0acc097e9a in start_thread (arg=0x7f0ab8ff9700) at pthread_create.c:308 #16 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #17 0x0000000000000000 in ?? () Thread 145 (Thread 0x7f0ac03f4700 (LWP 10091)): #0 pthread_rwlock_rdlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S:85 #1 0x0000000000fb3c01 in RWLock::get_read (this=0x3b943b0) at ./common/RWLock.h:68 #2 0x00000000012dcde2 in RWLock::RLocker::RLocker (this=0x7f0ac03f3280, lock=...) at ./common/RWLock.h:122 #3 0x00000000012ccd6a in Objecter::handle_osd_op_reply (this=0x3b942f0, m=0x7f0a900191e0) at osdc/Objecter.cc:2461 #4 0x00000000012bea1f in Objecter::ms_dispatch (this=0x3b942f0, m=0x7f0a900191e0) at osdc/Objecter.cc:574 #5 0x00000000012ddd88 in Objecter::ms_fast_dispatch (this=0x3b942f0, m=0x7f0a900191e0) at osdc/Objecter.h:1667 #6 0x00000000015b6982 in Messenger::ms_fast_dispatch (this=0x3b7a890, m=0x7f0a900191e0) at msg/Messenger.h:503 #7 0x00000000015b55fc in DispatchQueue::fast_dispatch (this=0x3b7aa48, m=0x7f0a900191e0) at msg/DispatchQueue.cc:71 #8 0x00000000015e479f in Pipe::reader (this=0x7f0aa4002000) at msg/Pipe.cc:1568 #9 0x00000000015eb8ec in Pipe::Reader::entry (this=0x7f0aa4002018) at msg/Pipe.h:50 #10 0x000000000146bcdb in Thread::entry_wrapper (this=0x7f0aa4002018) at common/Thread.cc:60 #11 0x000000000146bc58 in Thread::_entry_func (arg=0x7f0aa4002018) at common/Thread.cc:45 #12 0x00007f0acc097e9a in start_thread (arg=0x7f0ac03f4700) at pthread_create.c:308 #13 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 ---Type <return> to continue, or q <return> to quit--- #14 0x0000000000000000 in ?? () (several of these ...) Thread 101 (Thread 0x7f0a682eb700 (LWP 10187)): #0 pthread_rwlock_rdlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S:85 #1 0x0000000000fb3c01 in RWLock::get_read (this=0x3b943b0) at ./common/RWLock.h:68 #2 0x0000000000fb6338 in Objecter::get_osdmap_read (this=0x3b942f0) at ./osdc/Objecter.h:1622 #3 0x00000000012a4575 in Filer::_do_purge_range (this=0x3b96450, pr=0x7f0a143b9730, fin=1) at osdc/Filer.cc:315 #4 0x00000000012a506f in C_PurgeRange::finish (this=0x7f0a4808b7c0, r=-2) at osdc/Filer.cc:296 #5 0x0000000000fb270f in Context::complete (this=0x7f0a4808b7c0, r=-2) at ./include/Context.h:64 #6 0x00000000012ceb35 in Objecter::handle_osd_op_reply (this=0x3b942f0, m=0x7f0a5405da30) at osdc/Objecter.cc:2643 #7 0x00000000012bea1f in Objecter::ms_dispatch (this=0x3b942f0, m=0x7f0a5405da30) at osdc/Objecter.cc:574 #8 0x00000000012ddd88 in Objecter::ms_fast_dispatch (this=0x3b942f0, m=0x7f0a5405da30) at osdc/Objecter.h:1667 #9 0x00000000015b6982 in Messenger::ms_fast_dispatch (this=0x3b7a890, m=0x7f0a5405da30) at msg/Messenger.h:503 #10 0x00000000015b55fc in DispatchQueue::fast_dispatch (this=0x3b7aa48, m=0x7f0a5405da30) at msg/DispatchQueue.cc:71 #11 0x00000000015e479f in Pipe::reader (this=0x7f0ab2c52010) at msg/Pipe.cc:1568 #12 0x00000000015eb8ec in Pipe::Reader::entry (this=0x7f0ab2c52028) at msg/Pipe.h:50 #13 0x000000000146bcdb in Thread::entry_wrapper (this=0x7f0ab2c52028) at common/Thread.cc:60 #14 0x000000000146bc58 in Thread::_entry_func (arg=0x7f0ab2c52028) at common/Thread.cc:45 #15 0x00007f0acc097e9a in start_thread (arg=0x7f0a682eb700) at pthread_create.c:308 #16 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #17 0x0000000000000000 in ?? () Thread 43 (Thread 0x7f09f7fff700 (LWP 10251)): #0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132 #1 0x00007f0acc09a065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0 #2 0x00007f0acc099eba in __pthread_mutex_lock (mutex=0x3b7ad18) at pthread_mutex_lock.c:61 #3 0x0000000001426458 in Mutex::Lock (this=0x3b7ad08, no_lockdep=false) at common/Mutex.cc:91 #4 0x00000000015e1ab9 in Pipe::fault (this=0x7f0ab2cd1670, onread=true) at msg/Pipe.cc:1301 #5 0x00000000015e3257 in Pipe::reader (this=0x7f0ab2cd1670) at msg/Pipe.cc:1451 #6 0x00000000015eb8ec in Pipe::Reader::entry (this=0x7f0ab2cd1688) at msg/Pipe.h:50 #7 0x000000000146bcdb in Thread::entry_wrapper (this=0x7f0ab2cd1688) at common/Thread.cc:60 #8 0x000000000146bc58 in Thread::_entry_func (arg=0x7f0ab2cd1688) at common/Thread.cc:45 #9 0x00007f0acc097e9a in start_thread (arg=0x7f09f7fff700) at pthread_create.c:308 #10 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112 #11 0x0000000000000000 in ?? ()
pid 10058 on burnupi21
Related issues
Associated revisions
Objecter: disable fast dispatch of CEPH_MSG_OSD_OPREPLY messages
If they fast_dispatch they can deadlock right now. We'll need to change
the semantics around mark_down() or around the Objecter's locking to
fix it.
Sort-of-
Fixes: #9462
Signed-off-by: Greg Farnum <greg@inktank.com>
History
#1 Updated by Sage Weil about 9 years ago
- File all_thread.txt View added
#2 Updated by Sage Weil about 9 years ago
- Priority changed from Urgent to Immediate
thread 160 (not fast dispatch) holds Objecter rwlock, blocks waiting for fast-dispatch to complete
thread 145 (and many others, fast dispatch), trying to take Objecter rwlock
I think the stop_and_wait is fundamentally problematic. It is only safe to block if we know that we aren't holding any locks that fast dispatch for the marked down connection may need.
This happened earlier today, too. Seems to reproduce in a matter of hours on the burnupi21 MDS.
#3 Updated by Sage Weil about 9 years ago
Here is where we added it:
Pipe: wait for Pipes to finish running, instead of just stop()ing them Add a stop_and_wait() function that, in addition to closing the Pipe and killing its socket, waits for any fast_dispatch call which is in-progress. Use this in several parts of the Pipe and SimpleMessenger code where appropriate. This fixes several races with fast_dispatch and other avenues; here are two: 1) It could be that we grab the lock while the existing pipe is fast_dispatching and then proceed to dispatch messages ourself, beating it. Instead, wait for the other pipe. Add a "reader_dispatching" member which tells bus this is happening, and when re-locking, signal the cond if we're shutting down. 2) It could be that a normally-dispatched Message in the OSD triggers a mark_down() on the Connection and then clears out the Session (Connection::priv) pointer, causing a racing fast_dispatch()'ed function to assert out in the OSD because it requires a valid Session. Signed-off-by: Greg Farnum <greg@inktank.com> Reviewed-by: Sage Weil <sage@inktank.com>
Note that the second case is now handled generally; we can ignore it.
I think we need to deal with the first one (and others) elsewhere so we can avoid any blocking during mark_down.
...or, we can just do stop() instead of stop_and_wait() in mark_down() and mark_down_all() ...
#4 Updated by Sage Weil about 9 years ago
- Status changed from New to Fix Under Review
see wip-9462.
tempted to push a patched ceph-mds to burnupi21 and see how it fares there.
#5 Updated by Sage Weil about 9 years ago
- Priority changed from Immediate to Urgent
#6 Updated by Sage Weil about 9 years ago
- Status changed from Fix Under Review to 7
#7 Updated by Greg Farnum almost 9 years ago
- Assignee set to Greg Farnum
Finally got through a suite run and it looks pretty good, but need to check the few failures:
http://pulpito.ceph.com/gregf-2014-09-19_14:58:26-rados:thrash-wip-giant-messenger-fixes-distro-basic-multi/
#8 Updated by Sage Weil almost 9 years ago
- Status changed from 7 to Resolved