Project

General

Profile

Bug #9462

msgr deadlock: osd reply vs mark_down vs fault

Added by Sage Weil about 5 years ago. Updated almost 5 years ago.

Status:
Resolved
Priority:
Urgent
Assignee:
Category:
-
Target version:
-
Start date:
09/13/2014
Due date:
% Done:

0%

Source:
Development
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:

Description

Thread 160 (Thread 0x7f0ac2ffd700 (LWP 10067)):
#0  pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:162
#1  0x0000000000fb28b4 in Cond::Wait (this=0x7f0ab2c6a228, mutex=...) at ./common/Cond.h:55
#2  0x00000000015e2d11 in Pipe::stop_and_wait (this=0x7f0ab2c6a010) at msg/Pipe.cc:1414
#3  0x00000000014600dd in SimpleMessenger::mark_down (this=0x3b7a890, con=0x7f0ab2c4cdf0) at msg/SimpleMessenger.cc:643
#4  0x00000000015ef80a in PipeConnection::mark_down (this=0x7f0ab2c4cdf0) at msg/PipeConnection.cc:84
#5  0x00000000012c41c8 in Objecter::close_session (this=0x3b942f0, s=0x7f0ab2c689b0) at osdc/Objecter.cc:1288
#6  0x00000000012c050b in Objecter::handle_osd_map (this=0x3b942f0, m=0x7f0ab4a4aeb0) at osdc/Objecter.cc:792
#7  0x00000000012beac5 in Objecter::ms_dispatch (this=0x3b942f0, m=0x7f0ab4a4aeb0) at osdc/Objecter.cc:597
#8  0x00000000015b6afd in Messenger::ms_deliver_dispatch (this=0x3b7a890, m=0x7f0ab4a4aeb0) at msg/Messenger.h:532
#9  0x00000000015b6052 in DispatchQueue::entry (this=0x3b7aa48) at msg/DispatchQueue.cc:185
#10 0x00000000014616e0 in DispatchQueue::DispatchThread::entry (this=0x3b7abd0) at msg/DispatchQueue.h:104
#11 0x000000000146bcdb in Thread::entry_wrapper (this=0x3b7abd0) at common/Thread.cc:60
#12 0x000000000146bc58 in Thread::_entry_func (arg=0x3b7abd0) at common/Thread.cc:45
#13 0x00007f0acc097e9a in start_thread (arg=0x7f0ac2ffd700) at pthread_create.c:308
#14 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#15 0x0000000000000000 in ?? ()

Thread 154 (Thread 0x7f0abb7fe700 (LWP 10075)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1  0x00007f0acc09a065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f0acc099eba in __pthread_mutex_lock (mutex=0x3b7ad18) at pthread_mutex_lock.c:61
#3  0x0000000001426458 in Mutex::Lock (this=0x3b7ad08, no_lockdep=false) at common/Mutex.cc:91
#4  0x000000000145ff42 in SimpleMessenger::mark_down (this=0x3b7a890, con=0x7f09f1055dd0) at msg/SimpleMessenger.cc:636
#5  0x00000000015ef80a in PipeConnection::mark_down (this=0x7f09f1055dd0) at msg/PipeConnection.cc:84
#6  0x0000000001412664 in MonClient::_reopen_session (this=0x7fff8c55b720, rank=-1, name=...) at mon/MonClient.cc:604
#7  0x0000000000fe9174 in MonClient::_reopen_session (this=0x7fff8c55b720) at ./mon/MonClient.h:190
#8  0x0000000000fe9294 in MonClient::reopen_session (this=0x7fff8c55b720, cb=0x0) at ./mon/MonClient.h:315
#9  0x0000000000fe65f6 in Beacon::is_laggy (this=0x3b93150) at mds/Beacon.cc:216
#10 0x0000000000f9c6e4 in MDS::tick (this=0x3b93010) at mds/MDS.cc:724
#11 0x0000000000fb92b6 in MDS::C_MDS_Tick::finish (this=0x7f0a14157ec0, r=0) at mds/MDS.h:293
#12 0x0000000000fb270f in Context::complete (this=0x7f0a14157ec0, r=0) at ./include/Context.h:64
#13 0x000000000128d855 in MDSInternalContextBase::complete (this=0x7f0a14157ec0, r=0) at mds/MDSContext.cc:30
#14 0x000000000146f278 in SafeTimer::timer_thread (this=0x3b93080) at common/Timer.cc:105
#15 0x0000000001472416 in SafeTimerThread::entry (this=0x3b926a0) at common/Timer.cc:38
#16 0x000000000146bcdb in Thread::entry_wrapper (this=0x3b926a0) at common/Thread.cc:60
#17 0x000000000146bc58 in Thread::_entry_func (arg=0x3b926a0) at common/Thread.cc:45
#18 0x00007f0acc097e9a in start_thread (arg=0x7f0abb7fe700) at pthread_create.c:308
#19 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#20 0x0000000000000000 in ?? ()

Thread 148 (Thread 0x7f0ab8ff9700 (LWP 10088)):
#0  pthread_rwlock_rdlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S:85
#1  0x0000000000fb3c01 in RWLock::get_read (this=0x3b943b0) at ./common/RWLock.h:68
#2  0x00000000012dcde2 in RWLock::RLocker::RLocker (this=0x7f0ab8ff83b0, lock=...) at ./common/RWLock.h:122
#3  0x00000000012c73cb in Objecter::op_submit (this=0x3b942f0, op=0x7f0a9c14d270) at osdc/Objecter.cc:1674
#4  0x00000000012b530d in Objecter::write_trunc (this=0x3b942f0, oid=..., oloc=..., off=1125110, len=952, snapc=..., bl=..., mtime=..., flags=0, trunc_size=0, trunc_seq=0, onack=0x0, oncommit=0x7f0a9c01dae0, objver=0x0, extra_ops=0x0) at ./osdc/Objecter.h:2027
#5  0x00000000012b5bbe in Objecter::sg_write_trunc (this=0x3b942f0, extents=..., snapc=..., bl=..., mtime=..., flags=0, trunc_size=0, trunc_seq=0, onack=0x0, oncommit=0x7f0a9c01dae0) at ./osdc/Objecter.h:2257
#6  0x00000000012b5f52 in Objecter::sg_write (this=0x3b942f0, extents=..., snapc=..., bl=..., mtime=..., flags=0, onack=0x0, oncommit=0x7f0a9c01dae0) at ./osdc/Objecter.h:2280
#7  0x00000000012b60d1 in Filer::write (this=0x7f0aa8002970, ino=..., layout=0x7f0aa8002939, snapc=..., offset=27113106166, len=952, bl=..., mtime=..., flags=0, onack=0x0, oncommit=0x7f0a9c01dae0) at ./osdc/Filer.h:146
#8  0x00000000012ae943 in Journaler::_do_flush (this=0x7f0aa8002830, amount=0) at osdc/Journaler.cc:610
#9  0x00000000012af766 in Journaler::_flush (this=0x7f0aa8002830, onsafe=0x0) at osdc/Journaler.cc:678
#10 0x00000000012af0e3 in Journaler::flush (this=0x7f0aa8002830, onsafe=0x0) at osdc/Journaler.cc:652
#11 0x000000000128fd14 in MDLog::_submit_thread (this=0x3b970f0) at mds/MDLog.cc:370
#12 0x0000000000fb99a6 in MDLog::SubmitThread::entry (this=0x3b972f8) at mds/MDLog.h:143
#13 0x000000000146bcdb in Thread::entry_wrapper (this=0x3b972f8) at common/Thread.cc:60
#14 0x000000000146bc58 in Thread::_entry_func (arg=0x3b972f8) at common/Thread.cc:45
#15 0x00007f0acc097e9a in start_thread (arg=0x7f0ab8ff9700) at pthread_create.c:308
#16 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()

Thread 145 (Thread 0x7f0ac03f4700 (LWP 10091)):
#0  pthread_rwlock_rdlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S:85
#1  0x0000000000fb3c01 in RWLock::get_read (this=0x3b943b0) at ./common/RWLock.h:68
#2  0x00000000012dcde2 in RWLock::RLocker::RLocker (this=0x7f0ac03f3280, lock=...) at ./common/RWLock.h:122
#3  0x00000000012ccd6a in Objecter::handle_osd_op_reply (this=0x3b942f0, m=0x7f0a900191e0) at osdc/Objecter.cc:2461
#4  0x00000000012bea1f in Objecter::ms_dispatch (this=0x3b942f0, m=0x7f0a900191e0) at osdc/Objecter.cc:574
#5  0x00000000012ddd88 in Objecter::ms_fast_dispatch (this=0x3b942f0, m=0x7f0a900191e0) at osdc/Objecter.h:1667
#6  0x00000000015b6982 in Messenger::ms_fast_dispatch (this=0x3b7a890, m=0x7f0a900191e0) at msg/Messenger.h:503
#7  0x00000000015b55fc in DispatchQueue::fast_dispatch (this=0x3b7aa48, m=0x7f0a900191e0) at msg/DispatchQueue.cc:71
#8  0x00000000015e479f in Pipe::reader (this=0x7f0aa4002000) at msg/Pipe.cc:1568
#9  0x00000000015eb8ec in Pipe::Reader::entry (this=0x7f0aa4002018) at msg/Pipe.h:50
#10 0x000000000146bcdb in Thread::entry_wrapper (this=0x7f0aa4002018) at common/Thread.cc:60
#11 0x000000000146bc58 in Thread::_entry_func (arg=0x7f0aa4002018) at common/Thread.cc:45
#12 0x00007f0acc097e9a in start_thread (arg=0x7f0ac03f4700) at pthread_create.c:308
#13 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
---Type <return> to continue, or q <return> to quit---
#14 0x0000000000000000 in ?? ()
(several of these ...)

Thread 101 (Thread 0x7f0a682eb700 (LWP 10187)):
#0  pthread_rwlock_rdlock () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_rwlock_rdlock.S:85
#1  0x0000000000fb3c01 in RWLock::get_read (this=0x3b943b0) at ./common/RWLock.h:68

#2  0x0000000000fb6338 in Objecter::get_osdmap_read (this=0x3b942f0) at ./osdc/Objecter.h:1622
#3  0x00000000012a4575 in Filer::_do_purge_range (this=0x3b96450, pr=0x7f0a143b9730, fin=1) at osdc/Filer.cc:315
#4  0x00000000012a506f in C_PurgeRange::finish (this=0x7f0a4808b7c0, r=-2) at osdc/Filer.cc:296
#5  0x0000000000fb270f in Context::complete (this=0x7f0a4808b7c0, r=-2) at ./include/Context.h:64
#6  0x00000000012ceb35 in Objecter::handle_osd_op_reply (this=0x3b942f0, m=0x7f0a5405da30) at osdc/Objecter.cc:2643
#7  0x00000000012bea1f in Objecter::ms_dispatch (this=0x3b942f0, m=0x7f0a5405da30) at osdc/Objecter.cc:574
#8  0x00000000012ddd88 in Objecter::ms_fast_dispatch (this=0x3b942f0, m=0x7f0a5405da30) at osdc/Objecter.h:1667
#9  0x00000000015b6982 in Messenger::ms_fast_dispatch (this=0x3b7a890, m=0x7f0a5405da30) at msg/Messenger.h:503
#10 0x00000000015b55fc in DispatchQueue::fast_dispatch (this=0x3b7aa48, m=0x7f0a5405da30) at msg/DispatchQueue.cc:71
#11 0x00000000015e479f in Pipe::reader (this=0x7f0ab2c52010) at msg/Pipe.cc:1568
#12 0x00000000015eb8ec in Pipe::Reader::entry (this=0x7f0ab2c52028) at msg/Pipe.h:50
#13 0x000000000146bcdb in Thread::entry_wrapper (this=0x7f0ab2c52028) at common/Thread.cc:60
#14 0x000000000146bc58 in Thread::_entry_func (arg=0x7f0ab2c52028) at common/Thread.cc:45
#15 0x00007f0acc097e9a in start_thread (arg=0x7f0a682eb700) at pthread_create.c:308
#16 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#17 0x0000000000000000 in ?? ()

Thread 43 (Thread 0x7f09f7fff700 (LWP 10251)):
#0  __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:132
#1  0x00007f0acc09a065 in _L_lock_858 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2  0x00007f0acc099eba in __pthread_mutex_lock (mutex=0x3b7ad18) at pthread_mutex_lock.c:61
#3  0x0000000001426458 in Mutex::Lock (this=0x3b7ad08, no_lockdep=false) at common/Mutex.cc:91
#4  0x00000000015e1ab9 in Pipe::fault (this=0x7f0ab2cd1670, onread=true) at msg/Pipe.cc:1301
#5  0x00000000015e3257 in Pipe::reader (this=0x7f0ab2cd1670) at msg/Pipe.cc:1451
#6  0x00000000015eb8ec in Pipe::Reader::entry (this=0x7f0ab2cd1688) at msg/Pipe.h:50
#7  0x000000000146bcdb in Thread::entry_wrapper (this=0x7f0ab2cd1688) at common/Thread.cc:60
#8  0x000000000146bc58 in Thread::_entry_func (arg=0x7f0ab2cd1688) at common/Thread.cc:45
#9  0x00007f0acc097e9a in start_thread (arg=0x7f09f7fff700) at pthread_create.c:308
#10 0x00007f0acac4c3fd in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:112
#11 0x0000000000000000 in ?? ()


pid 10058 on burnupi21

all_thread.txt View (177 KB) Sage Weil, 09/13/2014 10:13 PM


Related issues

Related to Ceph - Bug #9295: osd/OSD.cc: 5501: FAILED assert(session) in ms_fast_dispatch Resolved 08/29/2014

Associated revisions

Revision 3f23709c (diff)
Added by Greg Farnum almost 5 years ago

Objecter: disable fast dispatch of CEPH_MSG_OSD_OPREPLY messages

If they fast_dispatch they can deadlock right now. We'll need to change
the semantics around mark_down() or around the Objecter's locking to
fix it.

Sort-of-
Fixes: #9462

Signed-off-by: Greg Farnum <>

History

#1 Updated by Sage Weil about 5 years ago

#2 Updated by Sage Weil about 5 years ago

  • Priority changed from Urgent to Immediate

thread 160 (not fast dispatch) holds Objecter rwlock, blocks waiting for fast-dispatch to complete
thread 145 (and many others, fast dispatch), trying to take Objecter rwlock

I think the stop_and_wait is fundamentally problematic. It is only safe to block if we know that we aren't holding any locks that fast dispatch for the marked down connection may need.

This happened earlier today, too. Seems to reproduce in a matter of hours on the burnupi21 MDS.

#3 Updated by Sage Weil about 5 years ago

Here is where we added it:

    Pipe: wait for Pipes to finish running, instead of just stop()ing them

    Add a stop_and_wait() function that, in addition to closing the Pipe and killing
    its socket, waits for any fast_dispatch call which is in-progress. Use this in
    several parts of the Pipe and SimpleMessenger code where appropriate.

    This fixes several races with fast_dispatch and other avenues; here are two:
    1) It could be that we grab the lock while the existing pipe is fast_dispatching
    and then proceed to dispatch messages ourself, beating it. Instead, wait for
    the other pipe. Add a "reader_dispatching" member which tells bus this is
    happening, and when re-locking, signal the cond if we're shutting down.

    2) It could be that a normally-dispatched Message in the OSD triggers a
    mark_down() on the Connection and then clears out the Session
    (Connection::priv) pointer, causing a racing fast_dispatch()'ed function to
    assert out in the OSD because it requires a valid Session.

    Signed-off-by: Greg Farnum <greg@inktank.com>
    Reviewed-by: Sage Weil <sage@inktank.com>

Note that the second case is now handled generally; we can ignore it.

I think we need to deal with the first one (and others) elsewhere so we can avoid any blocking during mark_down.

...or, we can just do stop() instead of stop_and_wait() in mark_down() and mark_down_all() ...

#4 Updated by Sage Weil about 5 years ago

  • Status changed from New to Need Review

see wip-9462.

tempted to push a patched ceph-mds to burnupi21 and see how it fares there.

#5 Updated by Sage Weil about 5 years ago

  • Priority changed from Immediate to Urgent

#6 Updated by Sage Weil almost 5 years ago

  • Status changed from Need Review to Testing

#7 Updated by Greg Farnum almost 5 years ago

  • Assignee set to Greg Farnum

Finally got through a suite run and it looks pretty good, but need to check the few failures:
http://pulpito.ceph.com/gregf-2014-09-19_14:58:26-rados:thrash-wip-giant-messenger-fixes-distro-basic-multi/

#8 Updated by Sage Weil almost 5 years ago

  • Status changed from Testing to Resolved

Also available in: Atom PDF