Bug #15503
msg/async: deadlock in rebind when enabling delay (closed)
% Done: 0%
Source: Q/A
Regression: No
Severity: 3 - minor
Description
Thread 74 (Thread 0x7f2d894c9700 (LWP 755)):
#0  0x00007f2d964abf4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f2d964a7d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007f2d964a7c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f2d98731a88 in Mutex::Lock (this=this@entry=0x7f2da8b6dae0, no_lockdep=no_lockdep@entry=false) at common/Mutex.cc:110
#4  0x00007f2d9882e89f in stop (this=0x7f2da8b6d800) at msg/async/AsyncConnection.h:390
#5  AsyncMessenger::mark_down_all (this=0x7f2da37ee000) at msg/async/AsyncMessenger.cc:656
#6  0x00007f2d9882e517 in AsyncMessenger::rebind (this=0x7f2da37ee000, avoid_ports=std::set with 3 elements) at msg/async/AsyncMessenger.cc:456
#7  0x00007f2d98118cf7 in OSD::_committed_osd_maps (this=0x7f2da3992000, first=<optimized out>, last=352, m=0x7f2da4fb3680) at osd/OSD.cc:6928
#8  0x00007f2d98128399 in Context::complete (this=0x7f2da87d0a80, r=<optimized out>) at include/Context.h:64
#9  0x00007f2d986ae6e6 in Finisher::finisher_thread_entry (this=0x7f2da37d62c0) at common/Finisher.cc:68
#10 0x00007f2d964a5dc5 in start_thread () from /lib64/libpthread.so.0
#11 0x00007f2d94b3128d in clone () from /lib64/libc.so.6

Thread 84 (Thread 0x7f2d8e4d3700 (LWP 732)):
#0  0x00007f2d964abf4d in __lll_lock_wait () from /lib64/libpthread.so.0
#1  0x00007f2d964a7d02 in _L_lock_791 () from /lib64/libpthread.so.0
#2  0x00007f2d964a7c08 in pthread_mutex_lock () from /lib64/libpthread.so.0
#3  0x00007f2d98731a88 in Mutex::Lock (this=this@entry=0x7f2da78ec2e0, no_lockdep=no_lockdep@entry=false) at common/Mutex.cc:110
#4  0x00007f2d988a4e34 in Locker (m=..., this=<synthetic pointer>) at common/Mutex.h:115
#5  AsyncConnection::process (this=0x7f2da78ec000) at msg/async/AsyncConnection.cc:528
#6  0x00007f2d988488e5 in EventCenter::process_events (this=this@entry=0x7f2da37ea7c8, timeout_microseconds=timeout_microseconds@entry=30000000) at msg/async/Event.cc:399
#7  0x00007f2d98828fc0 in Worker::entry (this=0x7f2da37ea780) at msg/async/AsyncMessenger.cc:294
#8  0x00007f2d964a5dc5 in start_thread () from /lib64/libpthread.so.0
#9  0x00007f2d94b3128d in clone () from /lib64/libc.so.6

Thread 86 (Thread 0x7f2d8f4d5700 (LWP 730)):
#0  0x00007f2d964a96d5 in pthread_cond_wait@@GLIBC_2.3.2 () from /lib64/libpthread.so.0
#1  0x00007f2d98897587 in Wait (mutex=..., this=0x7f2da3da9398) at common/Cond.h:56
#2  wait_for_flush (this=0x7f2da3da92c0) at msg/async/AsyncConnection.h:177
#3  AsyncConnection::_stop (this=this@entry=0x7f2da8b6d800) at msg/async/AsyncConnection.cc:2272
#4  0x00007f2d9889e24b in AsyncConnection::handle_connect_msg (this=this@entry=0x7f2da8b6d800, connect=..., authorizer_bl=..., authorizer_reply=...) at msg/async/AsyncConnection.cc:1892
#5  0x00007f2d988a082c in AsyncConnection::_process_connection (this=this@entry=0x7f2da8b6d800) at msg/async/AsyncConnection.cc:1511
#6  0x00007f2d988a6810 in AsyncConnection::process (this=0x7f2da8b6d800) at msg/async/AsyncConnection.cc:993
#7  0x00007f2d988488e5 in EventCenter::process_events (this=this@entry=0x7f2da37ea2c8, timeout_microseconds=timeout_microseconds@entry=30000000) at msg/async/Event.cc:399
#8  0x00007f2d98828fc0 in Worker::entry (this=0x7f2da37ea280) at msg/async/AsyncMessenger.cc:294
#9  0x00007f2d964a5dc5 in start_thread () from /lib64/libpthread.so.0
#10 0x00007f2d94b3128d in clone () from /lib64/libc.so.6
/a/sage-2016-04-14_11:23:09-rados-wip-sage-testing---basic-smithi/129588
Updated by Sage Weil about 8 years ago
/a/sage-2016-04-14_11:23:09-rados-wip-sage-testing---basic-smithi/129605
This run hit it too.
Updated by Haomai Wang about 8 years ago
- Subject changed from msg/async: deadlock in rebind to msg/async: deadlock in rebind when enabling delay
- Category set to msgr
- Status changed from New to In Progress
- Assignee set to Haomai Wang
The root cause is still that delayed dispatch happens in another thread.
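The general hazard this comment points at can be sketched outside of Ceph. The snippet below is not the AsyncConnection code: `Conn`, `helper_flush`, and `stop_conn` are hypothetical names, and it assumes (the trace does not prove this) that the delayed-dispatch helper thread needs the same connection lock that the stopping thread holds while it waits for the flush. Timeouts are used so the sketch reports the stall instead of hanging.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

using ms = std::chrono::milliseconds;

// Hypothetical stand-in for a connection that has a delayed-dispatch helper.
struct Conn {
  std::timed_mutex lock;               // plays the role of the connection lock
  std::mutex flush_lock;               // protects `flushed`
  std::condition_variable flush_cond;  // signalled once the helper has flushed
  bool flushed = false;
};

// Helper thread body: assumed to need the connection lock before it can
// flush. Gives up after `patience` so the sketch cannot hang forever.
bool helper_flush(Conn& c, ms patience) {
  std::unique_lock<std::timed_mutex> l(c.lock, patience);  // try_lock_for
  if (!l.owns_lock()) return false;
  {
    std::lock_guard<std::mutex> g(c.flush_lock);
    c.flushed = true;
  }
  c.flush_cond.notify_all();
  return true;
}

// Stop path: wait for the helper to flush. Returns true if the flush was
// observed before the timeout. With hold_lock_while_waiting == true the
// caller keeps the connection lock across the wait, which starves the helper.
bool stop_conn(bool hold_lock_while_waiting) {
  Conn c;
  std::unique_lock<std::timed_mutex> conn_lock(c.lock);
  if (!hold_lock_while_waiting) conn_lock.unlock();  // safe variant

  std::thread helper([&c] { helper_flush(c, ms(200)); });

  bool ok;
  {
    std::unique_lock<std::mutex> fl(c.flush_lock);
    ok = c.flush_cond.wait_for(fl, ms(500), [&c] { return c.flushed; });
  }
  helper.join();
  return ok;
}
```

Calling `stop_conn(true)` times out because the helper never obtains the lock, while `stop_conn(false)` completes. In a real messenger the waits have no timeout, so the first case would be a permanent hang of the kind shown in the backtraces.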
Updated by Sage Weil almost 8 years ago
- Status changed from In Progress to Rejected
Updated by Greg Farnum about 5 years ago
- Project changed from Ceph to Messengers
- Category deleted (msgr)