Bug #9898
osd: fast dispatch deadlock in mark_down (giant)
Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
This is basically a dup of the issue we saw with fast dispatch in the objecter, but in the OSD.
Thread 12 (Thread 0x7f6c90fc5700 (LWP 31158)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f6c976c2657 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f6c976c2480 in __GI___pthread_mutex_lock (mutex=0x4186b88) at ../nptl/pthread_mutex_lock.c:79
#3 0x0000000000b32cef in Mutex::Lock (this=this@entry=0x4186b78, no_lockdep=no_lockdep@entry=false) at common/Mutex.cc:91
#4 0x0000000000b54e81 in SimpleMessenger::mark_down (this=0x4186700, con=0x67cfde0) at msg/SimpleMessenger.cc:636
#5 0x0000000000669f39 in OSD::require_same_peer_instance (this=this@entry=0x4818000, op=..., map=..., is_fast_dispatch=is_fast_dispatch@entry=true) at osd/OSD.cc:6764
#6 0x00000000006e0f15 in OSD::handle_replica_op<MOSDPGPull, 106> (this=this@entry=0x4818000, op=..., osdmap=...) at osd/OSD.cc:8160
#7 0x000000000069ae1e in OSD::dispatch_op_fast (this=this@entry=0x4818000, op=..., osdmap=...) at osd/OSD.cc:5758
#8 0x000000000069afb8 in OSD::dispatch_session_waiting (this=this@entry=0x4818000, session=session@entry=0x6866800, osdmap=...) at osd/OSD.cc:5402
#9 0x000000000069b39e in OSD::ms_fast_dispatch (this=0x4818000, m=<optimized out>) at osd/OSD.cc:5512
#10 0x0000000000c21db6 in ms_fast_dispatch (m=0x513b600, this=0x4186700) at msg/Messenger.h:503
#11 DispatchQueue::fast_dispatch (this=0x41868b8, m=0x513b600) at msg/DispatchQueue.cc:71
#12 0x0000000000c46836 in Pipe::reader (this=0x5024c00) at msg/Pipe.cc:1591
#13 0x0000000000c4f4ad in Pipe::Reader::entry (this=<optimized out>) at msg/Pipe.h:50
#14 0x00007f6c976c0182 in start_thread (arg=0x7f6c90fc5700) at pthread_create.c:312
#15 0x00007f6c95c2c38d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
vs
Thread 59 (Thread 0x7f6c84cba700 (LWP 29734)):
#0 pthread_cond_wait@@GLIBC_2.3.2 () at ../nptl/sysdeps/unix/sysv/linux/x86_64/pthread_cond_wait.S:185
#1 0x0000000000c347e5 in Wait (mutex=..., this=0x5024e18) at ./common/Cond.h:55
#2 Pipe::stop_and_wait (this=this@entry=0x5024c00) at msg/Pipe.cc:1437
#3 0x0000000000b54f08 in SimpleMessenger::mark_down (this=0x4186700, con=<optimized out>) at msg/SimpleMessenger.cc:643
#4 0x0000000000669f39 in OSD::require_same_peer_instance (this=this@entry=0x4818000, op=..., map=..., is_fast_dispatch=is_fast_dispatch@entry=false) at osd/OSD.cc:6764
#5 0x000000000067829e in OSD::require_same_or_newer_map (this=this@entry=0x4818000, op=..., epoch=207, is_fast_dispatch=is_fast_dispatch@entry=false) at osd/OSD.cc:6808
#6 0x00000000006a0ef7 in OSD::handle_pg_log (this=0x4818000, op=...) at osd/OSD.cc:7347
#7 0x00000000006a3678 in OSD::dispatch_op (this=this@entry=0x4818000, op=...) at osd/OSD.cc:5696
#8 0x00000000006a8ae8 in OSD::_dispatch (this=this@entry=0x4818000, m=m@entry=0x4652300) at osd/OSD.cc:5843
#9 0x00000000006a91a7 in OSD::ms_dispatch (this=0x4818000, m=0x4652300) at osd/OSD.cc:5386
#10 0x0000000000c22d69 in ms_deliver_dispatch (m=0x4652300, this=0x4186700) at msg/Messenger.h:532
#11 DispatchQueue::entry (this=0x41868b8) at msg/DispatchQueue.cc:185
#12 0x0000000000b5f0bd in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/DispatchQueue.h:104
#13 0x00007f6c976c0182 in start_thread (arg=0x7f6c84cba700) at pthread_create.c:312
#14 0x00007f6c95c2c38d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
also
Thread 56 (Thread 0x7f6c834b7700 (LWP 29737)):
#0 __lll_lock_wait () at ../nptl/sysdeps/unix/sysv/linux/x86_64/lowlevellock.S:135
#1 0x00007f6c976c2657 in _L_lock_909 () from /lib/x86_64-linux-gnu/libpthread.so.0
#2 0x00007f6c976c2480 in __GI___pthread_mutex_lock (mutex=0x4186b88) at ../nptl/pthread_mutex_lock.c:79
#3 0x0000000000b32cef in Mutex::Lock (this=this@entry=0x4186b78, no_lockdep=no_lockdep@entry=false) at common/Mutex.cc:91
#4 0x0000000000b5a64b in Locker (m=..., this=<synthetic pointer>) at ./common/Mutex.h:115
#5 SimpleMessenger::get_connection (this=0x4186700, dest=...) at msg/SimpleMessenger.cc:385
#6 0x00000000006628e2 in OSDService::get_con_osd_cluster (this=this@entry=0x4819710, peer=1, from_epoch=<optimized out>) at osd/OSD.cc:700
#7 0x00000000006884bd in OSD::handle_osd_ping (this=this@entry=0x4818000, m=m@entry=0x62fcee0) at osd/OSD.cc:3763
#8 0x0000000000689aab in OSD::heartbeat_dispatch (this=0x4818000, m=0x62fcee0) at osd/OSD.cc:5344
#9 0x0000000000c22d69 in ms_deliver_dispatch (m=0x62fcee0, this=0x4188300) at msg/Messenger.h:532
#10 DispatchQueue::entry (this=0x41884b8) at msg/DispatchQueue.cc:185
#11 0x0000000000b5f0bd in DispatchQueue::DispatchThread::entry (this=<optimized out>) at msg/DispatchQueue.h:104
#12 0x00007f6c976c0182 in start_thread (arg=0x7f6c834b7700) at pthread_create.c:312
#13 0x00007f6c95c2c38d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:111
full thread dump attached
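Reading the three traces together: Thread 12 is the reader thread of pipe 0x5024c00, still inside fast dispatch, blocked on Mutex 0x4186b78 in SimpleMessenger::mark_down (SimpleMessenger.cc:636). Thread 59 is in the same mark_down at line 643, inside Pipe::stop_and_wait for that same pipe 0x5024c00, so it presumably holds that mutex while it waits for the reader to stop. Thread 56 is queued behind the same mutex in get_connection. The cycle can be sketched with a toy model; this is not Ceph code, every name below is hypothetical, and timeouts replace the indefinite waits so the sketch terminates:

```cpp
#include <cassert>
#include <chrono>
#include <condition_variable>
#include <future>
#include <mutex>
#include <thread>

using namespace std::chrono_literals;

// Outcome of one run of the toy scenario.
struct Outcome {
  bool reader_got_messenger_lock;  // did the "fast dispatch" thread get the lock?
  bool reader_stopped;             // did the "mark_down" caller see the reader exit?
};

// Hypothetical model: one lock stands in for the messenger lock, and a flag
// plus condition variable stand in for "the pipe's reader thread is running".
Outcome run_toy_mark_down_race() {
  std::timed_mutex messenger_lock;
  std::mutex pipe_lock;
  std::condition_variable reader_exited;
  bool reader_running = true;

  std::promise<void> in_fast_dispatch, lock_held;
  Outcome out{false, false};

  // Plays Thread 12: the reader, inside fast dispatch, calls mark_down.
  std::thread reader([&] {
    in_fast_dispatch.set_value();
    lock_held.get_future().wait();  // let the other thread take the lock first
    // The real code blocks forever here; a timed attempt keeps the sketch finite.
    if (messenger_lock.try_lock_for(50ms)) {
      out.reader_got_messenger_lock = true;
      messenger_lock.unlock();
      std::lock_guard<std::mutex> g(pipe_lock);
      reader_running = false;  // only reachable once dispatch has returned
      reader_exited.notify_all();
    }
  });

  // Plays Thread 59: a dispatch thread calls mark_down on the same pipe.
  in_fast_dispatch.get_future().wait();
  std::unique_lock<std::timed_mutex> held(messenger_lock);
  lock_held.set_value();
  {
    std::unique_lock<std::mutex> pl(pipe_lock);
    // Stands in for Pipe::stop_and_wait: wait for the reader to exit while
    // still holding the messenger lock.
    out.reader_stopped =
        reader_exited.wait_for(pl, 100ms, [&] { return !reader_running; });
  }
  reader.join();  // keep the lock until the reader gives up, so timing cannot change the result
  return out;
}
```

Calling run_toy_mark_down_race() should report that neither side makes progress: the reader never gets the lock and the waiter never sees the reader stop. Without the timeouts, both threads would wait on each other indefinitely.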