Bug #44298
RDMADispatcher::enqueue_dead_qp deadlock
Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
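In RDMADispatcher::handle_async_event, when the IBV_EVENT_QP_LAST_WQE_REACHED case matches, the handler acquires lock and then calls enqueue_dead_qp while still holding it.
enqueue_dead_qp tries to acquire the same lock again. Because it is a non-recursive std::mutex (see the backtrace below), the polling thread deadlocks on itself.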
void RDMADispatcher::handle_async_event()
{
  ldout(cct, 30) << __func__ << dendl;
  while (1) {
    ...
    switch (async_event.event_type) {
      ...
      case IBV_EVENT_QP_LAST_WQE_REACHED:
        {
          ...
          std::lock_guard l{lock};  // lock is acquired here and held to the end of this block
          RDMAConnectedSocketImpl *conn = get_conn_lockless(qpn);
          QueuePair* qp = get_qp_lockless(qpn);
          ...
            if (qp) {
              if (!cct->_conf->ms_async_rdma_cm)
                enqueue_dead_qp(qpn);  // called while lock is still held
            }
          }
        }
        break;
      ...
    }
    ibv_ack_async_event(&async_event);
  }
}

void RDMADispatcher::enqueue_dead_qp(uint32_t qpn)
{
  std::lock_guard l{lock};  // second acquisition of the same lock: blocks forever
  ...
}
Thread 37 (Thread 0xffffad5ea080 (LWP 1038236)):
#0 0x0000ffffb660a618 in __lll_lock_wait (futex=futex@entry=0xaaaaac67dbe8, private=0) at lowlevellock.c:46
#1 0x0000ffffb66037d4 in __GI___pthread_mutex_lock (mutex=0xaaaaac67dbe8) at pthread_mutex_lock.c:78
#2 0x0000ffffb6d6684c in __gthread_mutex_lock (__mutex=0xaaaaac67dbe8) at /usr/include/aarch64-linux-gnu/c++/9/bits/gthr-default.h:749
#3 std::mutex::lock (this=0xaaaaac67dbe8) at /usr/include/c++/9/bits/std_mutex.h:100
#4 std::lock_guard<std::mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/9/bits/std_mutex.h:159
#5 RDMADispatcher::enqueue_dead_qp (this=this@entry=0xaaaaac67db90, qpn=qpn@entry=132511) at /root/chunsong/ceph/src/msg/async/rdma/RDMAStack.cc:426
#6 0x0000ffffb6d6b4ec in RDMADispatcher::handle_async_event (this=this@entry=0xaaaaac67db90) at /root/chunsong/ceph/src/msg/async/rdma/RDMAStack.cc:182
#7 0x0000ffffb6d6d75c in RDMADispatcher::polling (this=0xaaaaac67db90) at /root/chunsong/ceph/src/msg/async/rdma/RDMAStack.cc:321
#8 0x0000ffffb64aaed4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#9 0x0000ffffb6601088 in start_thread (arg=0xffffaca922cf) at pthread_create.c:463
#10 0x0000ffffb627a4ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
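One possible way to avoid the double acquisition is to split the function into a locking entry point and a variant that assumes the caller already holds lock, in the same style as get_conn_lockless and get_qp_lockless. The following is only a minimal sketch of that pattern, not the actual fix: the class name Dispatcher, the method names enqueue_dead_qp_lockless, handle_last_wqe_reached and dead_qp_count, and the dead_qps container are all hypothetical stand-ins, and only the locking structure is modelled.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Hypothetical stand-in for RDMADispatcher; only the locking pattern is modelled.
class Dispatcher {
  std::mutex lock;                 // non-recursive, like the std::mutex in the backtrace
  std::vector<uint32_t> dead_qps;  // hypothetical container for dead queue pair numbers

  // Assumed helper: the caller must already hold `lock`.
  void enqueue_dead_qp_lockless(uint32_t qpn) {
    dead_qps.push_back(qpn);
  }

public:
  // Entry point for callers that do NOT hold `lock`.
  void enqueue_dead_qp(uint32_t qpn) {
    std::lock_guard<std::mutex> l{lock};
    enqueue_dead_qp_lockless(qpn);
  }

  // Models the IBV_EVENT_QP_LAST_WQE_REACHED branch: `lock` is taken once,
  // and the lockless variant is called so the mutex is never re-acquired.
  void handle_last_wqe_reached(uint32_t qpn) {
    std::lock_guard<std::mutex> l{lock};
    enqueue_dead_qp_lockless(qpn);
  }

  // Returns how many queue pairs have been enqueued so far.
  std::size_t dead_qp_count() {
    std::lock_guard<std::mutex> l{lock};
    return dead_qps.size();
  }
};
```

With this split, both call paths take the mutex exactly once. An alternative would be to narrow the scope of the lock_guard in the event handler so that it is released before enqueue_dead_qp is called, but that is only safe if nothing between the lookup and the enqueue depends on the lock being held.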
Updated by Greg Farnum about 4 years ago
- Project changed from Ceph to Messengers
- Category deleted (msgr)