Bug #44298

open

RDMADispatcher::enqueue_dead_qp deadlock

Added by Hu Ye about 4 years ago. Updated about 4 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

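In RDMADispatcher::handle_async_event, the IBV_EVENT_QP_LAST_WQE_REACHED case acquires `lock` and then calls enqueue_dead_qp while still holding it.
enqueue_dead_qp acquires the same `lock` again. The lock is a non-recursive std::mutex (see the backtrace), so the thread blocks on itself and deadlocks.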

void RDMADispatcher::handle_async_event()
{
  ldout(cct, 30) << __func__ << dendl;
  while (1) {
    ...
    switch (async_event.event_type) {
      ...
      case IBV_EVENT_QP_LAST_WQE_REACHED:
        {
          ...
          std::lock_guard l{lock};  // `lock` is acquired here
          RDMAConnectedSocketImpl *conn = get_conn_lockless(qpn);
          QueuePair* qp = get_qp_lockless(qpn);
          ...
          if (qp) {
            if (!cct->_conf->ms_async_rdma_cm)
              enqueue_dead_qp(qpn);  // called with `lock` still held
          }
          }
        }
        break;
      ...
    }
    ibv_ack_async_event(&async_event);
  }
}

void RDMADispatcher::enqueue_dead_qp(uint32_t qpn)
{
  std::lock_guard l{lock};  // second acquisition of the same mutex: blocks forever
  ...
}

Backtrace of the blocked thread:

Thread 37 (Thread 0xffffad5ea080 (LWP 1038236)):
#0  0x0000ffffb660a618 in __lll_lock_wait (futex=futex@entry=0xaaaaac67dbe8, private=0) at lowlevellock.c:46
#1  0x0000ffffb66037d4 in __GI___pthread_mutex_lock (mutex=0xaaaaac67dbe8) at pthread_mutex_lock.c:78
#2  0x0000ffffb6d6684c in __gthread_mutex_lock (__mutex=0xaaaaac67dbe8) at /usr/include/aarch64-linux-gnu/c++/9/bits/gthr-default.h:749
#3  std::mutex::lock (this=0xaaaaac67dbe8) at /usr/include/c++/9/bits/std_mutex.h:100
#4  std::lock_guard<std::mutex>::lock_guard (__m=..., this=<synthetic pointer>) at /usr/include/c++/9/bits/std_mutex.h:159
#5  RDMADispatcher::enqueue_dead_qp (this=this@entry=0xaaaaac67db90, qpn=qpn@entry=132511) at /root/chunsong/ceph/src/msg/async/rdma/RDMAStack.cc:426
#6  0x0000ffffb6d6b4ec in RDMADispatcher::handle_async_event (this=this@entry=0xaaaaac67db90) at /root/chunsong/ceph/src/msg/async/rdma/RDMAStack.cc:182
#7  0x0000ffffb6d6d75c in RDMADispatcher::polling (this=0xaaaaac67db90) at /root/chunsong/ceph/src/msg/async/rdma/RDMAStack.cc:321
#8  0x0000ffffb64aaed4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#9  0x0000ffffb6601088 in start_thread (arg=0xffffaca922cf) at pthread_create.c:463
#10 0x0000ffffb627a4ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78
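
One possible way to avoid the double acquisition is to split the function into a locked public wrapper and a variant that expects the caller to hold `lock`, following the naming of get_conn_lockless and get_qp_lockless above. The sketch below is only an illustration of that pattern, not the actual Ceph code: the `Dispatcher` class, `enqueue_dead_qp_lockless`, `handle_last_wqe_reached`, `dead_qp_count` and the `dead_qps` vector are all hypothetical names.

```cpp
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <vector>

// Simplified stand-in for RDMADispatcher; only the locking pattern is modeled.
class Dispatcher {
  std::mutex lock;                      // non-recursive, as in the backtrace
  std::vector<std::uint32_t> dead_qps;  // hypothetical storage for dead QP numbers

  // Hypothetical helper: the caller must already hold `lock`.
  void enqueue_dead_qp_lockless(std::uint32_t qpn) {
    dead_qps.push_back(qpn);
  }

public:
  // Entry point for callers that do NOT hold `lock`.
  void enqueue_dead_qp(std::uint32_t qpn) {
    std::lock_guard<std::mutex> l(lock);
    enqueue_dead_qp_lockless(qpn);
  }

  // Models the IBV_EVENT_QP_LAST_WQE_REACHED branch: `lock` is already
  // held here, so the lockless variant is called instead of enqueue_dead_qp().
  void handle_last_wqe_reached(std::uint32_t qpn) {
    std::lock_guard<std::mutex> l(lock);
    enqueue_dead_qp_lockless(qpn);
  }

  // Number of queued dead QPs, read under the lock.
  std::size_t dead_qp_count() {
    std::lock_guard<std::mutex> l(lock);
    return dead_qps.size();
  }
};
```

With this split, the mutex is taken exactly once on every path, so the event handler no longer waits on a lock it already owns. Switching to std::recursive_mutex would also stop the hang, but it tends to hide which functions expect the lock to be held.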

Updated by Greg Farnum about 4 years ago

  • Project changed from Ceph to Messengers
  • Category deleted (msgr)