Bug #44346

Segmentation fault in rdmastack send

Added by chunsong feng 7 months ago. Updated about 1 month ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature:

Description

Using rdma protocol stack, segmentation fault occurs during packet sending.

Using host libthread_db library "/lib/aarch64-linux-gnu/libthread_db.so.1".
Core was generated by `/usr/bin/ceph-osd -f --cluster ceph --id 37 --setuser ceph --setgroup ceph'.
Program terminated with signal SIGSEGV, Segmentation fault.
#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
51 ../sysdeps/unix/sysv/linux/raise.c: No such file or directory.
[Current thread is 1 (Thread 0xffffa10bf9f0 (LWP 377856))]
(gdb) bt
#0 raise (sig=sig@entry=11) at ../sysdeps/unix/sysv/linux/raise.c:51
#1 0x0000aaaabc1bba1c in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:326
#2 handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:326
#3 <signal handler called>
#4 _memcpy_generic () at ../sysdeps/aarch64/multiarch/../memcpy.S:105
#5 0x0000aaaabc582d00 in memcpy (__len=52, __src=0xaaab4e327000, __dest=<optimized out>) at /usr/include/aarch64-linux-gnu/bits/string_fortified.h:34
#6 Infiniband::MemoryManager::Chunk::write (this=this@entry=0xaaab5560f6c0, buf=buf@entry=0xaaab4e327000 "\002", <incomplete sequence \302>, len=<optimized out>)
at ./src/msg/async/rdma/Infiniband.cc:738
#7 0x0000aaaabc591018 in RDMAConnectedSocketImpl::tx_copy_chunk (this=this@entry=0xaaab53527d40, tx_buffers=std::vector of length 1, capacity 1 = {...},
req_copy_len=req_copy_len@entry=243, start=..., end=...) at ./src/include/buffer.h:317
#8 0x0000aaaabc592aec in RDMAConnectedSocketImpl::submit (this=this@entry=0xaaab53527d40, more=more@entry=false)
at ./src/msg/async/rdma/RDMAConnectedSocketImpl.cc:380
#9 0x0000aaaabc594c34 in RDMAConnectedSocketImpl::send (this=0xaaab53527d40, bl=..., more=<optimized out>) at ./src/msg/async/rdma/RDMAConnectedSocketImpl.cc:297
#10 0x0000aaaabc5304d4 in ConnectedSocket::send (more=false, bl=..., this=0xaaab575d8ee8) at /usr/include/c++/9/bits/unique_ptr.h:357
#11 AsyncConnection::_try_send (this=this@entry=0xaaab575d8d00, more=more@entry=false) at ./src/msg/async/AsyncConnection.cc:330
#12 0x0000aaaabc530ec4 in AsyncConnection::write(ceph::buffer::v14_2_0::list&, std::function<void (long)>, bool) (this=this@entry=0xaaab575d8d00, bl=...,
callback=..., more=more@entry=false) at ./src/msg/async/AsyncConnection.cc:309
#13 0x0000aaaabc55eee0 in ProtocolV2::write (this=this@entry=0xaaab5350cc00, desc="auth request", next=..., buffer=...) at /usr/include/c++/9/bits/std_function.h:87
#14 0x0000aaaabc5679b0 in ProtocolV2::write<ceph::msgr::v2::AuthRequestFrame> (frame=..., next=..., desc="auth request", this=0xaaab5350cc00)
at ./src/msg/async/ProtocolV2.cc:760
#15 ProtocolV2::send_auth_request (this=this@entry=0xaaab5350cc00, allowed_methods=std::vector of length 0, capacity 0) at ./src/msg/async/ProtocolV2.cc:1763
#16 0x0000aaaabc567ea8 in ProtocolV2::send_auth_request (this=0xaaab5350cc00) at ./src/msg/async/ProtocolV2.h:215
#17 ProtocolV2::post_client_banner_exchange (this=0xaaab5350cc00) at ./src/msg/async/ProtocolV2.cc:1731
#18 0x0000aaaabc55e770 in ProtocolV2::run_continuation (this=0xaaab5350cc00, continuation=...) at ./src/msg/async/ProtocolV2.cc:45
#19 0x0000aaaabc532f40 in std::function<void (char*, long)>::operator()(char*, long) const (__args#1=0, __args#0=0xaaab534aa460 "ceph v2\n\020", this=0xaaab575d9110) at /usr/include/c++/9/bits/std_function.h:685
#20 AsyncConnection::process (this=0xaaab575d8d00) at ./src/msg/async/AsyncConnection.cc:457
#21 0x0000aaaabc383edc in EventCenter::process_events (this=this@entry=0xaaaafe0ee608, timeout_microseconds=<optimized out>,
working_dur=working_dur@entry=0xffffa10bf160) at /usr/include/c++/9/bits/basic_ios.h:282
#22 0x0000aaaabc38a0f0 in NetworkStack::<lambda()>::operator() (__closure=0xaaaafe15e488, __closure=0xaaaafe15e488) at ./src/msg/async/Stack.cc:53
#23 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/9/bits/std_function.h:300
#24 0x0000ffffa1d42ed4 in ?? () from /usr/lib/aarch64-linux-gnu/libstdc++.so.6
#25 0x0000ffffa1ef8088 in start_thread (arg=0xffffd84b1edf) at pthread_create.c:463
#26 0x0000ffffa1b124ec in thread_start () at ../sysdeps/unix/sysv/linux/aarch64/clone.S:78

An assertion was added in RDMADispatcher::handle_tx_event to find the root cause; the resulting coredump is in ceph_rdma_bug.txt.
The wr_id is actually a QueuePair pointer, and the value read at QueuePair + offsetof(Chunk, buffer) is random. When that value happens to satisfy is_tx_buffer, the pointer is treated as a Chunk and its resources are wrongly freed.

invalid_chunk.txt View (117 KB) chunsong feng, 02/28/2020 07:41 AM

ceph_rdma_bug.txt View - add assert in handle_tx_event (119 KB) chunsong feng, 02/28/2020 07:42 AM

History

#1 Updated by chunsong feng 7 months ago

The address at offsetof(Chunk, buffer) maps onto the Infiniband::QueuePair object: the upper four bytes there are random values and the lower four bytes are ib_physical_port. The resulting value may fall into the tx buffer address range and cause a misjudgment.
(gdb) p &((Chunk *)0)->buffer
$13 = (char **) 0x20

(gdb) p this
$14 = (Infiniband::QueuePair * const) 0x55cf4d576820
(gdb) p &((Infiniband::QueuePair *)0)->ib_physical_port
(gdb) p this
$15 = (Infiniband::QueuePair * const) 0x55cf4d576820
(gdb) p &this->ib_physical_port
$16 = (int *) 0x55cf4d576840
(gdb) p &this->pd
$17 = (ibv_pd **) 0x55cf4d576848
(gdb) p sizeof(this->ib_physical_port)
$18 = 4

#2 Updated by Josh Durgin 5 months ago

  • Project changed from bluestore to Messengers

#3 Updated by chunsong feng about 1 month ago

msg/async/rdma: Retry upon modify_qp failure during link setup

When multiple clients initiate tests concurrently, thousands of
connections are established at the same time. The modify_qp interface
needs to be invoked multiple times for each connection. During the
pressure test, link setup failed several times in one night because
modify_qp failed, and the OSD exited. modify_qp fails because the high
concurrency causes the NIC to time out while processing the command.
Retrying improves the success rate and avoids restarting the entire OSD
because of a single link failure.

Signed-off-by: Chunsong Feng <>

commit c94c3bcd466f233332d31c674f2801462d8ec19d
Author: Chunsong Feng <>
Date: Fri Aug 21 08:11:47 2020 +0800

msg/async/rdma: Hash the RoCE interrupts to different cores
The CQ interrupts are hashed to different cores to avoid the
single-core bottleneck. Use a script to bind the queue interrupt to the
NUMA node where the OSD is located to reduce the latency.
Signed-off-by: Chunsong Feng <>
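The binding script mentioned above might look roughly like this. The driver pattern ("mlx5"), the CPU list, and the round-robin assignment are assumptions to be adjusted per host; the mechanism itself is the standard one of writing a hex CPU mask to /proc/irq/<irq>/smp_affinity:

```shell
#!/bin/sh
# Hypothetical sketch: spread each RoCE CQ interrupt over the cores of the
# NUMA node hosting the OSD, to avoid a single-core interrupt bottleneck.

# Hex smp_affinity mask selecting a single CPU.
mask_for_cpu() {
    printf '%x\n' $(( 1 << $1 ))
}

NIC_PATTERN=${NIC_PATTERN:-mlx5}   # driver name to match in /proc/interrupts
CPUS=${CPUS:-"8 9 10 11"}          # cores of the OSD's NUMA node (assumed)

bind_irqs() {
    set -- $CPUS
    for irq in $(awk -v pat="$NIC_PATTERN" \
                 '$0 ~ pat { gsub(":", "", $1); print $1 }' /proc/interrupts); do
        # Round-robin ("hash") the NIC interrupts over the chosen cores.
        cpu=$1; shift
        [ $# -eq 0 ] && set -- $CPUS
        mask_for_cpu "$cpu" > "/proc/irq/$irq/smp_affinity" 2>/dev/null || true
    done
}

# Changing IRQ affinity requires root and a readable /proc/interrupts.
if [ "$(id -u)" -eq 0 ] && [ -r /proc/interrupts ]; then
    bind_irqs
fi
```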

#4 Updated by chunsong feng about 1 month ago

When multiple clients initiate tests concurrently, thousands of
connections are established at the same time. The modify_qp interface
needs to be invoked multiple times for each connection. During the
pressure test, link setup failed several times in one night because
modify_qp failed, and the OSD exited. modify_qp fails because the high
concurrency causes the NIC to time out while processing the command.
Retrying improves the success rate and avoids restarting the entire OSD
because of a single link failure.
