Bug #37910 (closed): segv during crc of incoming message front

Added by Sage Weil over 5 years ago. Updated about 5 years ago.

Status: Resolved
Priority: Urgent
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

  -207> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_MESSAGE_FRONT pgs=165 cs=23 l=0).handle_message_front r=0
  -206> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_MESSAGE_FRONT pgs=165 cs=23 l=0).handle_message_front got front 659621
  -205> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_MESSAGE_FRONT pgs=165 cs=23 l=0).read_message_middle
  -204> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_MESSAGE_FRONT pgs=165 cs=23 l=0).read_message_data_prepare
  -203> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_MESSAGE_FRONT pgs=165 cs=23 l=0).read_message_data msg_left=0
  -202> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_MESSAGE_FRONT pgs=165 cs=23 l=0).read_message_footer
  -201> 2019-01-14 00:29:22.245 7f6ea2dad700 20 -- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 msgr2=0x5615360f7600 :-1 s=STATE_CONNECTION_ESTABLISHED l=0).read start len=21
  -200> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_FOOTER_AND_DISPATCH pgs=165 cs=23 l=0).handle_message_footer r=0
  -199> 2019-01-14 00:29:22.245 7f6ea2dad700 10 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_FOOTER_AND_DISPATCH pgs=165 cs=23 l=0).handle_message_footer aborted = 0
  -198> 2019-01-14 00:29:22.245 7f6ea2dad700 20 --2- [v2:172.21.15.62:6802/33804,v1:172.21.15.62:6803/33804] >> [v2:172.21.15.26:6828/34392,v1:172.21.15.26:6829/34392] conn(0x561533806800 0x5615360f7600 :-1 s=READ_FOOTER_AND_DISPATCH pgs=165 cs=23 l=0).handle_message_footer got 659621 + 0 + 0 byte message
    -8> 2019-01-14 00:29:22.250 7f6ea2dad700 -1 *** Caught signal (Segmentation fault) **
 in thread 7f6ea2dad700 thread_name:msgr-worker-2

 ceph version 14.0.1-2510-g74ba84a (74ba84a6b0855849aee1d2fd3678e94a657542ce) nautilus (dev)
 1: (()+0xf6d0) [0x7f6ea887d6d0]
 2: (ceph::buffer::list::crc32c(unsigned int) const+0x6b) [0x561528ae4c3b]
 3: (decode_message(CephContext*, int, ceph_msg_header&, ceph_msg_footer&, ceph::buffer::list&, ceph::buffer::list&, ceph::buffer::list&, Connection*)+0x2f2) [0x561528a54a42]
 4: (ProtocolV2::handle_message_footer(char*, int)+0xeb) [0x561528c704cb]
 5: (()+0xf15bdd) [0x561528c68bdd]
 6: (AsyncConnection::process()+0x18c) [0x561528c4b2ec]
 7: (EventCenter::process_events(unsigned int, std::chrono::duration<unsigned long, std::ratio<1l, 1000000000l> >*)+0x1585) [0x561528ac44d5]
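
For context, the in-process backtrace above puts the fault inside ceph::buffer::list::crc32c(), invoked from decode_message() while it checks the front section against the footer checksum after handle_message_footer(). A minimal sketch of that check follows; fake_bufferlist, msg_footer_t and verify_front are illustrative stand-ins, not the Ceph types, and only the shape of the crc32c(seed) call matters here.

    #include <cstdint>
    #include <iostream>
    #include <vector>

    // Stand-in for ceph_msg_footer: the real footer carries per-section
    // crc32c values (front_crc, middle_crc, data_crc).
    struct msg_footer_t {
      uint32_t front_crc = 0;
    };

    // Stand-in for ceph::buffer::list; only the crc32c(seed) call shape matters.
    struct fake_bufferlist {
      std::vector<uint8_t> bytes;
      uint32_t crc32c(uint32_t seed) const {
        // Toy checksum, not real crc32c; the point is that this call walks the
        // buffer contents, and that walk is where the reported segfault fires.
        uint32_t c = seed;
        for (uint8_t b : bytes) c = (c << 1) ^ b;
        return c;
      }
    };

    // Shape of the check performed once the footer has been read and the
    // message is decoded with CRC checking enabled.
    bool verify_front(const fake_bufferlist& front, const msg_footer_t& footer) {
      return front.crc32c(0) == footer.front_crc;  // crash is inside crc32c()
    }

    int main() {
      fake_bufferlist front{{1, 2, 3, 4}};
      msg_footer_t footer;
      footer.front_crc = front.crc32c(0);
      std::cout << (verify_front(front, footer) ? "front crc ok" : "front crc mismatch") << "\n";
    }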

(gdb) bt
#0  0x00007f6ea887d59b in raise () from /lib64/libpthread.so.0
#1  0x00005615288b4f65 in reraise_fatal (signum=11) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/global/signal_handler.cc:81
#2  handle_fatal_signal (signum=11) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/global/signal_handler.cc:298
#3  <signal handler called>
#4  0x0000561528ae4c3b in test_and_set (__m=std::memory_order_acquire, this=0x50) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/atomic_base.h:176
#5  spin_lock (lock=...) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/include/spinlock.h:50
#6  lock (this=0x50) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/include/spinlock.h:39
#7  lock_guard (__m=..., this=<synthetic pointer>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/std_mutex.h:162
#8  get_crc (crc=<synthetic pointer>, fromto=<synthetic pointer>, this=0x0) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/include/buffer_raw.h:102
#9  ceph::buffer::list::crc32c (this=this@entry=0x5615360f7848, crc=crc@entry=0) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/common/buffer.cc:1924
#10 0x0000561528a54a42 in decode_message (cct=0x561532bde000, crcflags=3, header=..., footer=..., front=..., middle=..., data=..., conn=0x561533806800) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/msg/Message.cc:297
#11 0x0000561528c704cb in ProtocolV2::handle_message_footer (this=0x5615360f7600, buffer=<optimized out>, r=<optimized out>) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/msg/async/ProtocolV2.cc:915
#12 0x0000561528c68bdd in operator() (r=<optimized out>, buffer=<optimized out>, __closure=0x561533806bb8) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/msg/async/ProtocolV2.cc:430
#13 std::_Function_handler<void(char*, long int), ProtocolV2::read(CtFun<ProtocolV2, char*, int>*, int, char*)::<lambda(char*, int)> >::_M_invoke(const std::_Any_data &, <unknown type in /usr/lib/debug/usr/bin/ceph-osd.debug, CU 0xc547871, DIE 0xc6208cd>, <unknown type in /usr/lib/debug/usr/bin/ceph-osd.debug, CU 0xc547871, DIE 0xc6208de>) (__functor=..., __args#0=<optimized out>, __args#1=<optimized out>) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/std_function.h:316
#14 0x0000561528c4b2ec in operator() (__args#1=<optimized out>, __args#0=<optimized out>, this=0x561533806bb8) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/std_function.h:706
#15 AsyncConnection::process (this=0x561533806800) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/msg/async/AsyncConnection.cc:429
#16 0x0000561528ac44d5 in EventCenter::process_events (this=this@entry=0x5615336c1080, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f6ea2daab60) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/msg/async/Event.cc:415
#17 0x0000561528ac9c37 in operator() (__closure=0x5615336ae518) at /usr/src/debug/ceph-14.0.1-2510-g74ba84a/src/msg/async/Stack.cc:53
#18 std::_Function_handler<void(), NetworkStack::add_thread(unsigned int)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /opt/rh/devtoolset-7/root/usr/include/c++/7/bits/std_function.h:316
#19 0x0000561528fcf42f in execute_native_thread_routine ()
#20 0x00007f6ea8875e25 in start_thread () from /lib64/libpthread.so.0
#21 0x00007f6ea773ebad in clone () from /lib64/libc.so.6

/a/sage-2019-01-13_22:11:18-rados-wip-sage-testing-2019-01-13-0915-distro-basic-smithi/3458928
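
A note on the faulting addresses in the gdb trace: frame #8 shows buffer_raw.h get_crc() entered with this=0x0 and frame #4 shows the spin lock's test_and_set() on this=0x50, which looks like a null ceph::buffer::raw pointer being dereferenced at the offset of its CRC-cache lock. Below is a hedged, self-contained sketch of that failure shape; raw_buf, segment and crc32c_over are simplified stand-ins, not Ceph source, and the explicit null check only marks the spot where the real walk would fault if a segment's raw pointer were null (e.g. if the front bufferlist were torn down or reused while the CRC was still being computed).

    #include <atomic>
    #include <cstddef>
    #include <cstdint>
    #include <iostream>
    #include <map>
    #include <utility>
    #include <vector>

    // Simplified stand-in for ceph::buffer::raw: payload bytes plus a cache of
    // previously computed CRCs, guarded by a small spin lock (the real code
    // spins on an atomic_flag, which is what frame #4's test_and_set refers to).
    struct raw_buf {
      std::vector<uint8_t> data;
      std::atomic<bool> crc_lock{false};
      std::map<std::pair<size_t, size_t>, uint32_t> crc_cache;

      bool get_cached_crc(std::pair<size_t, size_t> range, uint32_t* out) {
        while (crc_lock.exchange(true, std::memory_order_acquire)) {}  // spin
        auto it = crc_cache.find(range);
        bool hit = (it != crc_cache.end());
        if (hit) *out = it->second;
        crc_lock.store(false, std::memory_order_release);
        return hit;
      }
    };

    // Simplified stand-in for ceph::buffer::ptr: a view into a raw_buf.
    struct segment {
      raw_buf* raw = nullptr;  // null models the bad pointer seen in frame #8
      size_t off = 0;
      size_t len = 0;
    };

    // Simplified stand-in for buffer::list::crc32c(): walk the segments and
    // consult each raw buffer's CRC cache first. Without the null check below,
    // a segment whose raw pointer is null gets dereferenced as soon as the lock
    // is touched, i.e. at a small fixed offset from address 0, which is what
    // the this=0x0 (get_crc) and this=0x50 (test_and_set) frames look like.
    uint32_t crc32c_over(const std::vector<segment>& segs, uint32_t crc) {
      for (const auto& s : segs) {
        if (s.raw == nullptr)  // defensive guard for this sketch only
          continue;
        uint32_t cached = 0;
        if (s.raw->get_cached_crc({s.off, s.off + s.len}, &cached)) {
          crc ^= cached;  // toy combination; the real code folds CRCs properly
        } else {
          for (size_t i = s.off; i < s.off + s.len && i < s.raw->data.size(); ++i)
            crc = (crc << 1) ^ s.raw->data[i];  // toy checksum, not real crc32c
        }
      }
      return crc;
    }

    int main() {
      raw_buf rb;
      rb.data = {1, 2, 3, 4};
      // The second segment mimics a ptr whose underlying raw buffer is missing.
      std::vector<segment> segs;
      segs.push_back({&rb, 0, 4});
      segs.push_back({nullptr, 0, 0});
      std::cout << "crc=" << crc32c_over(segs, 0) << "\n";
      return 0;
    }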

Related issues: 1 (0 open, 1 closed)

Related to RADOS - Bug #38023: segv on FileJournal::prepare_entry in bufferlist (Closed, Radoslaw Zarzynski, 01/23/2019)
