Actions
Bug #49963
closedCrash in OSD::ms_fast_dispatch due to call to null vtable function
% Done:
0%
Source:
Tags:
Backport:
Regression:
Yes
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
OSD
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
/a/sage-2021-03-24_06:13:24-upgrade:octopus-x-wip-sage-testing-2021-03-23-2309-distro-basic-smithi/5993446
#0 raise (sig=<optimized out>) at ../sysdeps/unix/sysv/linux/raise.c:51 #1 0x000055e42b5b53ca in reraise_fatal (signum=11) at ./src/global/signal_handler.cc:87 #2 handle_fatal_signal (signum=11) at ./src/global/signal_handler.cc:332 #3 <signal handler called> #4 0x0000000000000000 in ?? () #5 0x000055e42af9f587 in OSD::ms_fast_dispatch (this=0x55e42fefa000, m=0x55e43026cb00) at ./src/osd/OSD.cc:7186 #6 0x000055e42b954323 in Dispatcher::ms_fast_dispatch2 (m=..., this=0x55e42fefa000) at ./src/msg/Dispatcher.h:84 #7 Messenger::ms_fast_dispatch (m=..., this=<optimized out>) at ./src/msg/Messenger.h:685 #8 DispatchQueue::fast_dispatch (this=0x55e42fd72c28, m=...) at ./src/msg/DispatchQueue.cc:74 #9 0x000055e42b98758b in DispatchQueue::fast_dispatch (m=0x55e43026cb00, this=<optimized out>) at ./src/msg/DispatchQueue.h:203 #10 ProtocolV2::handle_message (this=this@entry=0x55e43257c500) at ./src/msg/async/ProtocolV2.cc:1482 #11 0x000055e42b998d90 in ProtocolV2::handle_read_frame_dispatch (this=this@entry=0x55e43257c500) at ./src/msg/async/ProtocolV2.cc:1140 #12 0x000055e42b998ee9 in ProtocolV2::_handle_read_frame_epilogue_main (this=this@entry=0x55e43257c500) at ./src/msg/async/ProtocolV2.cc:1328 #13 0x000055e42b99a77e in ProtocolV2::handle_read_frame_epilogue_main (this=0x55e43257c500, buffer=..., r=0) at ./src/msg/async/ProtocolV2.cc:1303 #14 0x000055e42b982a74 in ProtocolV2::run_continuation (this=0x55e43257c500, continuation=...) at ./src/msg/async/ProtocolV2.cc:47 #15 0x000055e42b95bda8 in std::function<void (char*, long)>::operator()(char*, long) const (__args#1=<optimized out>, __args#0=<optimized out>, this=0x55e4324457a0) at /usr/include/c++/7/bits/std_function.h:706 #16 AsyncConnection::process (this=0x55e432445400) at ./src/msg/async/AsyncConnection.cc:454 #17 0x000055e42b7a29ed in EventCenter::process_events (this=this@entry=0x55e42f0cf4c0, timeout_microseconds=<optimized out>, timeout_microseconds@entry=30000000, working_dur=working_dur@entry=0x7f40f17de668) at ./src/msg/async/Event.cc:422 #18 0x000055e42b7a7b80 in NetworkStack::<lambda()>::operator() (__closure=0x55e42f1a3238) at ./src/msg/async/Stack.cc:52 #19 std::_Function_handler<void(), NetworkStack::add_thread(Worker*)::<lambda()> >::_M_invoke(const std::_Any_data &) (__functor=...) at /usr/include/c++/7/bits/std_function.h:316 #20 0x00007f40f53536df in ?? () from /usr/lib/x86_64-linux-gnu/libstdc++.so.6 #21 0x00007f40f5c706db in start_thread (arg=0x7f40f17e1700) at pthread_create.c:463 #22 0x00007f40f4a1071f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95
#4 0x0000000000000000 in ?? ()
That's a telltale that we called a null function.
7185 // note sender epoch, min req's epoch 7186 op->sent_epoch = static_cast<MOSDFastDispatchOp*>(m)->get_map_epoch(); 0x000055e42af9f57e <+670>: mov (%rbx),%rax 0x000055e42af9f581 <+673>: mov %rbx,%rdi 0x000055e42af9f584 <+676>: callq *0x48(%rax) <--------------- HERE ./obj-x86_64-linux-gnu/boost/include/boost/smart_ptr/intrusive_ptr.hpp: 199 ./obj-x86_64-linux-gnu/boost/include/boost/smart_ptr/intrusive_ptr.hpp: No such file or directory. => 0x000055e42af9f587 <+679>: mov -0xa8(%rbp),%rdx 0x000055e42af9f58e <+686>: test %rdx,%rdx 0x000055e42af9f591 <+689>: je 0x55e42af9f93d <OSD::ms_fast_dispatch(Message*)+1629>
The issue is happening in the instruction before the instruction pointer at +676 which is common in my experience.
(gdb) info reg rax rax 0x55e42c879cf8 94438487989496 (gdb) p/x 0x55e42c879cf8+0x48 $10 = 0x55e42c879d40 (gdb) x/a 0x55e42c879d40 0x55e42c879d40 <_ZTV12MOSDPGCreate>: 0x0
$ c++filt _ZTV12MOSDPGCreate vtable for MOSDPGCreate
So at least that section of the vtable for the message, which was an MOSDPGCreate, was zeroed out.
It's curious that $rax hints at MOSDPGInfo.
(gdb) x/a $rax 0x55e42c879cf8 <_ZTV10MOSDPGInfo+16>: 0x55e42b747710 <MOSDPGInfo::~MOSDPGInfo()>
Actions