Bug #43739
closed
radosgw abort caused by beast frontend coroutine stack overflow
% Done:
0%
Source:
Support
Tags:
Backport:
nautilus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
Reproducer flow:
Compile a Debug build of radosgw:
./do_cmake.sh -DCMAKE_BUILD_TYPE=Debug
Block the RADOS TCP port range (6800-7300) on the loopback interface:
sudo iptables -I INPUT -i lo -m multiport -p tcp --dports 6800:7300 -j DROP
Generate a large-object PUT load:
for i in {1..800}; do s3cmd --access_key=b2345678901234567890 --secret_key=b234567890123456789012345678901234567890 --signature-v2 put ./ubuntu-13.04-server-amd64.iso s3://bkt & done
Wait for all s3cmd instances to start, then unblock the RADOS connection:
sudo iptables -D INPUT 1
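The steps above can be collected into one script. This is only a sketch: the `run`, `run_bg`, and `repro` helpers, the `DRY_RUN`, `CLIENTS`, `SETTLE_SECONDS`, `OBJECT`, `BUCKET`, and `S3_*` variables, and the fixed sleep used to approximate "wait for all s3cmd instances to start" are assumptions made for illustration and are not part of the original report. By default it only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Sketch of the reproducer flow. All helper and variable names here are
# hypothetical; the iptables and s3cmd invocations are taken from the report.
set -u

DRY_RUN="${DRY_RUN:-1}"                  # 1 = print commands only (default)
CLIENTS="${CLIENTS:-800}"                # number of parallel s3cmd uploads
SETTLE_SECONDS="${SETTLE_SECONDS:-30}"   # assumed delay for clients to start
OBJECT="${OBJECT:-./ubuntu-13.04-server-amd64.iso}"
BUCKET="${BUCKET:-s3://bkt}"
S3_ACCESS_KEY="${S3_ACCESS_KEY:-ACCESS_KEY_PLACEHOLDER}"
S3_SECRET_KEY="${S3_SECRET_KEY:-SECRET_KEY_PLACEHOLDER}"

# Run a command in the foreground, or just print it in dry-run mode.
run() {
  if [ "$DRY_RUN" = "1" ]; then printf '+ %s\n' "$*"; else "$@"; fi
}

# Run a command in the background, or just print it in dry-run mode.
run_bg() {
  if [ "$DRY_RUN" = "1" ]; then printf '+ %s &\n' "$*"; else "$@" & fi
}

repro() {
  # 1. Block the RADOS port range on loopback (-I inserts at rule position 1).
  run sudo iptables -I INPUT -i lo -m multiport -p tcp --dports 6800:7300 -j DROP

  # 2. Start the large-object PUT load in the background.
  for _ in $(seq 1 "$CLIENTS"); do
    run_bg s3cmd --access_key="$S3_ACCESS_KEY" --secret_key="$S3_SECRET_KEY" \
      --signature-v2 put "$OBJECT" "$BUCKET"
  done

  # 3. Give the clients time to start, then delete rule 1 again.
  #    This assumes no other rule was inserted at the top of INPUT meanwhile.
  run sleep "$SETTLE_SECONDS"
  run sudo iptables -D INPUT 1

  # Wait for any background uploads (no-op in dry-run mode).
  wait
}

# Call `repro` after sourcing this file; set DRY_RUN=0 to execute for real.
```

Running `DRY_RUN=1 CLIENTS=2 repro` prints the block rule, two s3cmd commands, the sleep, and the unblock rule in order, which makes it easy to check the sequence before running it with `DRY_RUN=0` against a Debug build.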
The radosgw log reports the RADOS TCP disconnection, and radosgw then aborts.
Log excerpt:
2020-01-09T17:11:28.716+0200 7f70336b6700 -1 rgw realm watcher: RGWRealmWatcher::handle_error oid=realms.996e28b9-ae01-440d-befd-a082a1623f70.control err=-107
2020-01-09T17:11:29.115+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 58961024 err (107) Transport endpoint is not connected
2020-01-09T17:11:29.116+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 58965312 err (107) Transport endpoint is not connected
2020-01-09T17:11:29.118+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 58953360 err (107) Transport endpoint is not connected
2020-01-09T17:11:29.119+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 58960224 err (107) Transport endpoint is not connected
2020-01-09T17:11:29.121+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 140131361122240 err (107) Transport endpoint is not connected
2020-01-09T17:11:29.123+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 58952224 err (107) Transport endpoint is not connected
2020-01-09T17:11:29.125+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 140128542709104 err (107) Transport endpoint is not connected
2020-01-09T17:11:29.126+0200 7f753b7fe700 -1 RGWWatcher::handle_error cookie 140128543179216 err (107) Transport endpoint is not connected
free(): invalid size
*** Caught signal (Aborted) **
 in thread 7f70e67fc700 thread_name:radosgw
 ceph version 14.0.0-18987-g2ca2221ddb (2ca2221ddbd600c0a0213b81a95c30a6c4f2163d) octopus (dev)
 1: ./bin/radosgw() [0x121998f]
 2: (()+0x14b20) [0x7f75702c8b20]
 3: (gsignal()+0x145) [0x7f756fa55625]
 4: (abort()+0x12b) [0x7f756fa3e8d9]
 5: (()+0x804af) [0x7f756fa994af]
 6: (()+0x87a9c) [0x7f756faa0a9c]
 7: (()+0x894ac) [0x7f756faa24ac]
 8: (boost::coroutines::basic_standard_stack_allocator<boost::coroutines::stack_traits>::deallocate(boost::coroutines::stack_context&)+0x11f) [0x11b021f]
 9: ./bin/radosgw() [0x102a655]
 10: ./bin/radosgw() [0x1022208]
 11: (boost::coroutines::push_coroutine<void>::~push_coroutine()+0x5c) [0x11b088c]
 12: (std::_Sp_counted_ptr<boost::coroutines::push_coroutine<void>*, (__gnu_cxx::_Lock_policy)2>::_M_dispose()+0x53) [0x11b0a43]
 13: (std::_Sp_counted_base<(__gnu_cxx::_Lock_policy)2>::_M_release()+0x74) [0xf2b924]
 14: (std::__shared_count<(__gnu_cxx::_Lock_policy)2>::~__shared_count()+0x52) [0xf2b872]
 15: (std::__shared_ptr<boost::coroutines::push_coroutine<void>, (__gnu_cxx::_Lock_policy)2>::~__shared_ptr()+0x4f) [0x10933af]
 16: (std::shared_ptr<boost::coroutines::push_coroutine<void> >::~shared_ptr()+0x48) [0x106db78]
 17: (boost::asio::detail::coro_handler<boost::asio::executor_binder<void (*)(), boost::asio::executor>, unsigned long>::~coro_handler()+0x5f) [0x1081b9f]
 18: (boost::beast::async_base<boost::asio::detail::coro_handler<boost::asio::executor_binder<void (*)(), boost::asio::executor>, unsigned long>, boost::asio::executor, std::allocator<void> >
 19: (boost::beast::detail::dynamic_read_ops::read_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::executor>, boost::beast::flat_static_buffer<65536ul>, boost::beast::>
 20: (boost::asio::detail::binder2<boost::beast::detail::dynamic_read_ops::read_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::executor>, boost::beast::flat_static_bu>
 21: (boost::asio::detail::executor_function<boost::asio::detail::binder2<boost::beast::detail::dynamic_read_ops::read_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::>
 22: (boost::asio::detail::executor_function_base::complete()+0x50) [0x105fda0]
 23: (boost::asio::executor::function::operator()()+0x61) [0x105fd11]
 24: (void boost::asio::asio_handler_invoke<boost::asio::executor::function>(boost::asio::executor::function&, ...)+0x48) [0x105fbf8]
 25: (void boost_asio_handler_invoke_helpers::invoke<boost::asio::executor::function, boost::asio::executor::function>(boost::asio::executor::function&, boost::asio::executor::function&)+0x7>
 26: (boost::asio::detail::executor_op<boost::asio::executor::function, std::allocator<void>, boost::asio::detail::scheduler_operation>::do_complete(void*, boost::asio::detail::scheduler_ope>
 27: (boost::asio::detail::scheduler_operation::complete(void*, boost::system::error_code const&, unsigned long)+0x85) [0x103e1d5]
 28: (boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::executor_type const>::operator()()+0xd8) [0x118f6e8]
 29: (void boost::asio::asio_handler_invoke<boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::executor_type const> >(boost::asio::detail::strand_executor_service
 30: (void boost_asio_handler_invoke_helpers::invoke<boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::executor_type const>, boost::asio::detail::strand_executor>
 31: (void boost::asio::io_context::executor_type::dispatch<boost::asio::detail::strand_executor_service::invoker<boost::asio::io_context::executor_type const>, std::allocator<void> >(boost:>
 32: (void boost::asio::detail::strand_executor_service::dispatch<boost::asio::io_context::executor_type const, boost::asio::executor::function, std::allocator<void> >(std::shared_ptr<boost:>
 33: (void boost::asio::strand<boost::asio::io_context::executor_type>::dispatch<boost::asio::executor::function, std::allocator<void> >(boost::asio::executor::function&&, std::allocator<voi>
 34: (boost::asio::executor::impl<boost::asio::strand<boost::asio::io_context::executor_type>, std::allocator<void> >::dispatch(boost::asio::executor::function&&)+0x6e) [0x118df1e]
 35: (void boost::asio::executor::dispatch<boost::asio::detail::binder2<boost::beast::detail::dynamic_read_ops::read_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::ex>
 36: (void boost::asio::detail::handler_work<boost::beast::detail::dynamic_read_ops::read_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::asio::executor>, boost::beast::flat>
 37: (boost::asio::detail::reactive_socket_recv_op<boost::asio::mutable_buffer, boost::beast::detail::dynamic_read_ops::read_op<boost::asio::basic_stream_socket<boost::asio::ip::tcp, boost::>
 38: (boost::asio::detail::scheduler_operation::complete(void*, boost::system::error_code const&, unsigned long)+0x85) [0x103e1d5]
 39: (boost::asio::detail::scheduler::do_run_one(boost::asio::detail::conditionally_enabled_mutex::scoped_lock&, boost::asio::detail::scheduler_thread_info&, boost::system::error_code const&>
 40: (boost::asio::detail::scheduler::run(boost::system::error_code&)+0x130) [0x103c910]
 41: (boost::asio::io_context::run(boost::system::error_code&)+0x5c) [0x11d825c]
 42: ./bin/radosgw() [0x102cdd6]
 43: ./bin/radosgw() [0x102cd30]
 44: ./bin/radosgw() [0x102cba0]
 45: ./bin/radosgw() [0x102cb18]
 46: ./bin/radosgw() [0x102ca88]
 47: ./bin/radosgw() [0x102c72f]
 48: (()+0xd76f4) [0x7f756fe1b6f4]
 49: (()+0x94e2) [0x7f75702bd4e2]
 50: (clone()+0x43) [0x7f756fb1a693]
2020-01-09T17:47:19.451+0200 7f70e67fc700 -1 *** Caught signal (Aborted) **
 in thread 7f70e67fc700 thread_name:radosgw
Updated by Ken Dreyer about 4 years ago
- Status changed from Resolved to Pending Backport
- Backport set to nautilus
Updated by Nathan Cutler about 4 years ago
- Copied to Backport #43921: nautilus: radosgw abort caused by beast frontend coroutine stack overflow added
Updated by Casey Bodley over 3 years ago
- Has duplicate Bug #47910: radosgw crash on objecter operations added
Updated by Loïc Dachary about 3 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".