Bug #63528
rgw recursive delete deadlock
Status: Closed
% Done: 0%
Description
Recursive bucket delete operations deadlock when using rgw_multi_obj_del_max_aio > 1.
This is most likely caused by other blocking calls that do not use a yield context.
https://github.com/ceph/ceph/blob/5dd24139a1eada541a3bc16b6941c5dde975e26d/src/rgw/rgw_op.cc#L7188
RGWDataChangesLog::add_entry calls seem to hang here: https://github.com/ceph/ceph/blob/5dd24139a1eada541a3bc16b6941c5dde975e26d/src/rgw/driver/rados/rgw_datalog.cc#L723
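The failure mode can be illustrated outside of RGW. Below is a minimal Python asyncio analogy (not Ceph code, and the names are illustrative only): a coroutine performs a blocking wait, in the style of RefCountedCond::wait, for a completion that can only be delivered by the event loop thread it is blocking, so the completion never arrives. A timeout is used so the sketch observably stalls instead of hanging forever.

```python
import asyncio
import threading

async def delete_one_object():
    loop = asyncio.get_running_loop()
    done = threading.Event()

    # The "librados completion" is scheduled on the same event loop
    # that is running this coroutine...
    loop.call_later(0.1, done.set)

    # ...but this *blocking* wait (the analogue of RefCountedCond::wait,
    # which ignores the yield context) stops the loop thread, so the
    # callback scheduled above can never fire.
    return done.wait(timeout=0.5)  # blocks the loop thread itself

result = asyncio.run(delete_one_object())
print(result)  # False: the completion never ran while we blocked
```

In the RGW case the same shape appears: handle_individual_object runs as a spawned coroutine, and RGWDataChangesLog::add_entry blocks that coroutine's thread on a condition variable instead of suspending through the optional_yield context, so the work needed to signal it cannot make progress.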
Given that, it may be worth changing the default value of rgw_multi_obj_del_max_aio to 1 and perhaps adding a warning as well, since addressing all of the blocking RGW operations may take a while.
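Until the blocking calls are converted, a possible mitigation is to force serial multi-object deletes via the option mentioned above. This is a sketch assuming a standard `ceph config` deployment; the `client.rgw` section name may need adjusting to match your RGW instances:

```shell
# Workaround sketch: with one in-flight delete per request, the spawned
# coroutine never blocks while other deletes are queued behind it.
ceph config set client.rgw rgw_multi_obj_del_max_aio 1
# Restart the RGW daemons afterwards if the change does not apply live.
```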
Trace:
Thread 577 (Thread 0x7f48fbe99640 (LWP 415192) "radosgw"):
#0  __futex_abstimed_wait_common64 (private=0, cancel=true, abstime=0x0, op=393, expected=0, futex_word=0x55f0fa598078) at ./nptl/futex-internal.c:57
#1  __futex_abstimed_wait_common (cancel=true, private=0, abstime=0x0, clockid=0, expected=0, futex_word=0x55f0fa598078) at ./nptl/futex-internal.c:87
#2  __GI___futex_abstimed_wait_cancelable64 (futex_word=futex_word@entry=0x55f0fa598078, expected=expected@entry=0, clockid=clockid@entry=0, abstime=abstime@entry=0x0, private=private@entry=0) at ./nptl/futex-internal.c:139
#3  0x00007f4a1e318a41 in __pthread_cond_wait_common (abstime=0x0, clockid=0, mutex=0x55f0fa598028, cond=0x55f0fa598050) at ./nptl/pthread_cond_wait.c:503
#4  ___pthread_cond_wait (cond=0x55f0fa598050, mutex=0x55f0fa598028) at ./nptl/pthread_cond_wait.c:627
#5  0x000055f0f3ab95fd in ceph::common::RefCountedCond::wait (this=this@entry=0x55f0fa598000) at /mnt/data/workspace/ceph_linux_reef/src/common/RefCountedObj.h:120
#6  0x000055f0f3aaca9c in RGWDataChangesLog::add_entry (this=0x55f0f67c1400, dpp=dpp@entry=0x55f0f9c1e900, bucket_info=..., gen=..., shard_id=0, y=...) at /mnt/data/workspace/ceph_linux_reef/src/rgw/driver/rados/rgw_datalog.cc:721
#7  0x000055f0f37e9715 in add_datalog_entry (dpp=0x55f0f9c1e900, datalog=<optimized out>, bucket_info=..., shard_id=<optimized out>, y=...) at /mnt/data/workspace/ceph_linux_reef/src/rgw/driver/rados/rgw_rados.cc:722
#8  0x000055f0f37fc968 in RGWRados::Bucket::UpdateIndex::complete_del (this=0x55f0fb09f0f0, dpp=0x55f0f9c1e900, poolid=16, epoch=0, removed_mtime=..., remove_objs=0x0, y=...) at /mnt/data/workspace/ceph_linux_reef/src/rgw/driver/rados/rgw_rados.cc:6374
#9  0x000055f0f381af21 in RGWRados::Object::Delete::delete_obj (this=this@entry=0x55f0fb1e0a80, y=..., dpp=dpp@entry=0x55f0f9c1e900) at /mnt/data/workspace/ceph_linux_reef/src/rgw/driver/rados/rgw_rados.cc:5347
#10 0x000055f0f38711b2 in rgw::sal::RadosObject::RadosDeleteOp::delete_obj (this=0x55f0fb1e0000, dpp=0x55f0f9c1e900, y=...) at /mnt/data/workspace/ceph_linux_reef/src/rgw/driver/rados/rgw_sal_rados.cc:2273
#11 0x000055f0f35affec in RGWDeleteMultiObj::handle_individual_object (this=0x55f0f9c1e900, o=..., y=..., formatter_flush_cond=0x7f48e6cea520) at /mnt/data/workspace/ceph_linux_reef/src/rgw/rgw_op.cc:7095
#12 0x000055f0f35b0c51 in operator() (yield=..., __closure=0x55f0fb00a5e8) at /mnt/data/workspace/ceph_linux_reef/src/common/async/yield_context.h:40
#13 operator() (c=..., __closure=<optimized out>) at /mnt/data/workspace/ceph_linux_reef/src/spawn/include/spawn/impl/spawn.hpp:390
#14 std::__invoke_impl<boost::context::continuation, spawn::detail::spawn_helper<boost::asio::executor_binder<void (*)(), boost::asio::strand<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> > >, RGWDeleteMultiObj::execute(optional_yield)::<lambda(yield_context)>, boost::context::basic_fixedsize_stack<boost::context::stack_traits> >::operator()()::<lambda(boost::context::continuation&&)>&, boost::context::continuation> (__f=...) at /usr/include/c++/11/bits/invoke.h:61
#15 std::__invoke<spawn::detail::spawn_helper<boost::asio::executor_binder<void (*)(), boost::asio::strand<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> > >, RGWDeleteMultiObj::execute(optional_yield)::<lambda(yield_context)>, boost::context::basic_fixedsize_stack<boost::context::stack_traits> >::operator()()::<lambda(boost::context::continuation&&)>&, boost::context::continuation> (__fn=...) at /usr/include/c++/11/bits/invoke.h:97
#16 std::invoke<spawn::detail::spawn_helper<boost::asio::executor_binder<void (*)(), boost::asio::strand<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> > >, RGWDeleteMultiObj::execute(optional_yield)::<lambda(yield_context)>, boost::context::basic_fixedsize_stack<boost::context::stack_traits> >::operator()()::<lambda(boost::context::continuation&&)>&, boost::context::continuation> (__fn=...) at /usr/include/c++/11/functional:98
#17 boost::context::detail::record<boost::context::continuation, boost::context::basic_fixedsize_stack<boost::context::stack_traits>, spawn::detail::spawn_helper<boost::asio::executor_binder<void (*)(), boost::asio::strand<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> > >, RGWDeleteMultiObj::execute(optional_yield)::<lambda(yield_context)>, boost::context::basic_fixedsize_stack<boost::context::stack_traits> >::operator()()::<lambda(boost::context::continuation&&)> >::run (fctx=<optimized out>, this=<optimized out>) at /mnt/data/workspace/ceph_linux_reef/build/boost/include/boost/context/continuation_fcontext.hpp:143
#18 boost::context::detail::context_entry<boost::context::detail::record<boost::context::continuation, boost::context::basic_fixedsize_stack<boost::context::stack_traits>, spawn::detail::spawn_helper<boost::asio::executor_binder<void (*)(), boost::asio::strand<boost::asio::io_context::basic_executor_type<std::allocator<void>, 0> > >, RGWDeleteMultiObj::execute(optional_yield)::<lambda(yield_context)>, boost::context::basic_fixedsize_stack<boost::context::stack_traits> >::operator()()::<lambda(boost::context::continuation&&)> > >(boost::context::detail::transfer_t) (t=...) at /mnt/data/workspace/ceph_linux_reef/build/boost/include/boost/context/continuation_fcontext.hpp:80
#19 0x000055f0f409867f in make_fcontext ()
#20 0x0000000000000000 in ?? ()
Updated by Casey Bodley 6 months ago
- Is duplicate of Bug #63373: multisite: Deadlock in RGWDeleteMultiObj with default rgw_multi_obj_del_max_aio > 1 added
Updated by Enrico Bocchi 6 months ago
Do you see this happening in a multi-site RGW setup or also with simpler single-zone clusters?
We are seeing the same problem on Reef 18.2.0 in multisite: one zonegroup, two zones. Interestingly, RGW seems to deadlock on deletions even when all other RGWs are shut off.
I was unable to reproduce on Reef with no multisite configuration.
Updated by Lucian Petrut 6 months ago
Enrico Bocchi wrote:
> Do you see this happening in a multi-site RGW setup or also with simpler single-zone clusters?
> We are seeing the same problem on Reef 18.2.0 in multisite: one zonegroup, two zones. Interestingly, RGW seems to deadlock on deletions even when all other RGWs are shut off.
> I was unable to reproduce on Reef with no multisite configuration.
We have the same setup and couldn't reproduce the issue without multisite either. I also triggered it by deleting a bucket on the sync destination, at which point the syncing RGW service started flooding the primary RGW zone.