Bug #48293
notification: radosgw-admin hangs on while closing
% Done:
0%
Source:
Development
Tags:
notification
Backport:
octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Description
when the "data sync run" command is executed on a pubsub zone, the code creates the rabbitmq manager (as part of initializing the RGWPSSyncModule class).
when the code exits, the dtor of RGWPSSyncModule deletes the rabbitmq manager.
this manager joins its worker thread [1], and since that thread is stuck in "sleep", the dtor never finishes:
8 Thread 0x7f93477f6640 (LWP 1006756) "amqp_manager" 0x00007f93de616a31 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6
9 Thread 0x7f9346ff5640 (LWP 1006757) "kafka_manager" 0x00007f93de616a31 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6

#0 0x00007f93deca79d7 in __pthread_clockjoin_ex () from /lib64/libpthread.so.0
#1 0x00007f93de80b897 in std::thread::join() () from /lib64/libstdc++.so.6
#2 0x000055f9e0491302 in rgw::amqp::Manager::~Manager (this=0x55f9e0fa0350, __in_chrg=<optimized out>) at /home/cbodley/ceph/src/rgw/rgw_amqp.cc:948
#3 0x000055f9e048bfb2 in rgw::amqp::shutdown () at /home/cbodley/ceph/src/rgw/rgw_amqp.cc:1001
#4 0x000055f9e0072b35 in RGWPSSyncModuleInstance::~RGWPSSyncModuleInstance (this=0x55f9e10ee120, __in_chrg=<optimized out>) at /home/cbodley/ceph/src/rgw/rgw_sync_module_pubsub.cc:1400

notes:
- kafka has the same flow, and most likely has the same issue
- not clear if this happens only in radosgw-admin or in radosgw as well. in the case of radosgw-admin there is actually no need to create the amqp (and kafka) managers
- not clear why the "sleep()" call does not return.
[1] https://github.com/ceph/ceph/blob/master/src/rgw/rgw_amqp.cc#L946
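for illustration only: a minimal sketch of one common way to keep a dtor from blocking on a sleeping worker thread, by replacing the plain sleep with a timed wait on a condition variable that the dtor can interrupt. this is not the actual rgw::amqp::Manager code nor the fix that was merged; the class name, members and idle interval below are all hypothetical.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a manager that owns a polling worker thread.
// Not the real rgw::amqp::Manager.
class Manager {
  std::mutex mtx;
  std::condition_variable cv;
  bool stopped = false;
  std::chrono::milliseconds idle_time;
  std::thread runner;  // declared last so it starts after the other members exist

  void run() {
    std::unique_lock<std::mutex> lock(mtx);
    while (!stopped) {
      // ... poll connections and process pending work here ...

      // Interruptible idle: returns when idle_time elapses
      // or as soon as the dtor sets `stopped` and notifies.
      cv.wait_for(lock, idle_time, [this] { return stopped; });
    }
  }

public:
  explicit Manager(std::chrono::milliseconds idle)
    : idle_time(idle), runner(&Manager::run, this) {}

  ~Manager() {
    {
      std::lock_guard<std::mutex> lock(mtx);
      stopped = true;
    }
    cv.notify_all();  // wake the worker if it is idling
    runner.join();    // the worker leaves its loop promptly, so join() returns
  }
};
```

with this shape, destroying the manager wakes the worker right away instead of waiting for the idle interval to expire, so the join in the dtor does not depend on the sleep returning.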
Updated by Yuval Lifshitz over 3 years ago
- Status changed from New to Fix Under Review
Updated by Yuval Lifshitz over 3 years ago
- Status changed from Fix Under Review to Pending Backport
- Backport set to octopus
Updated by Backport Bot over 3 years ago
- Copied to Backport #48659: octopus: notification: radosgw-admin hangs on while closing added
Updated by Loïc Dachary over 2 years ago
- Status changed from Pending Backport to Resolved
While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".