Bug #48293

closed

notification: radosgw-admin hangs on while closing

Added by Yuval Lifshitz over 3 years ago. Updated over 2 years ago.

Status:
Resolved
Priority:
Normal
Target version:
-
% Done:

0%

Source:
Development
Tags:
notification
Backport:
octopus
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

when the "data sync run" command is executed on a pubsub zone, the code creates the rabbitmq manager (as part of initializing the RGWPSSyncModule class).
when the code exits, the dtor of RGWPSSyncModule deletes the rabbitmq manager.
this manager joins on its worker thread [1], and since that thread is stuck in "sleep" the dtor never finishes:

8    Thread 0x7f93477f6640 (LWP 1006756) "amqp_manager"    0x00007f93de616a31 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6
9    Thread 0x7f9346ff5640 (LWP 1006757) "kafka_manager"   0x00007f93de616a31 in clock_nanosleep@GLIBC_2.2.5 () from /lib64/libc.so.6
#0  0x00007f93deca79d7 in __pthread_clockjoin_ex () from /lib64/libpthread.so.0
#1  0x00007f93de80b897 in std::thread::join() () from /lib64/libstdc++.so.6
#2  0x000055f9e0491302 in rgw::amqp::Manager::~Manager (this=0x55f9e0fa0350, __in_chrg=<optimized out>) at /home/cbodley/ceph/src/rgw/rgw_amqp.cc:948
#3  0x000055f9e048bfb2 in rgw::amqp::shutdown () at /home/cbodley/ceph/src/rgw/rgw_amqp.cc:1001
#4  0x000055f9e0072b35 in RGWPSSyncModuleInstance::~RGWPSSyncModuleInstance (this=0x55f9e10ee120, __in_chrg=<optimized out>) at /home/cbodley/ceph/src/rgw/rgw_sync_module_pubsub.cc:1400

notes:
  • kafka has the same flow, and most likely has the same issue
  • not clear if this happens only in radosgw-admin or in radosgw as well. in the case of radosgw-admin there is actually no need to create the amqp (and kafka) managers
  • not clear why the "sleep()" call does not return.

[1] https://github.com/ceph/ceph/blob/master/src/rgw/rgw_amqp.cc#L946
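
The flow described above (a destructor that joins a worker thread) can only finish if the worker notices the stop request. The following is a minimal sketch of one common way to make such a join reliable. It is not the Ceph code and not necessarily the fix applied for this issue: the class name SketchManager, its members, and the helper shutdown_latency_ms are all hypothetical. The sketch replaces a plain sleep with a timed wait on a condition variable, so the destructor can wake the worker immediately instead of waiting for the sleep to end.

```cpp
#include <atomic>
#include <chrono>
#include <condition_variable>
#include <memory>
#include <mutex>
#include <thread>

// Hypothetical stand-in for a manager that owns one worker thread.
// None of these names are taken from rgw_amqp.cc.
class SketchManager {
  std::mutex mtx;
  std::condition_variable cv;
  bool stopped = false;                 // guarded by mtx
  std::atomic<int> iterations{0};
  const std::chrono::milliseconds idle_time;
  std::thread runner;                   // declared last: starts after all other members exist

  void run() {
    std::unique_lock<std::mutex> lock(mtx);
    while (!stopped) {
      ++iterations;                     // placeholder for the real work
      // Interruptible idle period: returns early once `stopped` is set.
      cv.wait_for(lock, idle_time, [this] { return stopped; });
    }
  }

public:
  explicit SketchManager(std::chrono::milliseconds idle)
    : idle_time(idle), runner(&SketchManager::run, this) {}

  ~SketchManager() {
    {
      std::lock_guard<std::mutex> lock(mtx);
      stopped = true;                   // request shutdown
    }
    cv.notify_all();                    // wake the worker if it is idle
    runner.join();                      // returns promptly once the worker exits
  }

  int iteration_count() const { return iterations.load(); }
};

// Measures how long destroying a manager takes, in milliseconds.
long long shutdown_latency_ms(std::chrono::milliseconds idle) {
  auto mgr = std::make_unique<SketchManager>(idle);
  // Give the worker time to reach its idle wait.
  std::this_thread::sleep_for(std::chrono::milliseconds(50));
  const auto start = std::chrono::steady_clock::now();
  mgr.reset();                          // runs the destructor: stop, notify, join
  const auto end = std::chrono::steady_clock::now();
  return std::chrono::duration_cast<std::chrono::milliseconds>(end - start).count();
}
```

In this sketch the time spent in the destructor does not depend on the idle period: even with an idle period of several seconds, the join completes as soon as the worker is notified.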


Related issues 1 (0 open, 1 closed)

Copied to rgw - Backport #48659: octopus: notification: radosgw-admin hangs on while closing (Resolved, singuliere _)
Actions #1

Updated by Yuval Lifshitz over 3 years ago

  • Assignee set to Yuval Lifshitz
Actions #2

Updated by Yuval Lifshitz over 3 years ago

  • Pull request ID set to 38190
Actions #3

Updated by Yuval Lifshitz over 3 years ago

  • Status changed from New to Fix Under Review
Actions #4

Updated by Yuval Lifshitz over 3 years ago

  • Status changed from Fix Under Review to Pending Backport
  • Backport set to octopus
Actions #5

Updated by Backport Bot over 3 years ago

  • Copied to Backport #48659: octopus: notification: radosgw-admin hangs on while closing added
Actions #6

Updated by Loïc Dachary over 2 years ago

  • Status changed from Pending Backport to Resolved

While running with --resolve-parent, the script "backport-create-issue" noticed that all backports of this issue are in status "Resolved" or "Rejected".
