Bug #42981

open

run-backend-api-tests.sh: mgr oneshot signal handlers do not revert to killing process

Added by Sage Weil over 4 years ago. Updated about 4 years ago.

Status: New
Priority: High
Assignee: -
Category: ceph-mgr
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On master, when running the dashboard tests, the test tries to kill the mgr. The first SIGINT triggers shutdown of the python modules, but that invariably seems to hang due to some deadlock. Subsequent kill signals fail to kill the process.

To reproduce,

cd ../src/pybind/mgr/dashboard
./run-backend-api-tests.sh

and it will eventually time out trying to kill ceph-mgr.

The mgr log will note that the first signal was received,

2019-11-22T11:44:12.259-0600 7fef96fa4700 -1 received  signal: Terminated from python ../qa/tasks/vstart_runner.py --ignore-missing-binaries tasks.mgr.test_dashboard tasks.mgr.dashboard.test_auth tasks.mgr.dashboard.test_cephfs tasks.mgr.dashboard.test_cluster_configuration tasks.mgr.dashboard.test_erasure_code_profile tasks.mgr.dashboard.test_ganesha tasks.mgr.dashboard.test_health tasks.mgr.dashboard.test_host tasks.mgr.dashboard.test_logs tasks.mgr.dashboard.test_mgr_module tasks.mgr.dashboard.test_monitor tasks.mgr.dashboard.test_orchestrator tasks.mgr.dashboard.test_osd tasks.mgr.dashboard.test_perf_counters tasks.mgr.dashboard.test_pool tasks.mgr.dashboard.test_rbd_mirroring tasks.mgr.dashboard.test_rbd tasks.mgr.dashboard.test_requests tasks.mgr.dashboard.test_rgw tasks.mgr.dashboard.test_role tasks.mgr.dashboard.test_settings tasks.mgr.dashboard.test_summary tasks.mgr.dashboard.test_user tasks.mgr.test_module_selftest  (PID: 156456) UID: 1031
2019-11-22T11:44:12.259-0600 7fef96fa4700 -1 mgr handle_mgr_signal  *** Got signal Terminated ***

and shutdown deadlocks (a different bug!), but sending another SIGINT fails to kill the process.

The signals are registered with

  register_async_signal_handler_oneshot(SIGINT, handle_mgr_signal);
  register_async_signal_handler_oneshot(SIGTERM, handle_mgr_signal);

and the oneshot sets the SA_RESETHAND flag,
  act.sa_flags = SA_SIGINFO | (oneshot ? SA_RESETHAND : 0);
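
For reference, the behaviour expected from SA_RESETHAND can be sketched in isolation. The snippet below is not Ceph code: install_handler, on_signal, has_default_disposition and second_signal_terminates_child are hypothetical names made up for illustration, and the handler only counts deliveries instead of starting a shutdown. It assumes a POSIX system and a single-threaded caller.

```cpp
#include <cassert>
#include <csignal>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Counts handler invocations; sig_atomic_t keeps the update signal-safe.
static volatile sig_atomic_t handler_calls = 0;

// Hypothetical stand-in for handle_mgr_signal: it only records the delivery.
static void on_signal(int, siginfo_t*, void*) {
  handler_calls = handler_calls + 1;
}

// Install on_signal with the same flag expression quoted in the report.
bool install_handler(int signum, bool oneshot) {
  struct sigaction act{};
  act.sa_sigaction = on_signal;
  sigemptyset(&act.sa_mask);
  act.sa_flags = SA_SIGINFO | (oneshot ? SA_RESETHAND : 0);
  return sigaction(signum, &act, nullptr) == 0;
}

// True when the current disposition of signum is the default action.
bool has_default_disposition(int signum) {
  struct sigaction cur{};
  int rc = sigaction(signum, nullptr, &cur);
  assert(rc == 0);  // querying a valid signal number should not fail
  (void)rc;
  return cur.sa_handler == SIG_DFL;
}

// Fork a child that installs a oneshot SIGTERM handler and raises SIGTERM
// twice. Returns the signal that terminated the child, 0 if it exited
// normally, or -1 on error.
int second_signal_terminates_child() {
  pid_t pid = fork();
  if (pid < 0) return -1;
  if (pid == 0) {
    handler_calls = 0;
    if (!install_handler(SIGTERM, true)) _exit(2);
    raise(SIGTERM);                     // first delivery: handler runs
    if (handler_calls != 1) _exit(3);
    raise(SIGTERM);                     // second delivery: default action
    _exit(0);                           // only reached if the reset failed
  }
  int status = 0;
  if (waitpid(pid, &status, 0) != pid) return -1;
  return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

If the flag behaves as documented, second_signal_terminates_child() returns SIGTERM: the first signal reaches the handler and the second one kills the child. A process that survives the second signal suggests the handler was installed again or the signal is being intercepted elsewhere.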

I tested that this works correctly with a kludge to ceph_mgr.cc (see attached), but it's not working later on for some reason!
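
One way to narrow down the "later on" part might be to check whether the hung process still has a handler installed for SIGINT and SIGTERM. On Linux, /proc/<pid>/status exposes a SigCgt line holding a hex bitmask of caught signals, where bit N-1 corresponds to signal N. The sketch below is a diagnostic aid under that assumption, not part of Ceph; parse_sigcgt, mask_has_signal and is_signal_caught are hypothetical helper names.

```cpp
#include <cassert>
#include <csignal>
#include <cstdint>
#include <fstream>
#include <optional>
#include <sstream>
#include <string>
#include <sys/types.h>

// Extract the SigCgt bitmask from the text of /proc/<pid>/status.
std::optional<uint64_t> parse_sigcgt(const std::string& status_text) {
  std::istringstream in(status_text);
  std::string line;
  while (std::getline(in, line)) {
    if (line.rfind("SigCgt:", 0) == 0) {
      // The value is a hex number preceded by whitespace.
      return std::stoull(line.substr(7), nullptr, 16);
    }
  }
  return std::nullopt;
}

// Bit (signum - 1) of the mask is set when a handler is installed.
bool mask_has_signal(uint64_t mask, int signum) {
  assert(signum >= 1 && signum <= 64);
  return ((mask >> (signum - 1)) & 1u) != 0;
}

// Linux only: report whether pid currently catches signum.
// Returns nullopt if the status file cannot be read or parsed.
std::optional<bool> is_signal_caught(pid_t pid, int signum) {
  std::ifstream f("/proc/" + std::to_string(pid) + "/status");
  if (!f) return std::nullopt;
  std::stringstream buf;
  buf << f.rdbuf();
  auto mask = parse_sigcgt(buf.str());
  if (!mask) return std::nullopt;
  return mask_has_signal(*mask, signum);
}
```

If is_signal_caught(pid, SIGINT) still reports true after the first signal has been handled, a handler is installed again at that point, which would be consistent with the second signal not killing the process.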


Files

kludge.patch (1.25 KB) Sage Weil, 11/22/2019 08:20 PM

Related issues 1 (0 open, 1 closed)

Related to Dashboard - Bug #42744: mgr/dashboard: Executing the run-backend-api-tests script results in infinite loop (Resolved, Patrick Donnelly)
