Bug #42981

open

run-backend-api-tests.sh: mgr oneshot signal handlers do not revert to killing process

Added by Sage Weil over 4 years ago. Updated about 4 years ago.

Status: New
Priority: High
Assignee: -
Category: ceph-mgr
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

On master, when running the dashboard tests, the test tries to kill the mgr. The first SIGINT triggers shutdown of the Python modules, but that invariably seems to hang due to some deadlock. Subsequent kill signals fail to kill the process.

To reproduce,

cd ../src/pybind/mgr/dashboard
./run-backend-api-tests.sh

and it will eventually time out trying to kill ceph-mgr.

The mgr log will note that the first signal was received,

2019-11-22T11:44:12.259-0600 7fef96fa4700 -1 received  signal: Terminated from python ../qa/tasks/vstart_runner.py --ignore-missing-binaries tasks.mgr.test_dashboard tasks.mgr.dashboard.test_auth tasks.mgr.dashboard.test_cephfs tasks.mgr.dashboard.test_cluster_configuration tasks.mgr.dashboard.test_erasure_code_profile tasks.mgr.dashboard.test_ganesha tasks.mgr.dashboard.test_health tasks.mgr.dashboard.test_host tasks.mgr.dashboard.test_logs tasks.mgr.dashboard.test_mgr_module tasks.mgr.dashboard.test_monitor tasks.mgr.dashboard.test_orchestrator tasks.mgr.dashboard.test_osd tasks.mgr.dashboard.test_perf_counters tasks.mgr.dashboard.test_pool tasks.mgr.dashboard.test_rbd_mirroring tasks.mgr.dashboard.test_rbd tasks.mgr.dashboard.test_requests tasks.mgr.dashboard.test_rgw tasks.mgr.dashboard.test_role tasks.mgr.dashboard.test_settings tasks.mgr.dashboard.test_summary tasks.mgr.dashboard.test_user tasks.mgr.test_module_selftest  (PID: 156456) UID: 1031
2019-11-22T11:44:12.259-0600 7fef96fa4700 -1 mgr handle_mgr_signal  *** Got signal Terminated ***

and shutdown deadlocks (a different bug!), but sending another SIGINT fails to kill the process.

The signals are registered with

  register_async_signal_handler_oneshot(SIGINT, handle_mgr_signal);
  register_async_signal_handler_oneshot(SIGTERM, handle_mgr_signal);

and the oneshot sets the SA_RESETHAND flag,

  act.sa_flags = SA_SIGINFO | (oneshot ? SA_RESETHAND : 0);

I tested that this works correctly with a kludge to ceph_mgr.cc (see attached), but it's not working later on for some reason!


Files

kludge.patch (1.25 KB) Sage Weil, 11/22/2019 08:20 PM

Related issues 1 (0 open, 1 closed)

Related to Dashboard - Bug #42744: mgr/dashboard: Executing the run-backend-api-tests script results in infinite loop (Resolved, Patrick Donnelly)

Actions #1

Updated by Sage Weil over 4 years ago

To clarify: the signal handler is (supposed to be) installed as a one-shot: the first SIGINT/SIGTERM will trigger the handling code, and remove the signal handler, reverting to the default, so that the next SIGINT/SIGTERM just kills the process immediately. I'm not sure why this isn't happening, but what I observe is that a SIGTERM is sent hundreds of times but is seemingly ignored.

Actions #2

Updated by Sage Weil over 4 years ago

  • Related to Bug #42744: mgr/dashboard: Executing the run-backend-api-tests script results in infinite loop added
Actions #3

Updated by Patrick Donnelly over 4 years ago

  • Status changed from 12 to New
Actions #4

Updated by Sebastian Wagner over 4 years ago

  • Project changed from RADOS to mgr
  • Subject changed from mgr oneshot signal handlers do not revert to killing process to run-backend-api-tests.sh: mgr oneshot signal handlers do not revert to killing process
  • Category set to 151
Actions #5

Updated by Lenz Grimmer about 4 years ago

  • Category changed from 151 to ceph-mgr

Not actually a dashboard issue - updating Category accordingly.
