Bug #16803

DaemonState failed to stop a mon (didn't send signal to mon.c?)

Added by Kefu Chai over 7 years ago. Updated almost 7 years ago.

Status: New
Priority: Normal
Assignee: -
Category: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite: rados
Crash signature (v1):
Crash signature (v2):

Description

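See http://pulpito.ceph.com/kchai-2016-07-24_21:25:48-rados-wip-16801---basic-mira/

In the teuthology log, mon.a receives the signal and stops, but the wait for mon.c never completes: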

2016-07-24T22:09:00.677 INFO:teuthology.misc:Shutting down mon daemons...
2016-07-24T22:09:00.678 DEBUG:tasks.ceph.mon.a:waiting for process to exit
2016-07-24T22:09:00.678 INFO:teuthology.orchestra.run:waiting for 300
2016-07-24T22:09:00.727 INFO:tasks.ceph.mon.a.mira059.stderr:2016-07-25 05:09:00.700697 11e63700 -1 received  signal: Terminated from  PID: 19241 task name: /usr/bin/python UID: 0
2016-07-24T22:09:00.727 INFO:tasks.ceph.mon.a.mira059.stderr:2016-07-25 05:09:00.704260 11e63700 -1 mon.a@1(peon) e1 *** Got Signal Terminated ***
2016-07-24T22:09:06.678 INFO:tasks.ceph.mon.a:Stopped
2016-07-24T22:09:06.679 DEBUG:tasks.ceph.mon.c:waiting for process to exit
2016-07-24T22:09:06.679 INFO:teuthology.orchestra.run:waiting for 300
2016-07-24T22:14:00.717 INFO:tasks.ceph:Checking cluster log for badness...

2016-07-24T22:14:54.726 ERROR:teuthology.run_tasks:Manager failed: ceph
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/run_tasks.py", line 139, in run_tasks
    suppress = manager.__exit__(*exc_info)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_master/tasks/ceph.py", line 1504, in task
    osd_scrub_pgs(ctx, config)
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 46, in nested
    if exit(*exc):
  File "/usr/lib/python2.7/contextlib.py", line 24, in __exit__
    self.gen.next()
  File "/var/lib/teuthworker/src/ceph-qa-suite_master/tasks/ceph.py", line 1079, in run_daemon
    teuthology.stop_daemons_of_type(ctx, type_, cluster_name)
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1206, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 46, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 413, in wait
    check_time()
  File "/home/teuthworker/src/teuthology_master/teuthology/contextutil.py", line 132, in __call__
    raise MaxWhileTries(error_msg)
MaxWhileTries: reached maximum tries (50) after waiting for 300 seconds
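
The "50 tries after 300 seconds" in the error works out to one poll every 6 seconds, which is consistent with mon.a being reported as stopped exactly 6 seconds after its wait began. The following is a minimal sketch of such a bounded polling wait, not teuthology's actual code: `wait_for_exit`, `WaitTimeout`, and the way the interval is derived from `timeout` and `tries` are all assumptions made for illustration.

```python
import time


class WaitTimeout(Exception):
    """Hypothetical stand-in for the MaxWhileTries error in the traceback."""


def wait_for_exit(proc, timeout=300, tries=50, sleep=time.sleep):
    """Poll `proc` until it exits; give up after `tries` polls.

    `proc` only needs a poll() method and a returncode attribute,
    as subprocess.Popen objects have.
    """
    interval = timeout / tries  # assumed: 300 / 50 = 6 seconds per poll
    for _ in range(tries):
        if proc.poll() is not None:
            return proc.returncode
        sleep(interval)
    raise WaitTimeout(
        "reached maximum tries (%d) after waiting for %d seconds"
        % (tries, timeout))
```

A wait like this only observes that the process is still running. It cannot tell whether the process ignored the signal or never received one, which is why the monitor logs have to be checked separately.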

In remote/mira059/log/ceph-mon.a.log.gz, mon.a logs the signal and shuts down:

2016-07-25 05:09:00.700697 11e63700 -1 received  signal: Terminated from  PID: 19241 task name: /usr/bin/python UID: 0
2016-07-25 05:09:00.704260 11e63700 -1 mon.a@1(peon) e1 *** Got Signal Terminated ***
2016-07-25 05:09:00.706205 11e63700  1 mon.a@1(peon) e1 shutdown

$ zgrep Terminated remote/mira059/log/ceph-mon.c.log.gz

The zgrep on the mon.c log prints no matches, so mon.c apparently never received the signal. I think this might be a bug in teuthology.
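
The zgrep check above can be repeated across several monitor logs with a short script. This is only a sketch: `received_sigterm` is a hypothetical helper, and it assumes that a monitor which received the signal logs a line containing "Terminated", as mon.a does in the excerpt above.

```python
import gzip


def received_sigterm(log_path):
    """Return True if the gzipped log contains a line mentioning 'Terminated'."""
    # errors="replace" keeps the scan going if the log has undecodable bytes.
    with gzip.open(log_path, "rt", errors="replace") as log:
        return any("Terminated" in line for line in log)
```

For example, calling it on remote/mira059/log/ceph-mon.a.log.gz and remote/mira059/log/ceph-mon.c.log.gz from the archived run would show which of the two monitors logged the signal.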

History

#2 Updated by Josh Durgin almost 7 years ago

  • Project changed from Ceph to teuthology
  • Subject changed from DaemonState failed to stop a mon to DaemonState failed to stop a mon (didn't send signal to mon.c?)
  • Category deleted (teuthology)
