Bug #10531

closed

"exception from mon" in upgrade:giant-x-next-distro-basic-vps run

Added by Yuri Weinstein over 9 years ago. Updated about 9 years ago.

Status:
Rejected
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:

0%

Source:
Q/A
Tags:
Backport:
Regression:
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Logs are in http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-11_17:05:01-upgrade:giant-x-next-distro-basic-vps/698059/

2015-01-13T05:08:25.127 INFO:teuthology.orchestra.run.vpm054:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd deep-scrub osd.6'
2015-01-13T05:08:25.227 INFO:teuthology.orchestra.run.vpm054.stderr:2015-01-13 13:08:25.226171 7f8a5b4cf700  0 -- :/1025157 >> 10.214.138.132:6791/0 pipe(0x21d3120 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x21d33b0).fault
2015-01-13T05:08:28.371 INFO:teuthology.orchestra.run.vpm054.stderr:osd.6 instructed to deep-scrub
2015-01-13T05:08:28.381 INFO:teuthology.orchestra.run.vpm054:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format json'
2015-01-13T05:08:28.722 INFO:teuthology.orchestra.run.vpm054.stderr:dumped all in format json
2015-01-13T05:08:28.821 INFO:teuthology.misc:Shutting down mds daemons...
2015-01-13T05:08:28.822 DEBUG:tasks.ceph.mds.a:waiting for process to exit
2015-01-13T05:08:34.769 INFO:tasks.ceph.mds.a:Stopped
2015-01-13T05:08:34.769 INFO:teuthology.misc:Shutting down osd daemons...
2015-01-13T05:08:34.770 DEBUG:tasks.ceph.osd.11:waiting for process to exit
2015-01-13T05:08:40.769 INFO:tasks.ceph.osd.11:Stopped
2015-01-13T05:08:40.770 DEBUG:tasks.ceph.osd.10:waiting for process to exit
2015-01-13T05:08:46.770 INFO:tasks.ceph.osd.10:Stopped
2015-01-13T05:08:50.508 DEBUG:tasks.ceph.osd.13:waiting for process to exit
2015-01-13T05:08:52.771 INFO:tasks.ceph.osd.13:Stopped
2015-01-13T05:08:52.772 DEBUG:tasks.ceph.osd.12:waiting for process to exit
2015-01-13T05:08:58.772 INFO:tasks.ceph.osd.12:Stopped
2015-01-13T05:08:58.772 DEBUG:tasks.ceph.osd.1:waiting for process to exit
2015-01-13T05:09:04.773 INFO:tasks.ceph.osd.1:Stopped
2015-01-13T05:09:04.773 DEBUG:tasks.ceph.osd.0:waiting for process to exit
2015-01-13T05:09:10.774 INFO:tasks.ceph.osd.0:Stopped
2015-01-13T05:09:14.228 DEBUG:tasks.ceph.osd.3:waiting for process to exit
2015-01-13T05:09:16.775 INFO:tasks.ceph.osd.3:Stopped
2015-01-13T05:09:16.775 DEBUG:tasks.ceph.osd.2:waiting for process to exit
2015-01-13T05:09:22.775 INFO:tasks.ceph.osd.2:Stopped
2015-01-13T05:09:22.776 DEBUG:tasks.ceph.osd.5:waiting for process to exit
2015-01-13T05:09:28.776 INFO:tasks.ceph.osd.5:Stopped
2015-01-13T05:09:28.777 DEBUG:tasks.ceph.osd.4:waiting for process to exit
2015-01-13T05:09:34.776 INFO:tasks.ceph.osd.4:Stopped
2015-01-13T05:09:35.114 DEBUG:tasks.ceph.osd.7:waiting for process to exit
2015-01-13T05:09:40.777 INFO:tasks.ceph.osd.7:Stopped
2015-01-13T05:09:40.778 DEBUG:tasks.ceph.osd.6:waiting for process to exit
2015-01-13T05:09:46.778 INFO:tasks.ceph.osd.6:Stopped
2015-01-13T05:09:46.779 DEBUG:tasks.ceph.osd.9:waiting for process to exit
2015-01-13T05:09:52.779 INFO:tasks.ceph.osd.9:Stopped
2015-01-13T05:09:52.780 DEBUG:tasks.ceph.osd.8:waiting for process to exit
2015-01-13T05:09:58.780 INFO:tasks.ceph.osd.8:Stopped
2015-01-13T05:10:00.202 INFO:teuthology.misc:Shutting down mon daemons...
2015-01-13T05:10:00.202 DEBUG:tasks.ceph.mon.a:waiting for process to exit
2015-01-13T05:10:04.781 INFO:tasks.ceph.mon.a:Stopped
2015-01-13T05:10:04.781 DEBUG:tasks.ceph.mon.c:waiting for process to exit
2015-01-13T05:10:04.781 ERROR:teuthology.misc:Saw exception from mon.c
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1109, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm077 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i c'
Actions #1

Updated by Yuri Weinstein over 9 years ago

Same problem in http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-12_17:05:02-upgrade:giant-x-next-distro-basic-vps/700109/

2015-01-13T17:21:28.043 DEBUG:tasks.ceph.mon.a:waiting for process to exit
2015-01-13T17:21:34.043 INFO:tasks.ceph.mon.a:Stopped
2015-01-13T17:21:34.043 DEBUG:tasks.ceph.mon.c:waiting for process to exit
2015-01-13T17:21:40.044 INFO:tasks.ceph.mon.c:Stopped
2015-01-13T17:21:40.044 DEBUG:tasks.ceph.mon.b:waiting for process to exit
2015-01-13T17:21:40.044 ERROR:teuthology.misc:Saw exception from mon.b
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1109, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm132 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i b'
Actions #2

Updated by Yuri Weinstein over 9 years ago

Looks like a similar issue in this job:

http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-11_19:13:02-upgrade:dumpling-x-firefly-distro-basic-vps/698198/

2015-01-13T11:35:47.331 ERROR:teuthology.misc:Saw exception from mon.a
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1109, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm084 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i a'
Actions #3

Updated by Yuri Weinstein about 9 years ago

Similar issue, but this time with a memory allocation problem.

Run: http://pulpito.ceph.com/teuthology-2015-01-21_17:05:02-upgrade:giant-x-next-distro-basic-vps/

Job: 716357

2015-01-22T05:11:19.959 INFO:tasks.ceph.osd.6.vpm042.stderr:2015-01-22 13:11:19.725139 7f75bfcee700 -1 osd.6 2505 lsb_release_parse - failed to call lsb_release binary with error: (12) Cannot allocate memory
2015-01-22T05:11:19.973 INFO:tasks.ceph.osd.6.vpm042.stderr:2015-01-22 13:11:19.743148 7f75bfcee700 -1 osd.6 2505 lsb_release_parse - failed to call lsb_release binary with error: (12) Cannot allocate memory
2015-01-22T05:11:20.059 INFO:tasks.ceph.osd.6.vpm042.stderr:2015-01-22 13:11:19.978533 7f75bfcee700 -1 osd.6 2505 lsb_release_parse - failed to call lsb_release binary with error: (12) Cannot allocate memory

and then

2015-01-22T05:52:22.594 DEBUG:tasks.ceph.mon.a:waiting for process to exit
2015-01-22T05:52:28.594 INFO:tasks.ceph.mon.a:Stopped
2015-01-22T05:52:28.594 DEBUG:tasks.ceph.mon.c:waiting for process to exit
2015-01-22T05:52:28.595 ERROR:teuthology.misc:Saw exception from mon.c
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1110, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm042 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i c'

Maybe the others are also due to a memory allocation issue?
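One way to check that question against the other jobs is to scan each job's teuthology log for the two signatures quoted in this ticket: the `(12) Cannot allocate memory` message and the `Saw exception from mon.X` error. The snippet below is only a rough sketch: the function name `summarize_log` and the `memory_suspected` heuristic are assumptions made for illustration and are not part of teuthology. A log without ENOMEM lines does not rule out memory pressure, since the daemon may have been killed without logging anything.

```python
import re
from typing import Iterable

# Patterns copied from the log excerpts quoted in this ticket.
ENOMEM_RE = re.compile(r"\(12\) Cannot allocate memory")
MON_EXC_RE = re.compile(r"ERROR:teuthology\.misc:Saw exception from (mon\.\w+)")


def summarize_log(lines: Iterable[str]) -> dict:
    """Count ENOMEM messages and collect the mons whose shutdown raised.

    Hypothetical helper: the name and return shape are illustrative only.
    """
    enomem_count = 0
    failed_mons = []
    for line in lines:
        if ENOMEM_RE.search(line):
            enomem_count += 1
        match = MON_EXC_RE.search(line)
        if match:
            failed_mons.append(match.group(1))
    return {
        "enomem_count": enomem_count,
        "failed_mons": failed_mons,
        # Heuristic only: both signatures appear somewhere in the same log.
        "memory_suspected": enomem_count > 0 and bool(failed_mons),
    }
```

To use it, download a job's `teuthology.log` from the archive directory linked above and pass the open file object to `summarize_log`; it iterates line by line, so large logs are not loaded into memory at once.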

Actions #4

Updated by Sage Weil about 9 years ago

  • Status changed from New to Rejected

I think these are all memory-related. We probably need to reduce the VPS count per machine :(
