Bug #10531
"exception from mon" in upgrade:giant-x-next-distro-basic-vps run
Status: Rejected
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Severity: 3 - minor
Description
2015-01-13T05:08:25.127 INFO:teuthology.orchestra.run.vpm054:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph osd deep-scrub osd.6'
2015-01-13T05:08:25.227 INFO:teuthology.orchestra.run.vpm054.stderr:2015-01-13 13:08:25.226171 7f8a5b4cf700 0 -- :/1025157 >> 10.214.138.132:6791/0 pipe(0x21d3120 sd=7 :0 s=1 pgs=0 cs=0 l=1 c=0x21d33b0).fault
2015-01-13T05:08:28.371 INFO:teuthology.orchestra.run.vpm054.stderr:osd.6 instructed to deep-scrub
2015-01-13T05:08:28.381 INFO:teuthology.orchestra.run.vpm054:Running: 'adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage ceph pg dump --format json'
2015-01-13T05:08:28.722 INFO:teuthology.orchestra.run.vpm054.stderr:dumped all in format json
2015-01-13T05:08:28.821 INFO:teuthology.misc:Shutting down mds daemons...
2015-01-13T05:08:28.822 DEBUG:tasks.ceph.mds.a:waiting for process to exit
2015-01-13T05:08:34.769 INFO:tasks.ceph.mds.a:Stopped
2015-01-13T05:08:34.769 INFO:teuthology.misc:Shutting down osd daemons...
2015-01-13T05:08:34.770 DEBUG:tasks.ceph.osd.11:waiting for process to exit
2015-01-13T05:08:40.769 INFO:tasks.ceph.osd.11:Stopped
2015-01-13T05:08:40.770 DEBUG:tasks.ceph.osd.10:waiting for process to exit
2015-01-13T05:08:46.770 INFO:tasks.ceph.osd.10:Stopped
2015-01-13T05:08:50.508 DEBUG:tasks.ceph.osd.13:waiting for process to exit
2015-01-13T05:08:52.771 INFO:tasks.ceph.osd.13:Stopped
2015-01-13T05:08:52.772 DEBUG:tasks.ceph.osd.12:waiting for process to exit
2015-01-13T05:08:58.772 INFO:tasks.ceph.osd.12:Stopped
2015-01-13T05:08:58.772 DEBUG:tasks.ceph.osd.1:waiting for process to exit
2015-01-13T05:09:04.773 INFO:tasks.ceph.osd.1:Stopped
2015-01-13T05:09:04.773 DEBUG:tasks.ceph.osd.0:waiting for process to exit
2015-01-13T05:09:10.774 INFO:tasks.ceph.osd.0:Stopped
2015-01-13T05:09:14.228 DEBUG:tasks.ceph.osd.3:waiting for process to exit
2015-01-13T05:09:16.775 INFO:tasks.ceph.osd.3:Stopped
2015-01-13T05:09:16.775 DEBUG:tasks.ceph.osd.2:waiting for process to exit
2015-01-13T05:09:22.775 INFO:tasks.ceph.osd.2:Stopped
2015-01-13T05:09:22.776 DEBUG:tasks.ceph.osd.5:waiting for process to exit
2015-01-13T05:09:28.776 INFO:tasks.ceph.osd.5:Stopped
2015-01-13T05:09:28.777 DEBUG:tasks.ceph.osd.4:waiting for process to exit
2015-01-13T05:09:34.776 INFO:tasks.ceph.osd.4:Stopped
2015-01-13T05:09:35.114 DEBUG:tasks.ceph.osd.7:waiting for process to exit
2015-01-13T05:09:40.777 INFO:tasks.ceph.osd.7:Stopped
2015-01-13T05:09:40.778 DEBUG:tasks.ceph.osd.6:waiting for process to exit
2015-01-13T05:09:46.778 INFO:tasks.ceph.osd.6:Stopped
2015-01-13T05:09:46.779 DEBUG:tasks.ceph.osd.9:waiting for process to exit
2015-01-13T05:09:52.779 INFO:tasks.ceph.osd.9:Stopped
2015-01-13T05:09:52.780 DEBUG:tasks.ceph.osd.8:waiting for process to exit
2015-01-13T05:09:58.780 INFO:tasks.ceph.osd.8:Stopped
2015-01-13T05:10:00.202 INFO:teuthology.misc:Shutting down mon daemons...
2015-01-13T05:10:00.202 DEBUG:tasks.ceph.mon.a:waiting for process to exit
2015-01-13T05:10:04.781 INFO:tasks.ceph.mon.a:Stopped
2015-01-13T05:10:04.781 DEBUG:tasks.ceph.mon.c:waiting for process to exit
2015-01-13T05:10:04.781 ERROR:teuthology.misc:Saw exception from mon.c
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1109, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm077 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i c'
Updated by Yuri Weinstein over 9 years ago
Same problem in http://qa-proxy.ceph.com/teuthology/teuthology-2015-01-12_17:05:02-upgrade:giant-x-next-distro-basic-vps/700109/
2015-01-13T17:21:28.043 DEBUG:tasks.ceph.mon.a:waiting for process to exit
2015-01-13T17:21:34.043 INFO:tasks.ceph.mon.a:Stopped
2015-01-13T17:21:34.043 DEBUG:tasks.ceph.mon.c:waiting for process to exit
2015-01-13T17:21:40.044 INFO:tasks.ceph.mon.c:Stopped
2015-01-13T17:21:40.044 DEBUG:tasks.ceph.mon.b:waiting for process to exit
2015-01-13T17:21:40.044 ERROR:teuthology.misc:Saw exception from mon.b
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1109, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm132 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i b'
Updated by Yuri Weinstein over 9 years ago
Looks like a similar issue in this job:
2015-01-13T11:35:47.331 ERROR:teuthology.misc:Saw exception from mon.a
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1109, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm084 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i a'
Updated by Yuri Weinstein over 9 years ago
Similar issue, but with a memory problem.
Run: http://pulpito.ceph.com/teuthology-2015-01-21_17:05:02-upgrade:giant-x-next-distro-basic-vps/
Job: 716357
2015-01-22T05:11:19.959 INFO:tasks.ceph.osd.6.vpm042.stderr:2015-01-22 13:11:19.725139 7f75bfcee700 -1 osd.6 2505 lsb_release_parse - failed to call lsb_release binary with error: (12) Cannot allocate memory
2015-01-22T05:11:19.973 INFO:tasks.ceph.osd.6.vpm042.stderr:2015-01-22 13:11:19.743148 7f75bfcee700 -1 osd.6 2505 lsb_release_parse - failed to call lsb_release binary with error: (12) Cannot allocate memory
2015-01-22T05:11:20.059 INFO:tasks.ceph.osd.6.vpm042.stderr:2015-01-22 13:11:19.978533 7f75bfcee700 -1 osd.6 2505 lsb_release_parse - failed to call lsb_release binary with error: (12) Cannot allocate memory
and then
2015-01-22T05:52:22.594 DEBUG:tasks.ceph.mon.a:waiting for process to exit
2015-01-22T05:52:28.594 INFO:tasks.ceph.mon.a:Stopped
2015-01-22T05:52:28.594 DEBUG:tasks.ceph.mon.c:waiting for process to exit
2015-01-22T05:52:28.595 ERROR:teuthology.misc:Saw exception from mon.c
Traceback (most recent call last):
  File "/home/teuthworker/src/teuthology_master/teuthology/misc.py", line 1110, in stop_daemons_of_type
    daemon.stop()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/daemon.py", line 45, in stop
    run.wait([self.proc], timeout=timeout)
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 391, in wait
    proc.wait()
  File "/home/teuthworker/src/teuthology_master/teuthology/orchestra/run.py", line 106, in wait
    exitstatus=status, node=self.hostname)
CommandFailedError: Command failed on vpm042 with status 1: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage daemon-helper kill ceph-mon -f -i c'
Maybe the others are also due to a memory allocation issue?
Updated by Sage Weil over 9 years ago
- Status changed from New to Rejected
I think these are all memory-related. We need to reduce the VPS count per machine, I think :(
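One way to check the memory hypothesis against the other jobs is to scan each teuthology.log for hosts that logged "(12) Cannot allocate memory" and compare them with the hosts where stopping a mon failed. The following is only a rough sketch: the function name `summarize` and the regular expressions are hypothetical, derived solely from the log excerpts quoted in this ticket, and it assumes the log has one entry per line.

```python
import re

# Patterns derived from the excerpts in this ticket (assumptions, not a
# documented teuthology log format).
ENOMEM_RE = re.compile(r"\(12\) Cannot allocate memory")
STDERR_HOST_RE = re.compile(r"INFO:tasks\.ceph\.\w+\.\w+\.(\w+)\.stderr:")
MON_FAIL_RE = re.compile(r"Saw exception from (mon\.\w+)")
FAILED_HOST_RE = re.compile(r"Command failed on (\w+) with status")


def summarize(log_text):
    """Return (hosts that logged ENOMEM, [(mon, host)] stop failures)."""
    enomem_hosts = set()
    failures = []
    pending_mon = None  # mon named by the most recent "Saw exception" line

    for line in log_text.splitlines():
        # Record the host of any daemon stderr line reporting errno 12.
        if ENOMEM_RE.search(line):
            host = STDERR_HOST_RE.search(line)
            if host:
                enomem_hosts.add(host.group(1))

        # Remember which mon raised the exception...
        mon = MON_FAIL_RE.search(line)
        if mon:
            pending_mon = mon.group(1)

        # ...and pair it with the host named in the CommandFailedError.
        failed = FAILED_HOST_RE.search(line)
        if failed and pending_mon is not None:
            failures.append((pending_mon, failed.group(1)))
            pending_mon = None

    return enomem_hosts, failures
```

Running `summarize(open(path).read())` on each job's log and checking whether every failing host also appears in the ENOMEM set would give some evidence for or against the memory explanation. A host with no ENOMEM message is not proof that memory was fine, since an out-of-memory kill may leave no such line in the log.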