Bug #11260

closed

"Cannot allocate memory" (??) in upgrade:dumpling-firefly-x-hammer-distro-basic-vps run

Added by Yuri Weinstein about 9 years ago. Updated over 8 years ago.

Status: Resolved
Priority: Urgent
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Q/A
Tags: -
Backport: hammer
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: upgrade/dumpling-firefly-x
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

All jobs in this run are dead: http://pulpito.front.sepia.ceph.com/teuthology-2015-03-26_17:15:01-upgrade:dumpling-firefly-x-hammer-distro-basic-vps/

One example, job 823052:
Logs: http://qa-proxy.ceph.com/teuthology/teuthology-2015-03-26_17:15:01-upgrade:dumpling-firefly-x-hammer-distro-basic-vps/823052/

2015-03-27T13:54:30.776 INFO:tasks.ceph.osd.2.vpm186.stderr:terminate called after throwing an instance of 'ceph::buffer::bad_alloc'
2015-03-27T13:54:30.776 INFO:tasks.ceph.osd.2.vpm186.stderr:  what():  buffer::bad_alloc
2015-03-27T13:54:30.776 INFO:tasks.ceph.osd.2.vpm186.stderr:*** Caught signal (Aborted) **
2015-03-27T13:54:30.776 INFO:tasks.ceph.osd.2.vpm186.stderr: in thread 7f7457fcb700
2015-03-27T13:54:30.777 INFO:tasks.ceph.osd.3.vpm186.stderr:terminate called after throwing an instance of 'ceph::buffer::bad_alloc'
2015-03-27T13:54:30.777 INFO:tasks.ceph.osd.3.vpm186.stderr:  what():  buffer::bad_alloc
2015-03-27T13:54:30.777 INFO:tasks.ceph.osd.3.vpm186.stderr:*** Caught signal (Aborted) **
2015-03-27T13:54:30.777 INFO:tasks.ceph.osd.3.vpm186.stderr: in thread 7fdafa51c700
2015-03-27T13:54:30.840 INFO:tasks.ceph.osd.2.vpm186.stderr: ceph version 0.93-196-g6994648 (6994648bc443429dc2edfbb38fbaaa9a19e2bdd1)
2015-03-27T13:54:30.841 INFO:tasks.ceph.osd.2.vpm186.stderr: 1: ceph-osd() [0xa47b75]
2015-03-27T13:54:30.841 INFO:tasks.ceph.osd.2.vpm186.stderr: 2: (()+0xf710) [0x7f7487182710]
2015-03-27T13:54:30.841 INFO:tasks.ceph.osd.2.vpm186.stderr: 3: (gsignal()+0x35) [0x7f7485e4c635]
2015-03-27T13:54:30.842 INFO:tasks.ceph.osd.2.vpm186.stderr: 4: (abort()+0x175) [0x7f7485e4de15]
2015-03-27T13:54:30.842 INFO:tasks.ceph.osd.2.vpm186.stderr: 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7f7486706a5d]
2015-03-27T13:54:30.843 INFO:tasks.ceph.osd.2.vpm186.stderr: 6: (()+0xbcbe6) [0x7f7486704be6]
2015-03-27T13:54:30.843 INFO:tasks.ceph.osd.2.vpm186.stderr: 7: (()+0xbcc13) [0x7f7486704c13]
2015-03-27T13:54:30.843 INFO:tasks.ceph.osd.2.vpm186.stderr: 8: (()+0xbcd0e) [0x7f7486704d0e]
2015-03-27T13:54:30.844 INFO:tasks.ceph.osd.2.vpm186.stderr: 9: (ceph::buffer::create_aligned(unsigned int, unsigned int)+0x153) [0xb98f33]
2015-03-27T13:54:30.844 INFO:tasks.ceph.osd.2.vpm186.stderr: 10: (Pipe::read_message(Message**, AuthSessionHandler*)+0x219a) [0xc1506a]
2015-03-27T13:54:30.845 INFO:tasks.ceph.osd.2.vpm186.stderr: 11: (Pipe::reader()+0xa59) [0xc16479]
2015-03-27T13:54:30.845 INFO:tasks.ceph.osd.2.vpm186.stderr: 12: (Pipe::Reader::entry()+0xd) [0xc1984d]
2015-03-27T13:54:30.845 INFO:tasks.ceph.osd.2.vpm186.stderr: 13: (()+0x79d1) [0x7f748717a9d1]
2015-03-27T13:54:30.845 INFO:tasks.ceph.osd.2.vpm186.stderr: 14: (clone()+0x6d) [0x7f7485f0286d]
2015-03-27T13:54:30.846 INFO:tasks.ceph.osd.2.vpm186.stderr:2015-03-27 16:54:30.835470 7f7457fcb700 -1 *** Caught signal (Aborted) **
2015-03-27T13:54:30.846 INFO:tasks.ceph.osd.2.vpm186.stderr: in thread 7f7457fcb700
2015-03-27T13:54:30.846 INFO:tasks.ceph.osd.2.vpm186.stderr:
2015-03-27T13:54:30.846 INFO:tasks.ceph.osd.2.vpm186.stderr: ceph version 0.93-196-g6994648 (6994648bc443429dc2edfbb38fbaaa9a19e2bdd1)
2015-03-27T13:54:30.846 INFO:tasks.ceph.osd.2.vpm186.stderr: 1: ceph-osd() [0xa47b75]
2015-03-27T13:54:30.847 INFO:tasks.ceph.osd.2.vpm186.stderr: 2: (()+0xf710) [0x7f7487182710]
2015-03-27T13:54:30.847 INFO:tasks.ceph.osd.2.vpm186.stderr: 3: (gsignal()+0x35) [0x7f7485e4c635]
2015-03-27T13:54:30.847 INFO:tasks.ceph.osd.2.vpm186.stderr: 4: (abort()+0x175) [0x7f7485e4de15]
2015-03-27T13:54:30.847 INFO:tasks.ceph.osd.2.vpm186.stderr: 5: (__gnu_cxx::__verbose_terminate_handler()+0x12d) [0x7f7486706a5d]
2015-03-27T13:54:30.848 INFO:tasks.ceph.osd.2.vpm186.stderr: 6: (()+0xbcbe6) [0x7f7486704be6]
2015-03-27T13:54:30.848 INFO:tasks.ceph.osd.2.vpm186.stderr: 7: (()+0xbcc13) [0x7f7486704c13]
2015-03-27T13:54:30.848 INFO:tasks.ceph.osd.2.vpm186.stderr: 8: (()+0xbcd0e) [0x7f7486704d0e]
2015-03-27T13:54:30.848 INFO:tasks.ceph.osd.2.vpm186.stderr: 9: (ceph::buffer::create_aligned(unsigned int, unsigned int)+0x153) [0xb98f33]
2015-03-27T13:54:30.848 INFO:tasks.ceph.osd.2.vpm186.stderr: 10: (Pipe::read_message(Message**, AuthSessionHandler*)+0x219a) [0xc1506a]
2015-03-27T13:54:30.849 INFO:tasks.ceph.osd.2.vpm186.stderr: 11: (Pipe::reader()+0xa59) [0xc16479]
2015-03-27T13:54:30.849 INFO:tasks.ceph.osd.2.vpm186.stderr: 12: (Pipe::Reader::entry()+0xd) [0xc1984d]
2015-03-27T13:54:30.849 INFO:tasks.ceph.osd.2.vpm186.stderr: 13: (()+0x79d1) [0x7f748717a9d1]
2015-03-27T13:54:30.850 INFO:tasks.ceph.osd.2.vpm186.stderr: 14: (clone()+0x6d) [0x7f7485f0286d]
2015-03-27T13:54:30.850 INFO:tasks.ceph.osd.2.vpm186.stderr: NOTE: a copy of the executable, or `objdump -rdS <executable>` is needed to interpret this.

Also:

teuthology@teuthology:~$ grep -r "Cannot allocate memory" /a/teuthology-2015-03-26_17:15:01-upgrade:dumpling-firefly-x-hammer-distro-basic-vps/*/*.log
/a/teuthology-2015-03-26_17:15:01-upgrade:dumpling-firefly-x-hammer-distro-basic-vps/823054/teuthology.log:2015-03-27T14:16:20.120 INFO:teuthology.orchestra.run.vpm016.stderr:OSError: librados.so.2: cannot map zero-fill pages: Cannot allocate memory
/a/teuthology-2015-03-26_17:15:01-upgrade:dumpling-firefly-x-hammer-distro-basic-vps/823055/teuthology.log:2015-03-27T13:47:45.096 INFO:teuthology.orchestra.run.vpm070.stderr:OSError: librados.so.2: cannot map zero-fill pages: Cannot allocate memory
/a/teuthology-2015-03-26_17:15:01-upgrade:dumpling-firefly-x-hammer-distro-basic-vps/823057/teuthology.log:2015-03-27T21:30:08.889 INFO:teuthology.orchestra.run.vpm124.stderr:OSError: librados.so.2: cannot map zero-fill pages: Cannot allocate memory
/a/teuthology-2015-03-26_17:15:01-upgrade:dumpling-firefly-x-hammer-distro-basic-vps/823058/teuthology.log:2015-03-27T17:40:16.992 INFO:teuthology.orchestra.run.vpm060.stderr:OSError: librados.so.2: cannot map zero-fill pages: Cannot allocate memory

Sage, this looks like a VPS memory problem, but I need your confirmation.

Actions #1

Updated by Yuri Weinstein about 9 years ago

  • Subject changed from "Cannot allocate memory" in upgrade:dumpling-firefly-x-hammer-distro-basic-vps run to "Cannot allocate memory" (??) in upgrade:dumpling-firefly-x-hammer-distro-basic-vps run
Actions #2

Updated by Sage Weil about 9 years ago

  • Status changed from New to Won't Fix

environment

Actions #3

Updated by Loïc Dachary over 8 years ago

  • Regression set to No

rados/singleton-nomsgr/{all/msgr.yaml} on hammer, using OpenStack with 8GB RAM

2015-10-11T03:19:46.755 INFO:teuthology.orchestra.run.target241231.stderr:terminate called recursively
2015-10-11T03:19:46.762 INFO:teuthology.orchestra.run.target241231.stderr:terminate called after throwing an instance of 'ceph::buffer::bad_alloc'
2015-10-11T03:19:46.771 INFO:teuthology.orchestra.run.target241231.stderr:*** Caught signal (Aborted) **
2015-10-11T03:19:46.771 INFO:teuthology.orchestra.run.target241231.stderr: in thread 7fb7fc9d9700
2015-10-11T03:19:46.791 INFO:teuthology.orchestra.run.target241231.stderr:  what():  buffer::bad_alloc
2015-10-11T03:19:47.091 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/ubuntu/teuthology/teuthology/run_tasks.py", line 53, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/ubuntu/teuthology/teuthology/run_tasks.py", line 41, in run_one_task
    return fn(**kwargs)
  File "/home/ubuntu/teuthology/teuthology/task/exec.py", line 54, in task
    c],
  File "/home/ubuntu/teuthology/teuthology/orchestra/remote.py", line 156, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/ubuntu/teuthology/teuthology/orchestra/run.py", line 378, in run
    r.wait()
  File "/home/ubuntu/teuthology/teuthology/orchestra/run.py", line 114, in wait
    label=self.label)
CommandFailedError: Command failed on target241231 with status 134: 'sudo TESTDIR=/home/ubuntu/cephtest bash -c ceph_test_msgr'
Actions #4

Updated by Loïc Dachary over 8 years ago

Can be fixed in the context of OpenStack https://github.com/ceph/ceph-qa-suite/pull/623

Actions #5

Updated by Loïc Dachary over 8 years ago

  • Status changed from Won't Fix to Fix Under Review
  • Backport set to hammer
Actions #6

Updated by Sage Weil over 8 years ago

Loic Dachary wrote:

Can be fixed in the context of OpenStack https://github.com/ceph/ceph-qa-suite/pull/623

That doesn't help the upgrade test. I would call any failures with 3 OSDs per host noise (we need to handle that w/ the default environment).

Actions #7

Updated by Sage Weil over 8 years ago

Oh, I see now. LGTM!

Actions #8

Updated by Sage Weil over 8 years ago

  • Status changed from Fix Under Review to Resolved