Bug #48204

closed

admin socket terminates with std::bad_alloc

Added by Deepika Upadhyay over 3 years ago. Updated over 3 years ago.

Status: Resolved
Priority: Normal
Assignee:
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport: nautilus
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://qa-proxy.ceph.com/teuthology/yuriw-2020-11-10_16:01:13-rados-wip-yuri7-testing-2020-11-09-0731-nautilus-distro-basic-smithi/5609337/teuthology.log

PRs included in the test batch:

https://github.com/ceph/ceph/pull/37382 - nautilus: mds: reduce memory usage of open file table prefetch
https://github.com/ceph/ceph/pull/37554 - nautilus: mon: set session_timeout when adding to session_map
https://github.com/ceph/ceph/pull/37589 - nautilus: mgr/progress: make it so progress bar does not get stuck forever
https://github.com/ceph/ceph/pull/37605 - nautilus: test/librados: fix endian bugs in checksum test cases

The failure seems unrelated to these PRs.

2020-11-11T00:24:58.350 INFO:teuthology.orchestra.run.smithi118.stdout:"/var/run/ceph/ceph-client.0.13014.asok" 
2020-11-11T00:24:58.351 INFO:teuthology.orchestra.run.smithi118.stdout:Found client socket "/var/run/ceph/ceph-client.0.13014.asok" 
2020-11-11T00:24:58.351 INFO:teuthology.orchestra.run.smithi118.stderr:terminate called after throwing an instance of 'std::bad_alloc'
2020-11-11T00:24:58.359 INFO:teuthology.orchestra.run.smithi118.stderr:  what():  std::bad_alloc
2020-11-11T00:24:58.360 DEBUG:teuthology.orchestra.run:got remote process result: None
2020-11-11T00:24:58.361 ERROR:teuthology.run_tasks:Saw exception from tasks.
Traceback (most recent call last):
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 90, in run_tasks
    manager = run_one_task(taskname, ctx=ctx, config=config)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/run_tasks.py", line 69, in run_one_task
    return task(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/task/exec.py", line 54, in task
    c],
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 215, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 446, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 160, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 178, in _raise_for_status
    raise CommandCrashedError(command=self.command)
teuthology.exceptions.CommandCrashedError: Command crashed: "sudo TESTDIR=/home/ubuntu/cephtest bash -c 'ceph_test_admin_socket_output --all'" 
2020-11-11T00:24:58.385 ERROR:teuthology.run_tasks: Sentry event: https://sentry.ceph.com/organizations/ceph/?query=ebba6117b7ed4f7b92c6e8b7479d532b
Traceback (most recent call last):
Actions #1

Updated by Deepika Upadhyay over 3 years ago

  • Description updated (diff)
Actions #2

Updated by Casey Bodley over 3 years ago

@Deepika is this related to RGW, or should we move it to the Ceph project?

Actions #3

Updated by Deepika Upadhyay over 3 years ago

  • Project changed from rgw to Ceph

https://tracker.ceph.com/issues/47179 and this failure both seemed to occur after similar RGW-related calls, so I thought they might be related. I agree, though, that we can keep this under the broader Ceph project.

Actions #4

Updated by Brad Hubbard over 3 years ago

  • Assignee set to Brad Hubbard
  • Source set to Development
Actions #5

Updated by Brad Hubbard over 3 years ago

I'll take a look.

Actions #6

Updated by Brad Hubbard over 3 years ago

  • Pull request ID set to 38076

This is due to an invalidated iterator. I fixed this in master via https://tracker.ceph.com/issues/38846, but that fix was not backported, and it's probably not worth backporting the whole thing now since it has additional dependencies. I've created a standalone PR for this issue in nautilus.

Note that this is only a coding problem with the test itself.
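
For readers unfamiliar with this class of bug, here is a minimal sketch of iterator invalidation and one common way to avoid it. It is not the code from the test or from the nautilus PR; the function name `drop_empty` and the `sockets` map are hypothetical and exist only for illustration. Using an iterator after the container has invalidated it is undefined behavior, which can surface in many ways, including a spurious allocation failure.

```cpp
#include <cstddef>
#include <map>
#include <string>

// Hypothetical helper (not from the Ceph test): remove every entry whose
// value is empty, without ever touching an iterator that erase() invalidated.
std::size_t drop_empty(std::map<std::string, std::string>& sockets) {
  std::size_t removed = 0;
  for (auto it = sockets.begin(); it != sockets.end(); /* advanced in body */) {
    if (it->second.empty()) {
      // Buggy pattern: sockets.erase(it); ++it;
      // After erase(it), 'it' is invalid and incrementing it is undefined.
      it = sockets.erase(it);  // erase() returns the next valid iterator
      ++removed;
    } else {
      ++it;
    }
  }
  return removed;
}
```

Calling `drop_empty` on a map with entries `{"a": "", "b": "x", "c": ""}` removes the two empty entries and leaves only `"b"`. The same idea applies to other containers: either take the iterator returned by the mutating call, or collect the keys first and mutate afterwards.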

Actions #7

Updated by Brad Hubbard over 3 years ago

  • Status changed from New to Fix Under Review
Actions #9

Updated by Brad Hubbard over 3 years ago

  • Status changed from Fix Under Review to Resolved