Bug #48485

open

osd thrasher timeout

Added by Jeff Layton over 3 years ago. Updated over 3 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
Tests
Target version:
-
% Done:

0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
fs
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

One of my test runs failed with this:

2020-12-07T16:18:00.235 INFO:teuthology.orchestra.run.gibba018:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-12-07T16:18:00.649 DEBUG:teuthology.orchestra.run:got remote process result: 124
2020-12-07T16:18:00.650 INFO:tasks.thrashosds.thrasher:Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 115, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 1201, in _do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 503, in out_osd
    self.ceph_manager.mark_out_osd(osd)
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 2700, in mark_out_osd
    self.raw_cluster_cmd('osd', 'out', str(osd))
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 1354, in raw_cluster_cmd
    'stdout': StringIO()}).stdout.getvalue()
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 1347, in run_cluster_cmd
    return self.controller.run(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 215, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 446, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 160, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 182, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on gibba018 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd out 2'

2020-12-07T16:18:00.651 ERROR:tasks.thrashosds.thrasher:exception:
Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 1069, in do_thrash
    self._do_thrash()
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 115, in wrapper
    return func(self)
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 1201, in _do_thrash
    self.choose_action()()
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 503, in out_osd
    self.ceph_manager.mark_out_osd(osd)
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 2700, in mark_out_osd
    self.raw_cluster_cmd('osd', 'out', str(osd))
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 1354, in raw_cluster_cmd
    'stdout': StringIO()}).stdout.getvalue()
  File "/home/teuthworker/src/github.com_jtlayton_ceph_k-stock/qa/tasks/ceph_manager.py", line 1347, in run_cluster_cmd
    return self.controller.run(**kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 215, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 446, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 160, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 182, in _raise_for_status
    node=self.hostname, label=self.label
teuthology.exceptions.CommandFailedError: Command failed on gibba018 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd out 2'
2020-12-07T16:18:00.651 INFO:tasks.thrashosds.thrasher:joining the do_sighup greenlet
2020-12-07T16:18:00.652 INFO:tasks.thrashosds.thrasher:joining the do_optrack_toggle greenlet
2020-12-07T16:18:00.652 INFO:tasks.thrashosds.thrasher:joining the do_dump_ops greenlet
2020-12-07T16:18:00.652 INFO:tasks.thrashosds.thrasher:joining the do_noscrub_toggle greenlet
2020-12-07T16:18:00.652 INFO:tasks.ceph.ceph_manager.ceph:waiting for all up

See: https://pulpito.ceph.com/jlayton-2020-12-07_15:46:26-fs-master-wip-fscache-iter-basic-gibba/5689758/

This is testing against recent master (as of a few days ago), with some patches on top of the qa suite to allow for testing with cephfs + fscache.
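
For reading the failure line: the command was wrapped in `timeout 120`, and GNU coreutils `timeout` exits with status 124 when the time limit expires, so status 124 here most likely means `ceph osd out 2` did not return within 120 seconds. A minimal sketch for picking that out of a log line follows. The function name `parse_command_failed` and both regular expressions are illustrative assumptions written against the line quoted above; they are not part of teuthology. The check is a heuristic, since a wrapped command could in principle exit with 124 itself.

```python
import re

# Pattern for the CommandFailedError line as it appears in this report.
# This is an assumption based on the quoted log, not a teuthology API.
FAILED_RE = re.compile(
    r"CommandFailedError: Command failed on (?P<host>\S+) "
    r"with status (?P<status>\d+): '(?P<command>.*)'"
)

# Matches a `timeout <seconds>` wrapper inside the command string.
# Only plain integer durations are handled in this sketch.
TIMEOUT_RE = re.compile(r"\btimeout (?P<secs>\d+) ")

# Exit status GNU coreutils `timeout` uses when the time limit is hit.
TIMEOUT_EXIT_STATUS = 124


def parse_command_failed(line):
    """Return a dict describing a CommandFailedError line, or None if no match."""
    m = FAILED_RE.search(line)
    if m is None:
        return None
    status = int(m.group("status"))
    command = m.group("command")
    t = TIMEOUT_RE.search(command)
    limit = int(t.group("secs")) if t else None
    return {
        "host": m.group("host"),
        "status": status,
        "command": command,
        "timeout_secs": limit,
        # Heuristic: a timeout wrapper is present and the status is 124.
        "timed_out": limit is not None and status == TIMEOUT_EXIT_STATUS,
    }


if __name__ == "__main__":
    line = (
        "teuthology.exceptions.CommandFailedError: Command failed on gibba018 "
        "with status 124: 'sudo adjust-ulimits ceph-coverage "
        "/home/ubuntu/cephtest/archive/coverage timeout 120 "
        "ceph --cluster ceph osd out 2'"
    )
    print(parse_command_failed(line))
```

Running the same function over other teuthology logs would be one way to check whether a later failure has the same signature (same wrapped command, same 124 status).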

#1

Updated by Deepika Upadhyay over 3 years ago

This seems related; adding it here and will verify later:

/ceph/teuthology-archive/yuriw-2021-01-08_16:38:07-rados-wip-yuri4-testing-2021-01-07-1041-octopus-distro-basic-smithi/5766342/teuthology.log
