Bug #43902

qa: mon_thrash: timeout "ceph quorum_status"

Added by Patrick Donnelly 22 days ago. Updated 17 days ago.

Status: New
Priority: Urgent
Assignee:
Category: -
Target version:
% Done: 0%
Source: Q/A
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(FS): qa-suite
Labels (FS):
Pull request ID:
Crash signature:

Description

2020-01-25T23:57:09.282 INFO:teuthology.orchestra.run.smithi035:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph quorum_status
2020-01-25T23:57:33.932 INFO:teuthology.orchestra.run.smithi035:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:57:33.935 INFO:teuthology.orchestra.run.smithi174:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:58:03.990 INFO:teuthology.orchestra.run.smithi035:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:58:03.993 INFO:teuthology.orchestra.run.smithi174:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:58:34.111 INFO:teuthology.orchestra.run.smithi035:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:58:34.114 INFO:teuthology.orchestra.run.smithi174:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:59:04.215 INFO:teuthology.orchestra.run.smithi035:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:59:04.217 INFO:teuthology.orchestra.run.smithi174:> sudo logrotate /etc/logrotate.d/ceph-test.conf
2020-01-25T23:59:09.303 DEBUG:teuthology.orchestra.run:got remote process result: 124
2020-01-25T23:59:09.303 ERROR:tasks.mon_thrash.mon_thrasher:exception:
Traceback (most recent call last):
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200124.211519/qa/tasks/mon_thrash.py", line 232, in do_thrash
    self._do_thrash()
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200124.211519/qa/tasks/mon_thrash.py", line 323, in _do_thrash
    self.manager.wait_for_mon_quorum_size(len(mons))
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200124.211519/qa/tasks/ceph_manager.py", line 2850, in wait_for_mon_quorum_size
    while not len(self.get_mon_quorum()) == size:
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200124.211519/qa/tasks/ceph_manager.py", line 2839, in get_mon_quorum
    out = self.raw_cluster_cmd('quorum_status')
  File "/home/teuthworker/src/github.com_batrick_ceph_wip-pdonnell-testing-20200124.211519/qa/tasks/ceph_manager.py", line 1342, in raw_cluster_cmd
    stdout=StringIO(),
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/remote.py", line 198, in run
    r = self._runner(client=self.ssh, name=self.shortname, **kwargs)
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 433, in run
    r.wait()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 158, in wait
    self._raise_for_status()
  File "/home/teuthworker/src/git.ceph.com_git_teuthology_master/teuthology/orchestra/run.py", line 180, in _raise_for_status
    node=self.hostname, label=self.label
CommandFailedError: Command failed on smithi035 with status 124: 'sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph quorum_status'

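This happens during snaptests with the mon thrasher. Status 124 is the exit code of timeout(1) when the time limit expires, so "ceph quorum_status" got no reply within the 120 seconds allowed (23:57:09 to 23:59:09). Might need to extend the timeout?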
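
One alternative to simply raising the 120 second limit might be to have the quorum wait tolerate a single timed-out query and keep polling until an overall deadline. The following is only a rough sketch of that idea, not the actual ceph_manager.py code: wait_for_quorum_size, QuorumCommandTimeout, the get_quorum callable and the deadline/interval values are all hypothetical names and numbers chosen for illustration.

```python
import time


class QuorumCommandTimeout(Exception):
    """Hypothetical stand-in for a command that exited with status 124."""


def wait_for_quorum_size(get_quorum, size, deadline=300.0, interval=3.0,
                         clock=time.monotonic, sleep=time.sleep):
    """Poll get_quorum() until it reports `size` members.

    A timed-out query is treated as "no answer yet" and retried, so one
    slow reply during an election does not abort the whole wait. Returns
    the number of attempts made; raises TimeoutError once `deadline`
    seconds have passed without reaching the wanted quorum size.
    """
    start = clock()
    attempts = 0
    while True:
        attempts += 1
        try:
            if len(get_quorum()) == size:
                return attempts
        except QuorumCommandTimeout:
            pass  # assumption: the mons may be mid-election, so retry
        if clock() - start >= deadline:
            raise TimeoutError(
                "quorum did not reach size %d within %.0fs" % (size, deadline))
        sleep(interval)


# Simulated monitor replies: the first query times out, the second sees
# two of three mons, the third sees the full quorum.
replies = iter([QuorumCommandTimeout(), ["a", "b"], ["a", "b", "c"]])


def fake_get_quorum():
    reply = next(replies)
    if isinstance(reply, Exception):
        raise reply
    return reply


print(wait_for_quorum_size(fake_get_quorum, 3, sleep=lambda s: None))  # prints 3
```

Whether retrying is appropriate here depends on why the mons did not answer; if the quorum never re-forms after thrashing, a longer wait would only delay the failure.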

History

#1 Updated by Patrick Donnelly 17 days ago

  • Assignee set to Ramana Raja
