Bug #55805

open

qa failure: workload kernel_untar_build failed

Added by Rishabh Dave almost 2 years ago. Updated 6 months ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
fs
Component(FS):
qa-suite
Labels (FS):
qa, qa-failure
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Bug discovered on a QA run for PR - https://github.com/ceph/ceph/pull/45556

Teuthology job - https://pulpito.ceph.com/vshankar-2022-04-26_06:23:29-fs:workload-wip-45556-20220418-102656-testing-default-smithi/6806484/
workload: kernel_untar_build.yaml

Traceback #1 -

    2022-04-26T07:46:30.191 INFO:tasks.cephfs.filesystem:scrub status for tag:3f516427-181e-4cf8-a57d-669202f3a4f5 - {'path': '/', 'tag': '3f516427-181e-4cf8-a57d-669202f3a4f5', 'options': 'recursive,force'}
    2022-04-26T07:46:30.192 ERROR:tasks.fwd_scrub.fs.[cephfs]:exception:
    Traceback (most recent call last):
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 38, in _run
        self.do_scrub()
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 55, in do_scrub
        self._scrub()
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 77, in _scrub
        timeout=self.scrub_timeout)
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/cephfs/filesystem.py", line 1617, in wait_until_scrub_complete
        while proceed():
      File "/home/teuthworker/src/git.ceph.com_git_teuthology_788cfdd8098ad222aa448289edcfa4436091c32c/teuthology/contextutil.py", line 133, in __call__
        raise MaxWhileTries(error_msg)
    teuthology.exceptions.MaxWhileTries: reached maximum tries (30) after waiting for 900 seconds

Failing command for traceback #1 - 2022-04-26T07:46:29.709 DEBUG:teuthology.orchestra.run.smithi052:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell mds.1:0 scrub status
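
For context, the "reached maximum tries (30) after waiting for 900 seconds" error comes from teuthology's polling helper: wait_until_scrub_complete() polls scrub status inside a safe_while loop, and proceed() raises MaxWhileTries once the try budget runs out. A minimal sketch of that pattern, paraphrased from the traceback rather than the verbatim qa/tasks source (the function and parameter names below are assumptions):

    # Assumed shape of the polling loop; 30 tries with a 30-second sleep
    # is the 900-second window reported in the traceback.
    from teuthology.contextutil import safe_while

    def wait_for_scrub(get_scrub_status, tag, timeout=900):
        with safe_while(sleep=30, tries=timeout // 30) as proceed:
            while proceed():  # raises MaxWhileTries when tries run out
                # e.g. output of `ceph tell mds.<rank>:0 scrub status`
                if tag not in str(get_scrub_status()):
                    return  # scrub with our tag is no longer pending

The scrub status logged at 07:46:30 still shows the tag as active, so the loop exhausted its budget and the scrubber thread recorded the MaxWhileTries exception.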

Traceback #2 -

    Traceback (most recent call last):
      File "/home/teuthworker/src/git.ceph.com_git_teuthology_788cfdd8098ad222aa448289edcfa4436091c32c/teuthology/run_tasks.py", line 188, in run_tasks
        suppress = manager.__exit__(*exc_info)
      File "/usr/lib/python3.6/contextlib.py", line 88, in __exit__
        next(self.gen)
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 151, in task
        stop_all_fwd_scrubbers(ctx.ceph[config['cluster']].thrashers)
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 86, in stop_all_fwd_scrubbers
        raise RuntimeError(f"error during scrub thrashing: {thrasher.exception}")
    RuntimeError: error during scrub thrashing: reached maximum tries (30) after waiting for 900 seconds

Failing command for traceback #2 - 2022-04-26T08:51:10.308 DEBUG:teuthology.orchestra.run.smithi138:> sudo /home/ubuntu/cephtest/cephadm --image quay.ceph.io/ceph-ci/ceph:1ccbc711b8876e630c0358e1d8d923daa34dca1e shell --fsid f2662818-c530-11ec-8c39-001a4aab830c -- ceph daemon mds.l perf dump
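
Tracebacks #2, #3 and #4 are the same failure surfacing at teardown: the forward-scrubber thread stored the MaxWhileTries exception, and stop_all_fwd_scrubbers() re-raises it as each task context exits. A minimal sketch of that teardown path, paraphrased from the traceback rather than the verbatim qa/tasks/fwd_scrub.py source (the thrasher attributes are assumptions):

    def stop_all_fwd_scrubbers(thrashers):
        # Stop each scrubber thread, then surface any exception it
        # recorded while running in the background.
        for thrasher in thrashers:
            thrasher.stop()
            thrasher.join()
            if thrasher.exception is not None:
                raise RuntimeError(
                    f"error during scrub thrashing: {thrasher.exception}")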

Traceback #3 -

    Traceback (most recent call last):
      File "/home/teuthworker/src/git.ceph.com_git_teuthology_788cfdd8098ad222aa448289edcfa4436091c32c/teuthology/contextutil.py", line 33, in nested
        yield vars
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/cephadm.py", line 1595, in task
        yield
      File "/home/teuthworker/src/git.ceph.com_git_teuthology_788cfdd8098ad222aa448289edcfa4436091c32c/teuthology/run_tasks.py", line 188, in run_tasks
        suppress = manager.__exit__(*exc_info)
      File "/usr/lib/python3.6/contextlib.py", line 88, in __exit__
        next(self.gen)
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 151, in task
        stop_all_fwd_scrubbers(ctx.ceph[config['cluster']].thrashers)
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 86, in stop_all_fwd_scrubbers
        raise RuntimeError(f"error during scrub thrashing: {thrasher.exception}")
    RuntimeError: error during scrub thrashing: reached maximum tries (30) after waiting for 900 seconds

Traceback #4 -

    Traceback (most recent call last):
      File "/home/teuthworker/src/git.ceph.com_git_teuthology_788cfdd8098ad222aa448289edcfa4436091c32c/teuthology/contextutil.py", line 33, in nested
        yield vars
      File "/home/teuthworker/src/git.ceph.com_git_teuthology_788cfdd8098ad222aa448289edcfa4436091c32c/teuthology/task/install/__init__.py", line 619, in task
        yield
      File "/home/teuthworker/src/git.ceph.com_git_teuthology_788cfdd8098ad222aa448289edcfa4436091c32c/teuthology/run_tasks.py", line 188, in run_tasks
        suppress = manager.__exit__(*exc_info)
      File "/usr/lib/python3.6/contextlib.py", line 88, in __exit__
        next(self.gen)
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 151, in task
        stop_all_fwd_scrubbers(ctx.ceph[config['cluster']].thrashers)
      File "/home/teuthworker/src/git.ceph.com_ceph-c_1ccbc711b8876e630c0358e1d8d923daa34dca1e/qa/tasks/fwd_scrub.py", line 86, in stop_all_fwd_scrubbers
        raise RuntimeError(f"error during scrub thrashing: {thrasher.exception}")
    RuntimeError: error during scrub thrashing: reached maximum tries (30) after waiting for 900 seconds

Actions #1

Updated by Rishabh Dave almost 2 years ago

  • Description updated (diff)
Actions #2

Updated by Rishabh Dave almost 2 years ago

  • ceph-qa-suite fs added
  • Component(FS) qa-suite added
  • Labels (FS) qa, qa-failure added
Actions #3

Updated by Milind Changire 6 months ago

    Traceback (most recent call last):
      File "/home/teuthworker/src/git.ceph.com_teuthology_202b180cb047e798fb131047314a862593f45403/teuthology/orchestra/connection.py", line 106, in connect
        ssh.connect(**connect_args)
      File "/home/teuthworker/src/git.ceph.com_teuthology_202b180cb047e798fb131047314a862593f45403/virtualenv/lib/python3.8/site-packages/paramiko/client.py", line 450, in connect
        self._auth(
      File "/home/teuthworker/src/git.ceph.com_teuthology_202b180cb047e798fb131047314a862593f45403/virtualenv/lib/python3.8/site-packages/paramiko/client.py", line 781, in _auth
        raise saved_exception
      File "/home/teuthworker/src/git.ceph.com_teuthology_202b180cb047e798fb131047314a862593f45403/virtualenv/lib/python3.8/site-packages/paramiko/client.py", line 757, in _auth
        self._transport.auth_publickey(username, key)
      File "/home/teuthworker/src/git.ceph.com_teuthology_202b180cb047e798fb131047314a862593f45403/virtualenv/lib/python3.8/site-packages/paramiko/transport.py", line 1625, in auth_publickey
        raise SSHException("No existing session")
    paramiko.ssh_exception.SSHException: No existing session

This reef job looks mostly like a network connectivity/infra issue

This reef job does not look like an infra failure
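For reference, the paramiko path in the traceback above is the standard SSHClient connect flow; "No existing session" is raised when the SSH transport has already dropped by the time publickey auth runs, which is why the first job reads as a connectivity problem rather than a test failure. A minimal sketch (hostname, username and key path are placeholders, not taken from the job):

    import paramiko

    client = paramiko.SSHClient()
    client.set_missing_host_key_policy(paramiko.AutoAddPolicy())
    try:
        # connect() -> _auth() -> Transport.auth_publickey(); if the
        # underlying session is already gone at that point, paramiko
        # raises SSHException("No existing session").
        client.connect("smithi.example.com", username="ubuntu",
                       key_filename="/path/to/ssh_key")
    except paramiko.ssh_exception.SSHException as exc:
        print(f"SSH connection failed: {exc}")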

Actions #4

Updated by Milind Changire 6 months ago

  • Assignee set to Rishabh Dave

Assigning to Rishabh to get his attention.

Actions #5

Updated by Rishabh Dave 6 months ago

  • Assignee deleted (Rishabh Dave)