Bug #55723: octopus: dashboard failures due to timed-out or failed connections - Dashboard - Ceph

Added by Laura Flores almost 2 years ago. Updated almost 2 years ago.

Status: Triaged
Priority: Normal
Assignee: Avan Thakkar
Category: Testing & QA
Source: Q/A
Severity: 3 - minor

Description

Octopus runs in the teuthology rados suite are experiencing many consistent failures of this kind:

/a/yuriw-2022-05-19_14:09:24-rados-wip-yuri6-testing-2022-05-17-1603-octopus-distro-default-smithi/6841353

2022-05-19T14:30:03.022 INFO:tasks.cephfs_test_runner:======================================================================
2022-05-19T14:30:03.022 INFO:tasks.cephfs_test_runner:ERROR: test_standby (tasks.mgr.test_dashboard.TestDashboard)
2022-05-19T14:30:03.023 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2022-05-19T14:30:03.023 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2022-05-19T14:30:03.023 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_9dfe5561e7f8bbf1095613ed99b58dd72943d57a/qa/tasks/mgr/test_dashboard.py", line 62, in test_standby
2022-05-19T14:30:03.023 INFO:tasks.cephfs_test_runner:    self.wait_until_webserver_available(original_uri)
2022-05-19T14:30:03.024 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_9dfe5561e7f8bbf1095613ed99b58dd72943d57a/qa/tasks/mgr/test_dashboard.py", line 39, in wait_until_webserver_available
2022-05-19T14:30:03.024 INFO:tasks.cephfs_test_runner:    self.wait_until_true(_check_connection, timeout=30)
2022-05-19T14:30:03.024 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_9dfe5561e7f8bbf1095613ed99b58dd72943d57a/qa/tasks/ceph_test_case.py", line 196, in wait_until_true
2022-05-19T14:30:03.025 INFO:tasks.cephfs_test_runner:    raise TestTimeoutError("Timed out after {0}s".format(elapsed))
2022-05-19T14:30:03.025 INFO:tasks.cephfs_test_runner:tasks.ceph_test_case.TestTimeoutError: Timed out after 30s
2022-05-19T14:30:03.025 INFO:tasks.cephfs_test_runner:
2022-05-19T14:30:03.026 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------

According to a comment by Laura, this has not reproduced in the latest QA runs, so it could just be a flaky test. Decreasing priority (I'll keep it open for a month and close it if it does not happen again).
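
For readers unfamiliar with the failure mode: the traceback ends in a poll-until-timeout helper that gives up after 30 seconds because the connection check never succeeded. The following is a minimal sketch of that pattern, not the actual code in qa/tasks/ceph_test_case.py or qa/tasks/mgr/test_dashboard.py. The names PollTimeoutError and check_connection, the period parameter, and the use of urllib are assumptions made for illustration only.

```python
import http.client
import time
import urllib.error
import urllib.request


class PollTimeoutError(Exception):
    """Hypothetical stand-in for the TestTimeoutError seen in the traceback."""


def wait_until_true(condition, timeout, period=1.0):
    """Call condition() every `period` seconds until it returns a truthy
    value; raise PollTimeoutError once `timeout` seconds have elapsed.
    Returns the elapsed time in seconds on success."""
    start = time.monotonic()
    while True:
        if condition():
            return time.monotonic() - start
        elapsed = time.monotonic() - start
        if elapsed >= timeout:
            raise PollTimeoutError("Timed out after {0}s".format(int(elapsed)))
        time.sleep(period)


def check_connection(url):
    """Assumed connection check: True if a web server answers at `url`."""
    try:
        urllib.request.urlopen(url, timeout=5).close()
    except urllib.error.HTTPError:
        # The server replied, even if with an error status code.
        return True
    except (OSError, http.client.HTTPException):
        # Refused, unreachable, timed out, or a malformed reply.
        return False
    return True
```

With this sketch, wait_until_true(lambda: check_connection(uri), timeout=30) would raise "Timed out after 30s" if nothing answers at uri for 30 seconds, which matches the shape of the error in the log above.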

Related issues 1 (0 open — 1 closed)

Updated by Ernesto Puerta almost 2 years ago

Category set to Testing & QA
Assignee set to Avan Thakkar

Updated by Ernesto Puerta almost 2 years ago

Status changed from New to Triaged

Updated by Laura Flores almost 2 years ago

Regarding the first failure here, test_standby for Dashboard: I looked into the Octopus git history, and the most recent commit touching qa/tasks/mgr/test_dashboard.py is this one, which directly modifies test_standby: https://github.com/ceph/ceph/commit/a1c9e6de01da2daa76ec2f323065d38be80317c6.

However, the most recent Octopus QA run that did not contain these failures was http://pulpito.front.sepia.ceph.com/yuriw-2022-04-26_20:58:55-rados-wip-yuri2-testing-2022-04-26-1132-octopus-distro-default-smithi/. I checked the branch associated with that run (ci/wip-yuri2-testing-2022-04-26-1132-octopus), and it does contain the commit linked above, yet the tests that are now failing were passing there. So this seems to be a recent development that is not linked to the introduction of that commit.

As for the other failures, test_standby/Prometheus and test_selftest_command_spam, those look different. Maybe a problem with python3.6?

Updated by Ernesto Puerta almost 2 years ago

Copied to Bug #55774: octopus: prometheus, and selftest failures due to timed-out or failed connections added

Updated by Ernesto Puerta almost 2 years ago

Subject changed from octopus: dashboard, prometheus, and selftest failures due to timed-out or failed connections to octopus: dashboard failures due to timed-out or failed connections
Description updated (diff)
Priority changed from Immediate to Normal
