Bug #58837

open

mgr/test_progress.py: test_osd_healthy_recovery fails after timeout

Added by Laura Flores about 1 year ago. Updated about 1 year ago.

Status: New
Priority: Normal
Category: -
Target version: -
% Done: 0%
Source:
Tags:
Backport: quincy
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

/a/yuriw-2023-02-22_20:55:15-rados-wip-yuri4-testing-2023-02-22-0817-quincy-distro-default-smithi/7184746

2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:======================================================================
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:ERROR: test_osd_healthy_recovery (tasks.mgr.test_progress.TestProgress)
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:----------------------------------------------------------------------
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:Traceback (most recent call last):
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_cbccb547f47ec697c2e2ecf23392cc636ea19450/qa/tasks/mgr/test_progress.py", line 303, in test_osd_healthy_recovery
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:    self.wait_until_true(lambda: self._is_complete(ev['id']),
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:  File "/home/teuthworker/src/github.com_ceph_ceph-c_cbccb547f47ec697c2e2ecf23392cc636ea19450/qa/tasks/ceph_test_case.py", line 212, in wait_until_true
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:    raise TestTimeoutError("Timed out after {0}s and {1} retries".format(elapsed, retry_count))
2023-02-23T08:16:35.335 INFO:tasks.cephfs_test_runner:tasks.ceph_test_case.TestTimeoutError: Timed out after 120s and 0 retries
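
For readers unfamiliar with the helper named in the traceback, the following is a minimal sketch of a poll-until-true loop that yields the same message format. It is an illustration only, not the code in qa/tasks/ceph_test_case.py: the `PollTimeoutError` class, the 5-second default `period`, and the injectable `sleep` parameter are assumptions made for this example.

```python
import time


class PollTimeoutError(RuntimeError):
    """Hypothetical stand-in for the TestTimeoutError seen in the log."""


def wait_until_true(condition, timeout, period=5, sleep=time.sleep):
    """Poll `condition` every `period` seconds until it returns True.

    Raises PollTimeoutError once `timeout` seconds have been waited.
    The default period and the `sleep` hook are assumptions for this sketch.
    """
    elapsed = 0
    retry_count = 0  # this sketch never retries; kept to mirror the log message
    while True:
        if condition():
            return elapsed
        if elapsed >= timeout:
            raise PollTimeoutError(
                "Timed out after {0}s and {1} retries".format(elapsed, retry_count)
            )
        sleep(period)
        elapsed += period
```

With a condition that never becomes true and `timeout=120`, the sketch gives up with the same text as the failure above, which matches the reading that the progress event never reached completion within the 120 s window.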

Also seen in a main wip branch:
/a/lflores-2023-01-27_15:39:50-rados-wip-lflores-testing-2023-01-26-2227-distro-default-smithi/7141897

Actions #1

Updated by Laura Flores about 1 year ago

Seen in the mgr logs: 2 pgs stuck in recovery

{
    "PG_DEGRADED": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "Degraded data redundancy: 2 pgs undersized",
            "count": 2
        },
        "detail": [
            {
                "message": "pg 7.3 is stuck undersized for 2m, current state active+recovering+undersized+remapped, last acting [2]" 
            },
            {
                "message": "pg 7.b is stuck undersized for 2m, current state active+recovering+undersized+remapped, last acting [2]" 
            }
        ]
    }
}
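
When triaging similar runs, the stuck PGs can be pulled out of a health blob like the one above with a short script. This is a rough sketch: the regular expression is inferred only from the two messages quoted here, and the `stuck_pgs` function name is hypothetical, so other message variants may not match.

```python
import json
import re

# Health detail copied from the comment above.
HEALTH = json.loads("""
{
    "PG_DEGRADED": {
        "severity": "HEALTH_WARN",
        "summary": {
            "message": "Degraded data redundancy: 2 pgs undersized",
            "count": 2
        },
        "detail": [
            {"message": "pg 7.3 is stuck undersized for 2m, current state active+recovering+undersized+remapped, last acting [2]"},
            {"message": "pg 7.b is stuck undersized for 2m, current state active+recovering+undersized+remapped, last acting [2]"}
        ]
    }
}
""")

# Pattern inferred from the two messages above (an assumption, not a Ceph API).
PG_LINE = re.compile(
    r"pg (?P<pgid>\S+) is stuck (?P<reason>\w+) for (?P<duration>[^,]+), "
    r"current state (?P<state>[^,]+), last acting \[(?P<acting>[^\]]*)\]"
)


def stuck_pgs(health):
    """Return one dict per stuck PG found in any health check's detail list."""
    found = []
    for check in health.values():
        for item in check.get("detail", []):
            match = PG_LINE.search(item["message"])
            if match is None:
                continue
            acting = [int(x) for x in match.group("acting").split(",") if x.strip()]
            found.append({
                "pgid": match.group("pgid"),
                "reason": match.group("reason"),
                "states": match.group("state").split("+"),
                "acting": acting,
            })
    return found


for pg in stuck_pgs(HEALTH):
    print(pg["pgid"], pg["states"], pg["acting"])
```

For the blob above, both PGs report `recovering` and `undersized` with a single acting OSD, osd.2.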

Actions #2

Updated by Laura Flores about 1 year ago

  • Project changed from mgr to RADOS

Actions #3

Updated by Radoslaw Zarzynski about 1 year ago

  • Assignee set to Kamoltat (Junior) Sirivadhna

Hi Junior! Would you be able to find some time for it?
