Bug #64873

mgr/dashboard: "Health check failed: Degraded data redundancy: 2/6 objects degraded (33.333%), 1 pg degraded (PG_DEGRADED)" in cluster log

Added by Sridhar Seshasayee about 2 months ago. Updated about 1 month ago.

Status: New
Priority: Normal
Assignee: -
Category: Component - Orchestrator
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Description of problem

/a/yuriw-2024-03-08_16:20:46-rados-wip-yuri4-testing-2024-03-05-0854-distro-default-smithi/7587933

Description: rados/dashboard/{0-single-container-host debug/mgr mon_election/classic random-objectstore$/{bluestore-comp-zstd} tasks/e2e}

Actual results

The warning is generated during the 04-osds.e2e-spec.ts test, specifically while the cluster is being brought up.
During this phase PGs are degraded, so the cluster warning is expected.

Logs:

2024-03-10T03:56:36.510 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:36 smithi019 ceph-mon[29456]: Deploying daemon osd.3 on smithi113
2024-03-10T03:56:37.509 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:37 smithi019 ceph-mon[29456]: pgmap v364: 1 pgs: 1 active+clean; 577 KiB data, 80 MiB used, 268 GiB / 268 GiB avail
2024-03-10T03:56:39.510 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:39 smithi019 ceph-mon[29456]: pgmap v365: 1 pgs: 1 active+clean; 577 KiB data, 80 MiB used, 268 GiB / 268 GiB avail
2024-03-10T03:56:41.509 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:41 smithi019 ceph-mon[29456]: pgmap v366: 1 pgs: 1 active+clean; 577 KiB data, 80 MiB used, 268 GiB / 268 GiB avail
2024-03-10T03:56:43.009 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:42 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a'
2024-03-10T03:56:43.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:42 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a'
2024-03-10T03:56:43.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:42 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a' cmd=[{"prefix": "auth get", "entity": "osd.4"}]: dispatch
2024-03-10T03:56:43.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:42 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a' cmd=[{"prefix": "config generate-minimal-conf"}]: dispatch
2024-03-10T03:56:43.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:42 smithi019 ceph-mon[29456]: Deploying daemon osd.4 on smithi113
2024-03-10T03:56:44.009 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:43 smithi019 ceph-mon[29456]: pgmap v367: 1 pgs: 1 active+clean; 577 KiB data, 80 MiB used, 268 GiB / 268 GiB avail
2024-03-10T03:56:46.009 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:45 smithi019 ceph-mon[29456]: pgmap v368: 1 pgs: 1 active+clean; 577 KiB data, 80 MiB used, 268 GiB / 268 GiB avail
2024-03-10T03:56:46.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:45 smithi019 ceph-mon[29456]: from='osd.3 [v2:172.21.15.113:6800/286344501,v1:172.21.15.113:6801/286344501]' entity='osd.3' cmd=[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["3"]}]: dispatch
2024-03-10T03:56:47.009 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: from='osd.3 [v2:172.21.15.113:6800/286344501,v1:172.21.15.113:6801/286344501]' entity='osd.3' cmd='[{"prefix": "osd crush set-device-class", "class": "hdd", "ids": ["3"]}]': finished
2024-03-10T03:56:47.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: osdmap e25: 6 total, 3 up, 6 in
2024-03-10T03:56:47.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: from='osd.3 [v2:172.21.15.113:6800/286344501,v1:172.21.15.113:6801/286344501]' entity='osd.3' cmd=[{"prefix": "osd crush create-or-move", "id": 3, "weight":0.0146, "args": ["host=smithi113", "root=default"]}]: dispatch
2024-03-10T03:56:47.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a' cmd=[{"prefix": "osd metadata", "id": 3}]: dispatch
2024-03-10T03:56:47.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a' cmd=[{"prefix": "osd metadata", "id": 4}]: dispatch
2024-03-10T03:56:47.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a' cmd=[{"prefix": "osd metadata", "id": 5}]: dispatch
2024-03-10T03:56:47.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: from='osd.3 [v2:172.21.15.113:6800/286344501,v1:172.21.15.113:6801/286344501]' entity='osd.3' cmd='[{"prefix": "osd crush create-or-move", "id": 3, "weight":0.0146, "args": ["host=smithi113", "root=default"]}]': finished
2024-03-10T03:56:47.010 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:46 smithi019 ceph-mon[29456]: osdmap e26: 6 total, 3 up, 6 in

...

2024-03-10T03:57:00.260 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:56:59 smithi019 ceph-mon[29456]: Health check failed: Degraded data redundancy: 2/6 objects degraded (33.333%), 1 pg degraded (PG_DEGRADED)

...

2024-03-10T03:57:03.760 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:57:03 smithi019 ceph-mon[29456]: Health check cleared: PG_DEGRADED (was: Degraded data redundancy: 2/6 objects degraded (33.333%), 1 pg degraded)
2024-03-10T03:57:03.760 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:57:03 smithi019 ceph-mon[29456]: Cluster is now healthy

...

2024-03-10T03:58:14.759 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:58:14 smithi019 ceph-mon[29456]: osd.3 [v2:172.21.15.113:6800/286344501,v1:172.21.15.113:6801/286344501] boot
2024-03-10T03:58:14.759 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:58:14 smithi019 ceph-mon[29456]: osdmap e37: 6 total, 6 up, 6 in
2024-03-10T03:58:14.760 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:58:14 smithi019 ceph-mon[29456]: from='mgr.14152 172.21.15.19:0/13991753' entity='mgr.a' cmd=[{"prefix": "osd metadata", "id": 3}]: dispatch
2024-03-10T03:58:15.759 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:58:15 smithi019 ceph-mon[29456]: pgmap v426: 1 pgs: 1 active+clean; 577 KiB data, 411 MiB used, 313 GiB / 313 GiB avail
2024-03-10T03:58:15.759 INFO:journalctl@ceph.mon.a.smithi019.stdout:Mar 10 03:58:15 smithi019 ceph-mon[29456]: osdmap e38: 6 total, 6 up, 6 in

The warning should probably be added to the ignorelist for this suite.
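
For reference, a minimal sketch of what such an entry could look like as a teuthology suite override; the exact yaml file under qa/suites/rados/dashboard/ that should carry it is an assumption:

overrides:
  ceph:
    # Hypothetical placement (e.g. the dashboard e2e task yaml, path assumed);
    # this tolerates the transient PG_DEGRADED warning seen while OSDs are
    # being added during cluster bringup.
    log-ignorelist:
      - \(PG_DEGRADED\)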

#1 Updated by Aishwarya Mathuria about 1 month ago

/a/yuriw-2024-03-19_00:09:45-rados-wip-yuri5-testing-2024-03-18-1144-distro-default-smithi/7610095
