Bug #53138
cluster [WRN] Health check failed: Degraded data redundancy: 3/1164 objects degraded (0.258%) seen in rbd
Status:
Triaged
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%
Source:
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):
Description
2021-11-02T14:46:34.713 INFO:tasks.ceph:Scrubbing osd.0
2021-11-02T14:46:34.714 DEBUG:teuthology.orchestra.run.smithi008:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell osd.0 config set osd_debug_deep_scrub_sleep 0
2021-11-02T14:46:34.872 INFO:teuthology.orchestra.run.smithi008.stdout:{
2021-11-02T14:46:34.873 INFO:teuthology.orchestra.run.smithi008.stdout: "success": "osd_debug_deep_scrub_sleep = '0.000000' (not observed, change may require restart) "
2021-11-02T14:46:34.873 INFO:teuthology.orchestra.run.smithi008.stdout:}
2021-11-02T14:46:34.885 DEBUG:teuthology.orchestra.run.smithi008:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd deep-scrub 0
2021-11-02T14:46:35.188 INFO:teuthology.orchestra.run.smithi008.stderr:instructed osd(s) 0 to deep-scrub
2021-11-02T14:46:35.199 INFO:tasks.ceph:Scrubbing osd.1
2021-11-02T14:46:35.199 DEBUG:teuthology.orchestra.run.smithi008:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph tell osd.1 config set osd_debug_deep_scrub_sleep 0
2021-11-02T14:46:35.343 INFO:teuthology.orchestra.run.smithi008.stdout:{
2021-11-02T14:46:35.343 INFO:teuthology.orchestra.run.smithi008.stdout: "success": "osd_debug_deep_scrub_sleep = '0.000000' (not observed, change may require restart) "
2021-11-02T14:46:35.343 INFO:teuthology.orchestra.run.smithi008.stdout:}
2021-11-02T14:46:35.354 DEBUG:teuthology.orchestra.run.smithi008:> sudo adjust-ulimits ceph-coverage /home/ubuntu/cephtest/archive/coverage timeout 120 ceph --cluster ceph osd deep-scrub 1
2021-11-02T14:46:35.654 INFO:teuthology.orchestra.run.smithi008.stderr:instructed osd(s) 1 to deep-scrub
2021-11-02T14:46:35.664 INFO:tasks.ceph:Scrubbing osd.2
failure_reason: '"2021-11-02T14:37:56.579360+0000 mon.a (mon.0) 879 : cluster [WRN] Health check failed: Degraded data redundancy: 3/1164 objects degraded (0.258%), 3 pgs degraded (PG_DEGRADED)" in cluster log'
flavor: default
Although the RBD tests themselves finish fine, this warning appears while the test run is being wound down.
/ceph/teuthology-archive/ideepika-2021-11-02_12:33:30-rbd-wip-ssd-cache-testing-distro-basic-smithi/6477559/teuthology.log
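The archived teuthology log can be scanned for the offending health warnings with a small sketch like the one below. The extraction pattern is an assumption based on the warning format quoted in this ticket; the sample file here is created inline so the sketch is self-contained, but in practice you would point it at the real `teuthology.log` from the archive path above.

```shell
# Hypothetical helper: pull the "N/M objects degraded (P%)" figures out of
# every degraded-redundancy health warning in a teuthology log.
log=teuthology.log

# Sample line in the format seen in this run (replace with the real log).
cat > "$log" <<'EOF'
2021-11-02T14:37:56.579360+0000 mon.a (mon.0) 879 : cluster [WRN] Health check failed: Degraded data redundancy: 3/1164 objects degraded (0.258%), 3 pgs degraded (PG_DEGRADED)
EOF

# -o prints only the matched fragment; parentheses are literal in BRE.
grep -o '[0-9]*/[0-9]* objects degraded ([0-9.]*%)' "$log"
# → 3/1164 objects degraded (0.258%)
```

Running this across several archived runs would give a quick view of how often, and how badly, the warning fires.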
History
#1 Updated by Neha Ojha over 2 years ago
- Status changed from New to Triaged
This warning comes up because there are PGs recovering, probably because the test is injecting failures - we can ignore such warnings.
2021-11-02T14:37:55.923243+0000 mgr.x (mgr.14099) 334 : cluster [DBG] pgmap v865: 41 pgs: 1 active+recovering+undersized+remapped, 3 active+recovering+undersized+degraded+remapped, 37 active+clean; 454 MiB data, 1.4 GiB used, 719 GiB / 720 GiB avail; 22 MiB/s rd, 35 MiB/s wr, 1.48k op/s; 3/1164 objects degraded (0.258%); 22/1164 objects misplaced (1.890%); 2.6 MiB/s, 4 keys/s, 6 objects/s recovering
2021-11-02T14:37:57.569993+0000 mon.a (mon.0) 880 : cluster [DBG] osdmap e542: 8 total, 8 up, 8 in
2021-11-02T14:37:57.923775+0000 mgr.x (mgr.14099) 335 : cluster [DBG] pgmap v867: 41 pgs: 1 active+recovering+undersized+remapped, 3 active+recovering+undersized+degraded+remapped, 37 active+clean; 454 MiB data, 1.4 GiB used, 719 GiB / 720 GiB avail; 734 KiB/s rd, 6.7 MiB/s wr, 557 op/s; 3/1164 objects degraded (0.258%); 22/1164 objects misplaced (1.890%); 2.2 MiB/s, 4 keys/s, 5 objects/s recovering
2021-11-02T14:37:58.570932+0000 mon.a (mon.0) 881 : cluster [DBG] osdmap e543: 8 total, 8 up, 8 in
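When such a warning is expected because the suite deliberately injects failures, Ceph QA jobs typically suppress it by adding the health-check name to the job's log ignore list. A sketch of such an override fragment (the exact placement and surrounding keys are an assumption; this is illustrative, not a change made for this ticket):

```yaml
# Hypothetical teuthology job override: treat PG_DEGRADED warnings in the
# cluster log as expected rather than as a test failure.
overrides:
  ceph:
    log-ignorelist:
      - \(PG_DEGRADED\)
      - objects degraded
```

Entries are regular expressions matched against cluster log lines, which is why the parentheses around the health-check code are escaped.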
#2 Updated by Deepika Upadhyay over 2 years ago
@Neha I am seeing these failures more often than usual; maybe we have a performance regression. If not, can we increase the timeout?
#3 Updated by Deepika Upadhyay over 2 years ago
- Priority changed from Normal to High
#4 Updated by Neha Ojha about 2 years ago
- Priority changed from High to Normal