Bug #58797
scrub/osd-scrub-dump.sh: TEST_recover_unexpected fails from "ERROR: Unexpectedly low amount of scrub reservations seen during test"
Description
/a/yuriw-2023-02-17_20:31:15-rados-main-distro-default-smithi/7179124
2023-02-18T00:15:10.798 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:151: TEST_recover_unexpected: sleep 2
2023-02-18T00:15:12.800 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:145: TEST_recover_unexpected: for i in $(seq 0 5)
2023-02-18T00:15:12.801 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:147: TEST_recover_unexpected: ceph pg dump pgs
2023-02-18T00:15:12.801 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:147: TEST_recover_unexpected: grep +scrubbing
2023-02-18T00:15:13.110 INFO:tasks.workunit.client.0.smithi156.stderr:dumped pgs
2023-02-18T00:15:13.123 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:151: TEST_recover_unexpected: sleep 2
2023-02-18T00:15:15.124 INFO:tasks.workunit.client.0.smithi156.stdout:41 total reservations seen
2023-02-18T00:15:15.125 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:154: TEST_recover_unexpected: echo 41 total reservations seen
2023-02-18T00:15:15.125 INFO:tasks.workunit.client.0.smithi156.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:159: TEST_recover_unexpected: expr 16 '*' 3 '*' 3
2023-02-18T00:15:15.127 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:159: TEST_recover_unexpected: actual_reservations=144
2023-02-18T00:15:15.127 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:160: TEST_recover_unexpected: '[' 41 -lt 144 ']'
2023-02-18T00:15:15.127 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:161: TEST_recover_unexpected: echo 'ERROR: Unexpectedly low amount of scrub reservations seen during test'
2023-02-18T00:15:15.128 INFO:tasks.workunit.client.0.smithi156.stdout:ERROR: Unexpectedly low amount of scrub reservations seen during test
2023-02-18T00:15:15.128 INFO:tasks.workunit.client.0.smithi156.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:162: TEST_recover_unexpected: return 1
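For readers who do not have the script at hand, the check that fails in the trace above (script lines 154-162) can be sketched roughly as follows. This is reconstructed only from the xtrace output, not copied from osd-scrub-dump.sh. The function name and the parameter names `a`, `b`, `c` are hypothetical: the trace shows just the literals 16, 3 and 3, not what each factor stands for.

```shell
# Sketch of the threshold check seen in the trace; NOT the actual test code.
# check_reservations and the names a, b, c are hypothetical. The trace only
# shows `expr 16 '*' 3 '*' 3` and the comparison `[ 41 -lt 144 ]`.
check_reservations() {
    local total=$1              # reservations counted while sampling
    local a=$2 b=$3 c=$4        # the three factors of the expected minimum
    local expected=$(( a * b * c ))

    echo "$total total reservations seen"
    if [ "$total" -lt "$expected" ]; then
        echo 'ERROR: Unexpectedly low amount of scrub reservations seen during test'
        return 1
    fi
    return 0
}
```

With the values from this run, `check_reservations 41 16 3 3` compares 41 against 16 * 3 * 3 = 144, prints the error and returns 1.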
Updated by Laura Flores about 1 year ago
Also seen in a testing wip, but none of the PRs in the batch have been merged yet:
/a/lflores-2023-02-17_17:48:50-rados-wip-yuri10-testing-2023-02-15-1245-distro-default-smithi/7178890
2023-02-17T18:34:04.658 INFO:tasks.workunit.client.0.smithi155.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:159: TEST_recover_unexpected: expr 16 '*' 3 '*' 3
2023-02-17T18:34:04.659 INFO:tasks.workunit.client.0.smithi155.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:159: TEST_recover_unexpected: actual_reservations=144
2023-02-17T18:34:04.659 INFO:tasks.workunit.client.0.smithi155.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:160: TEST_recover_unexpected: '[' 64 -lt 144 ']'
2023-02-17T18:34:04.659 INFO:tasks.workunit.client.0.smithi155.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:161: TEST_recover_unexpected: echo 'ERROR: Unexpectedly low amount of scrub reservations seen during test'
2023-02-17T18:34:04.659 INFO:tasks.workunit.client.0.smithi155.stdout:ERROR: Unexpectedly low amount of scrub reservations seen during test
2023-02-17T18:34:04.660 INFO:tasks.workunit.client.0.smithi155.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:162: TEST_recover_unexpected: return 1
/a/yuriw-2023-02-15_23:04:45-rados-wip-yuri10-testing-2023-02-15-1245-distro-default-smithi/7175430
2023-02-15T23:48:45.954 INFO:tasks.workunit.client.0.smithi016.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:154: TEST_recover_unexpected: echo 37 total reservations seen
2023-02-15T23:48:45.954 INFO:tasks.workunit.client.0.smithi016.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:159: TEST_recover_unexpected: expr 16 '*' 3 '*' 3
2023-02-15T23:48:45.955 INFO:tasks.workunit.client.0.smithi016.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:159: TEST_recover_unexpected: actual_reservations=144
2023-02-15T23:48:45.955 INFO:tasks.workunit.client.0.smithi016.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:160: TEST_recover_unexpected: '[' 37 -lt 144 ']'
2023-02-15T23:48:45.955 INFO:tasks.workunit.client.0.smithi016.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:161: TEST_recover_unexpected: echo 'ERROR: Unexpectedly low amount of scrub reservations seen during test'
2023-02-15T23:48:45.956 INFO:tasks.workunit.client.0.smithi016.stdout:ERROR: Unexpectedly low amount of scrub reservations seen during test
2023-02-15T23:48:45.957 INFO:tasks.workunit.client.0.smithi016.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:162: TEST_recover_unexpected: return 1
Updated by Ronen Friedman about 1 year ago
- Status changed from New to In Progress
- Assignee set to Ronen Friedman
This is an unintended side effect of https://github.com/ceph/ceph/pull/44749. I will create a fix.
Explanation:
The failing test is far from robust: the cluster is instructed to scrub all of its PGs while
the test continuously samples the number of active scrub reservations. The timing is set so
that the test has, more or less, a chance to query each OSD at least once during each scrub.
And yes, that's not a great idea.
The test sets configuration parameters so that the scrubs take long enough to be sampled.
But PR#44749 changed that set of parameters, and as a result the scrubs now complete much too fast.
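To illustrate why shorter scrubs lower the sampled count, here is a toy model. It is an illustration under simplifying assumptions, not the logic of the real test: a scrub is assumed to hold its reservation for a fixed window, and the sampler is assumed to poll at a fixed period. The function name `samples_seen` and all of its parameters are made up for this sketch.

```shell
# Toy model, NOT the real test: a scrub holds its reservation during
# [start, start + duration) and the sampler polls at t = 0, period, 2*period, ...
# Prints how many polls fall inside the reservation window.
# All times are whole seconds; period must be greater than 0.
samples_seen() {
    local start=$1 duration=$2 period=$3 horizon=$4
    local t=0 seen=0
    while [ "$t" -lt "$horizon" ]; do
        if [ "$t" -ge "$start" ] && [ "$t" -lt $(( start + duration )) ]; then
            seen=$(( seen + 1 ))
        fi
        t=$(( t + period ))
    done
    echo "$seen"
}
```

In this model, with a 2-second polling period (chosen for illustration), a scrub that starts at t=1 and lasts 5 seconds is observed twice (`samples_seen 1 5 2 20`), while one that lasts only 1 second falls between two polls and is never observed (`samples_seen 1 1 2 20`). Once scrubs become shorter than the polling period, some of them go uncounted and the total can drop below the expected minimum.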
There are two possible fixes:
- a one-line change to the test configuration, or
- removing the specific check altogether, as it has very limited importance.
Update: I have implemented the first option.
Updated by Ronen Friedman about 1 year ago
- Status changed from In Progress to Fix Under Review
- Pull request ID set to 50236
Updated by Laura Flores about 1 year ago
- Status changed from Fix Under Review to Resolved
Updated by Laura Flores about 1 year ago
/a/lflores-2023-02-20_21:22:20-rados-wip-yuri-testing-2023-02-16-0839-distro-default-smithi/7181477
2023-02-22T06:30:47.137 INFO:tasks.workunit.client.0.smithi114.stderr:dumped pgs
2023-02-22T06:30:47.146 INFO:tasks.workunit.client.0.smithi114.stdout:2.b 65 0 0 0 0 532480 0 0 65 0 65 active+clean+scrubbing 2023-02-22T06:30:43.758219+0000 44'65 47:98 [3,2,5] 3 [3,2,5] 3 0'0 2023-02-15T06:27:56.060672+0000 0'0 2023-02-22T06:22:25.705988+0000 0 0 scrubbing for 0s 0 0
2023-02-22T06:30:47.146 INFO:tasks.workunit.client.0.smithi114.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:148: TEST_recover_unexpected: echo 'ERROR: Extra scrubs after test completion...not expected'
2023-02-22T06:30:47.147 INFO:tasks.workunit.client.0.smithi114.stdout:ERROR: Extra scrubs after test completion...not expected
2023-02-22T06:30:47.148 INFO:tasks.workunit.client.0.smithi114.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/scrub/osd-scrub-dump.sh:149: TEST_recover_unexpected: return 1
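This second failure comes from a different check in the same test. Judging only from the xtrace lines in this ticket (script lines 145-151: a `for i in $(seq 0 5)` loop, `ceph pg dump pgs` piped to `grep +scrubbing`, and `sleep 2`), it can be sketched roughly as below. The function name and the `dump_cmd` parameter are hypothetical; `dump_cmd` is added here only so the sketch can be run without a cluster, whereas the real script calls `ceph pg dump pgs` directly.

```shell
# Sketch reconstructed from the xtrace; NOT the actual test code.
# check_no_extra_scrubs and dump_cmd are hypothetical names. dump_cmd must be
# a command or function that prints one PG per line.
check_no_extra_scrubs() {
    local dump_cmd=$1
    local rounds=${2:-6}    # the trace shows `seq 0 5`, i.e. six rounds
    local pause=${3:-2}     # the trace shows `sleep 2` between rounds
    local i
    for i in $(seq 1 "$rounds"); do
        # grep prints the offending PG line, as seen in the log above
        if $dump_cmd | grep '+scrubbing'; then
            echo 'ERROR: Extra scrubs after test completion...not expected'
            return 1
        fi
        sleep "$pause"
    done
    return 0
}
```

Against a live cluster, a wrapper function that runs `ceph pg dump pgs` could be passed as `dump_cmd`. In the run above, PG 2.b was still in `active+clean+scrubbing`, so a check of this shape fails on the first matching round.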
Updated by Laura Flores 12 months ago
- Status changed from Resolved to Pending Backport
- Backport set to reef
/a/yuriw-2023-04-27_14:24:15-rados-wip-yuri6-testing-2023-04-26-1247-reef-distro-default-smithi/7255773
Updated by Backport Bot 12 months ago
- Copied to Backport #59637: reef: scrub/osd-scrub-dump.sh: TEST_recover_unexpected fails from "ERROR: Unexpectedly low amount of scrub reservations seen during test" added