Bug #46405
osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1
Status: Closed
Description
2020-07-07T23:54:01.124 INFO:tasks.workunit.client.0.smithi022.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:227: TEST_rados_repair_warning: ceph pg 2.0 query
2020-07-07T23:54:01.124 INFO:tasks.workunit.client.0.smithi022.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:227: TEST_rados_repair_warning: jq .info.stats.stat_sum.num_objects_repaired
2020-07-07T23:54:01.260 INFO:tasks.workunit.client.0.smithi022.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:227: TEST_rados_repair_warning: COUNT=21
2020-07-07T23:54:01.260 INFO:tasks.workunit.client.0.smithi022.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:228: TEST_rados_repair_warning: expr 11 '*' 2
2020-07-07T23:54:01.262 INFO:tasks.workunit.client.0.smithi022.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:228: TEST_rados_repair_warning: test 21 = 22
2020-07-07T23:54:01.262 INFO:tasks.workunit.client.0.smithi022.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:228: TEST_rados_repair_warning: return 1
2020-07-07T23:54:01.262 INFO:tasks.workunit.client.0.smithi022.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:42: run: return 1
/a/nojha-2020-07-07_21:05:58-rados:standalone-master-distro-basic-smithi/5207209
/a/nojha-2020-07-07_21:05:58-rados:standalone-master-distro-basic-smithi/5207210
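The failing check at script line 228 compares `num_objects_repaired` from `pg query` against `expr 11 '*' 2`; the run above got 21 where 22 was expected. A standalone rendering of that comparison, reconstructed from the trace (the multiplier of 2 presumably reflects each object being repaired on two replicas, but that is inferred from the trace, not taken from the script):

```shell
#!/usr/bin/env bash

# In the real test, COUNT comes from:
#   ceph pg 2.0 query | jq .info.stats.stat_sum.num_objects_repaired
COUNT=21                      # value observed in the failing run
EXPECTED=$(expr 11 '*' 2)     # 22

# This mirrors the check at osd-rep-recov-eio.sh:228 that returned 1.
if test "$COUNT" = "$EXPECTED"; then result=pass; else result=fail; fi
echo "$result: got $COUNT, wanted $EXPECTED"
```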
Updated by Neha Ojha almost 4 years ago
- Related to Feature #41564: Issue health status warning if num_shards_repaired exceeds some threshold added
Updated by Neha Ojha almost 4 years ago
- Priority changed from Normal to High
Updated by Neha Ojha almost 4 years ago
- Backport set to nautilus,octopus
Setting backports since the original feature is being backported to nautilus and octopus.
/a/yuriw-2020-07-06_17:23:10-rados-wip-yuri8-testing-2020-07-01-2358-octopus-distro-basic-smithi/5203825
Updated by Neha Ojha almost 4 years ago
- Priority changed from High to Urgent
/a/yuriw-2020-07-13_23:06:23-rados-wip-yuri5-testing-2020-07-13-1944-octopus-distro-basic-smithi/5224649
Updated by David Zafman almost 4 years ago
I'm not seeing this on my build machine using run-standalone.sh
Updated by Kefu Chai almost 4 years ago
/a/kchai-2020-07-27_15:50:48-rados-wip-kefu-testing-2020-07-27-2127-distro-basic-smithi/5261869
Updated by Brad Hubbard almost 4 years ago
/a/yuriw-2020-08-06_00:31:28-rados-wip-yuri8-testing-octopus-distro-basic-smithi/5291111
Updated by Brad Hubbard over 3 years ago
/a/yuriw-2020-08-27_00:49:53-rados-wip-yuri8-testing-2020-08-26-2329-octopus-distro-basic-smithi/5379176/
Updated by Brad Hubbard over 3 years ago
- Assignee set to Brad Hubbard
Here's the actual problem, I think. Working on a fix.
2020-08-27T07:07:31.848 INFO:tasks.workunit.client.0.smithi098.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:188: TEST_rados_repair_warning: local obj-base=obj-warn-
2020-08-27T07:07:31.849 INFO:tasks.workunit.client.0.smithi098.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh: line 188: local: `obj-base=obj-warn-': not a valid identifier
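The "not a valid identifier" error is bash rejecting the hyphen in the variable name: `local` (like `declare` and plain assignment) only accepts names matching `[A-Za-z_][A-Za-z0-9_]*`. A minimal reproduction, using hypothetical function names rather than the test script's:

```shell
#!/usr/bin/env bash

broken() {
    # Runtime error: "local: `obj-base=obj-warn-': not a valid identifier",
    # because a hyphen is not allowed in a bash variable NAME.
    local obj-base=obj-warn-
}

fixed() {
    # Fine: the hyphens are only in the VALUE, not the name.
    local objbase=obj-warn-
    echo "$objbase"
}

if broken 2>/dev/null; then status=accepted; else status=rejected; fi
value=$(fixed)
echo "$status $value"
```

Note that this is a runtime failure of the `local` builtin (which returns non-zero), not a parse error, which is why the test function carried on and only failed later.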
Updated by Kefu Chai over 3 years ago
2020-09-10T21:46:03.940 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:172: TEST_rados_get_with_eio: rados_get_data eio td/osd-rep-recov-eio.sh ...
2020-09-10T21:46:16.890 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:143: rados_get_data: rados_get td/osd-rep-recov-eio.sh pool-rep obj-eio-859439
2020-09-10T21:46:16.891 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:80: rados_get: local dir=td/osd-rep-recov-eio.sh
2020-09-10T21:46:16.891 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:81: rados_get: local poolname=pool-rep
2020-09-10T21:46:16.891 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:82: rados_get: local objname=obj-eio-859439
2020-09-10T21:46:16.891 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:83: rados_get: local expect=ok
2020-09-10T21:46:16.891 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:88: rados_get: '[' ok = fail ']'
2020-09-10T21:46:16.891 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:96: rados_get: '[' ok = hang ']'
2020-09-10T21:46:16.892 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:105: rados_get: rados --pool pool-rep get obj-eio-859439 td/osd-rep-recov-eio.sh/COPY
2020-09-10T21:46:17.017 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:106: rados_get: diff td/osd-rep-recov-eio.sh/ORIGINAL td/osd-rep-recov-eio.sh/COPY
2020-09-10T21:46:17.017 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:107: rados_get: rm td/osd-rep-recov-eio.sh/COPY
2020-09-10T21:46:17.019 INFO:tasks.workunit.client.0.smithi007.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:145: rados_get_data: ceph pg 2.0 query
2020-09-10T21:46:17.019 INFO:tasks.workunit.client.0.smithi007.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:145: rados_get_data: jq .info.stats.stat_sum.num_objects_repaired
2020-09-10T21:46:17.154 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:145: rados_get_data: COUNT=2
2020-09-10T21:46:17.155 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:146: rados_get_data: test 2 = 3
2020-09-10T21:46:17.155 INFO:tasks.workunit.client.0.smithi007.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:146: rados_get_data: return 1
/a/kchai-2020-09-10_16:44:13-rados-wip-kefu-testing-2020-09-10-1633-distro-basic-smithi/5421813/teuthology.log
Updated by Brad Hubbard over 3 years ago
Kefu,
/a/kchai-2020-09-10_16:44:13-rados-wip-kefu-testing-2020-09-10-1633-distro-basic-smithi/5421813/teuthology.log may be a different problem since it's happening in TEST_rados_get_with_eio (earlier than TEST_rados_repair_warning) and not showing the 'not a valid identifier' message.
Updated by Kefu Chai over 3 years ago
Brad, thanks. Will create a separate ticket.
Updated by Neha Ojha over 3 years ago
2020-09-22T19:40:43.835 INFO:tasks.workunit.client.0.smithi134.stderr://home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:228: TEST_rados_repair_warning: expr 11 '*' 2
2020-09-22T19:40:43.837 INFO:tasks.workunit.client.0.smithi134.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:228: TEST_rados_repair_warning: test 21 = 22
2020-09-22T19:40:43.837 INFO:tasks.workunit.client.0.smithi134.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:228: TEST_rados_repair_warning: return 1
2020-09-22T19:40:43.837 INFO:tasks.workunit.client.0.smithi134.stderr:/home/ubuntu/cephtest/clone.client.0/qa/standalone/osd/osd-rep-recov-eio.sh:42: run: return 1
/a/teuthology-2020-09-22_07:01:02-rados-master-distro-basic-smithi/5458830
Updated by Neha Ojha over 3 years ago
/a/teuthology-2020-09-25_07:01:01-rados-master-distro-basic-smithi/5466817
Updated by David Zafman over 3 years ago
This change fixes the odd object names in the subtest, but it shouldn't help fix this problem. On my build machine, using run-standalone.sh, the subtest passes both with and without the change below. Could we need a short sleep before the query, to let things update, for all test cases?
$ git diff
diff --git a/qa/standalone/osd/osd-rep-recov-eio.sh b/qa/standalone/osd/osd-rep-recov-eio.sh
index 613bfc316f7..6929e580d9f 100755
--- a/qa/standalone/osd/osd-rep-recov-eio.sh
+++ b/qa/standalone/osd/osd-rep-recov-eio.sh
@@ -185,7 +185,7 @@ function TEST_rados_repair_warning() {
     wait_for_clean || return 1

     local poolname=pool-rep
-    local obj-base=obj-warn-
+    local objbase=obj-warn
     local inject=eio

     for i in $(seq 1 $OBJS)
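If the remaining failures are a race between repair finishing and `pg query` reflecting the updated `num_objects_repaired`, a bounded retry loop is usually more robust than a fixed sleep. A sketch of such a helper (the name `wait_for_value` is hypothetical, not from the Ceph qa helpers, and this is not the actual fix in PR 37483):

```shell
#!/usr/bin/env bash

# Poll a command until its stdout equals the expected value, or give up.
# Usage: wait_for_value <expected> <max_tries> <command...>
wait_for_value() {
    local expected=$1 max_tries=$2
    shift 2
    local i actual
    for ((i = 0; i < max_tries; i++)); do
        actual=$("$@")
        if [ "$actual" = "$expected" ]; then
            return 0
        fi
        sleep 1
    done
    echo "timed out: wanted $expected, last saw $actual" >&2
    return 1
}
```

In the test this might be used as, e.g., `wait_for_value 22 30 sh -c 'ceph pg 2.0 query | jq .info.stats.stat_sum.num_objects_repaired'`, replacing the one-shot `COUNT=... ; test ...` check.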
Updated by Neha Ojha over 3 years ago
/a/teuthology-2020-09-29_07:01:02-rados-master-distro-basic-smithi/5480928
Updated by David Zafman over 3 years ago
- Status changed from New to In Progress
- Assignee changed from Brad Hubbard to David Zafman
- Pull request ID set to 37483
Updated by Neha Ojha over 3 years ago
/a/teuthology-2020-09-30_07:01:02-rados-master-distro-basic-smithi/5483631
Updated by Neha Ojha over 3 years ago
- Status changed from In Progress to Fix Under Review
Updated by Neha Ojha over 3 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Nathan Cutler over 3 years ago
- Copied to Backport #47825: nautilus: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1 added
Updated by Nathan Cutler over 3 years ago
- Copied to Backport #47826: octopus: osd/osd-rep-recov-eio.sh: TEST_rados_repair_warning: return 1 added
Updated by David Zafman over 3 years ago
- Status changed from Pending Backport to Resolved