Bug #49727
lazy_omap_stats_test: "ceph osd deep-scrub all" hangs
Description
This has been seen in cases where all of pool 1's PGs are scrubbed and none of pool 2's. I suggest that this is because the mgr that handles the scrub request doesn't have an up-to-date pgmap. The test could add a short delay at any point before issuing the scrub request.
2021-01-23T13:53:18.420 INFO:teuthology.orchestra.run.gibba021.stdout:Wrote 2000 omap keys of 445 bytes to the 350005e6-6ddd-44a6-950d-db89fed4a6c2 object
2021-01-23T13:53:18.433 INFO:teuthology.orchestra.run.gibba021.stdout:Wrote 2000 omap keys of 445 bytes to the b949555d-45df-48f4-ab5c-feb8f41221cd object
2021-01-23T13:53:18.434 INFO:teuthology.orchestra.run.gibba021.stdout:Scrubbing
2021-01-24T01:40:30.258 DEBUG:teuthology.exit:Got signal 15; running 2 handlers...
2021-01-24T01:40:30.304 DEBUG:teuthology.task.console_log:Killing console logger for gibba021
2021-01-24T01:40:30.305 DEBUG:teuthology.task.console_log:Killing console logger for gibba021
2021-01-24T01:40:30.305 DEBUG:teuthology.exit:Finished running handlers
/a/teuthology-2021-01-23_07:01:02-rados-master-distro-basic-gibba/5819506
/a/ideepika-2021-01-22_07:01:14-rados-wip-deepika-testing-master-2021-01-22-0047-distro-basic-smithi/5814891
Related issues
History
#1 Updated by David Zafman over 2 years ago
- Copied from Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added
#2 Updated by David Zafman over 2 years ago
- Description updated (diff)
#3 Updated by David Zafman over 2 years ago
- Pull request ID deleted (39535)
#4 Updated by David Zafman over 2 years ago
Note that instead of a delay you can tell the OSDs to flush their pg stats. I wonder whether that only flushes to the mon, with the stats eventually reaching the mgr, or whether it guarantees that the mgr is up to date too.
See the qa/standalone/ceph-helpers.sh function flush_pg_stats for the bash version that waits for all the flushes.
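For illustration, the flush-and-wait idea could be sketched in Python roughly as follows. This is not the code in ceph-helpers.sh or in the test itself. It assumes that `ceph tell osd.N flush_pg_stats` prints a sequence number and that `ceph osd last-stat-seq N` reports the latest sequence seen for that OSD. The names `ceph_cli`, `CephRunner` and the `run`/`sleep` parameters are hypothetical, introduced only so the logic can be exercised without a cluster. Whether a completed flush also guarantees that the mgr's pgmap is current is the open question raised above, and this sketch does not settle it.

```python
import subprocess
import time
from typing import Callable, Dict, List

# Hypothetical type: runs a ceph CLI command (argument list) and returns stdout.
CephRunner = Callable[[List[str]], str]


def ceph_cli(args: List[str]) -> str:
    """Hypothetical default runner: invoke the real `ceph` binary."""
    result = subprocess.run(["ceph", *args], check=True,
                            capture_output=True, text=True)
    return result.stdout


def flush_pg_stats(run: CephRunner = ceph_cli, timeout: int = 300,
                   sleep: Callable[[float], None] = time.sleep) -> bool:
    """Ask every OSD to flush its PG stats and wait for the flushes to land.

    Returns True once every reported sequence number has been reached,
    False if the (shared) timeout in seconds runs out first.
    """
    # Step 1: request a flush from each OSD and record the sequence it reports.
    wanted: Dict[str, int] = {}
    for osd in run(["osd", "ls"]).split():
        seq = run(["tell", f"osd.{osd}", "flush_pg_stats"]).strip()
        if seq:  # assumption: an empty reply means there is nothing to wait for
            wanted[osd] = int(seq)

    # Step 2: poll until the last-stat-seq for each OSD catches up.
    for osd, seq in wanted.items():
        while int(run(["osd", "last-stat-seq", osd]).strip()) < seq:
            if timeout <= 0:
                return False
            sleep(1)
            timeout -= 1
    return True
```

A test could call `flush_pg_stats()` just before issuing the deep-scrub request and treat a `False` return as a setup failure rather than proceeding to scrub.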
#5 Updated by Brad Hubbard over 2 years ago
- Pull request ID set to 39980
#6 Updated by Neha Ojha about 2 years ago
- Priority changed from Urgent to Normal
Haven't seen this recently.
#7 Updated by Laura Flores about 1 year ago
- Status changed from New to Pending Backport
- Backport changed from pacific to pacific,quincy
/a/yuriw-2022-08-11_16:46:00-rados-wip-yuri3-testing-2022-08-11-0809-pacific-distro-default-smithi/6968195
2022-08-11T22:32:05.016 INFO:teuthology.orchestra.run.smithi138.stdout:{"status":"HEALTH_OK","checks":{},"mutes":[]}
2022-08-11T22:32:05.017 INFO:tasks.ceph.ceph_manager.ceph:wait_until_healthy done
2022-08-11T22:32:05.017 INFO:teuthology.run_tasks:Running task exec...
2022-08-11T22:32:05.029 INFO:teuthology.task.exec:Executing custom commands...
2022-08-11T22:32:05.030 INFO:teuthology.task.exec:Running commands on role client.0 host ubuntu@smithi138.front.sepia.ceph.com
2022-08-11T22:32:05.030 DEBUG:teuthology.orchestra.run.smithi138:> sudo TESTDIR=/home/ubuntu/cephtest bash -c ceph_test_lazy_omap_stats
2022-08-11T22:32:05.500 INFO:teuthology.orchestra.run.smithi138.stdout:pool 'lazy_omap_test_pool' created
2022-08-11T22:32:05.504 INFO:teuthology.orchestra.run.smithi138.stdout:Created payload with 2000 keys of 445 bytes each. Total size in bytes = 890000
2022-08-11T22:32:05.504 INFO:teuthology.orchestra.run.smithi138.stdout:Waiting for active+clean
2022-08-11T22:32:05.757 INFO:teuthology.orchestra.run.smithi138.stdout:.
2022-08-11T22:32:06.544 INFO:teuthology.orchestra.run.smithi138.stdout:Wrote 2000 omap keys of 445 bytes to the f7c525bd-bd86-48cb-8ed1-4e673df56515 object
2022-08-11T22:32:06.544 INFO:teuthology.orchestra.run.smithi138.stdout:Scrubbing
2022-08-12T10:22:16.704 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2022-08-12T10:22:16.743 DEBUG:teuthology.task.console_log:Killing console logger for smithi138
2022-08-12T10:22:16.745 DEBUG:teuthology.exit:Finished running handlers
#8 Updated by Backport Bot about 1 year ago
- Copied to Backport #57208: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added
#9 Updated by Backport Bot about 1 year ago
- Copied to Backport #57209: quincy: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added
#10 Updated by Backport Bot about 1 year ago
- Tags set to backport_processed
#11 Updated by Laura Flores 7 months ago
- Tags set to test-failure
- Backport changed from pacific,quincy to pacific,quincy,reef
#12 Updated by Laura Flores 7 months ago
/a/yuriw-2023-03-10_22:46:37-rados-reef-distro-default-smithi/7203287
#13 Updated by Brad Hubbard 7 months ago
Laura Flores wrote:
/a/yuriw-2023-03-10_22:46:37-rados-reef-distro-default-smithi/7203287
This one is different and I'm looking at it in https://tracker.ceph.com/issues/59058
Also working on submitting a backport for pacific so this issue won't affect that branch.
#14 Updated by Laura Flores about 1 month ago
/a/yuriw-2023-08-21_23:10:07-rados-pacific-release-distro-default-smithi/7375579