Bug #49727

lazy_omap_stats_test: "ceph osd deep-scrub all" hangs

Added by David Zafman about 3 years ago. Updated 3 months ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 100%
Source: -
Tags: backport_processed
Backport: pacific,quincy,reef
Regression: No
Severity: 3 - minor
Reviewed: -
Affected Versions: -
ceph-qa-suite: -
Component(RADOS): -
Pull request ID: -
Crash signature (v1): -
Crash signature (v2): -

Description

This has been seen in cases where all of pool 1's PGs are scrubbed and none of pool 2's. I suggest that this is because the mgr that handles the scrub request doesn't have an up-to-date pgmap. The test could add a short delay anywhere before issuing the scrub request (a way to check which PGs were actually deep scrubbed is sketched after the log excerpt below).

2021-01-23T13:53:18.420 INFO:teuthology.orchestra.run.gibba021.stdout:Wrote 2000 omap keys of 445 bytes to the 350005e6-6ddd-44a6-950d-db89fed4a6c2 object
2021-01-23T13:53:18.433 INFO:teuthology.orchestra.run.gibba021.stdout:Wrote 2000 omap keys of 445 bytes to the b949555d-45df-48f4-ab5c-feb8f41221cd object
2021-01-23T13:53:18.434 INFO:teuthology.orchestra.run.gibba021.stdout:Scrubbing
2021-01-24T01:40:30.258 DEBUG:teuthology.exit:Got signal 15; running 2 handlers...
2021-01-24T01:40:30.304 DEBUG:teuthology.task.console_log:Killing console logger for gibba021
2021-01-24T01:40:30.305 DEBUG:teuthology.task.console_log:Killing console logger for gibba021
2021-01-24T01:40:30.305 DEBUG:teuthology.exit:Finished running handlers

/a/teuthology-2021-01-23_07:01:02-rados-master-distro-basic-gibba/5819506

/a/ideepika-2021-01-22_07:01:14-rados-wip-deepika-testing-master-2021-01-22-0047-distro-basic-smithi/5814891
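
For anyone reproducing this, one way to check the "all of pool 1, none of pool 2" pattern on a live cluster is to compare each PG's last deep-scrub stamp per pool. This is a rough sketch, not part of the test: it assumes a working ceph CLI and greps the JSON output, because the exact wrapper object around the pg stats differs between releases.

# Sketch: print every PG's last deep-scrub stamp, grouped by pool.
# Stamps that never advance after "ceph osd deep-scrub all" point at
# PGs the mgr never asked to scrub (e.g. because its pgmap was stale).
for pool in $(ceph osd pool ls); do
    echo "=== pool: $pool ==="
    ceph pg ls-by-pool "$pool" -f json-pretty |
        grep -E '"pgid"|"last_deep_scrub_stamp"'
done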


Related issues

Copied from RADOS - Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs (Resolved)
Copied to RADOS - Backport #57208: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs (Resolved)
Copied to RADOS - Backport #57209: quincy: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs (Resolved)

History

#1 Updated by David Zafman about 3 years ago

  • Copied from Bug #48984: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added

#2 Updated by David Zafman about 3 years ago

  • Description updated (diff)

#3 Updated by David Zafman about 3 years ago

  • Pull request ID deleted (39535)

#4 Updated by David Zafman about 3 years ago

Note that instead of a delay you can tell the OSDs to flush their pg stats. I wonder whether that only flushes to the mon and eventually reaches the mgr, or whether it guarantees that the mgr is up to date too.

See the qa/standalone/ceph-helpers.sh function flush_pg_stats for the bash version that waits for all the flushes.
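
For reference, the approach that helper takes, paraphrased (a condensed sketch, not a verbatim copy of ceph-helpers.sh): tell every OSD to flush its PG stats, record the sequence number each flush returns, then poll "ceph osd last-stat-seq" until the cluster has caught up with every OSD's flush.

flush_pg_stats() {
    local seqs='' osd seq s
    for osd in $(ceph osd ls); do
        # "ceph tell osd.N flush_pg_stats" returns the sequence number
        # of the stats report the OSD just pushed out
        seq=$(ceph tell osd."$osd" flush_pg_stats)
        [ -n "$seq" ] && seqs="$seqs $osd-$seq"
    done
    for s in $seqs; do
        osd=${s%-*}
        seq=${s#*-}
        # wait until the last stat sequence recorded for this OSD
        # catches up with the flush we just requested
        while [ "$(ceph osd last-stat-seq "$osd")" -lt "$seq" ]; do
            sleep 1
        done
    done
}

Whether this guarantees the mgr (as opposed to just the mon) has the fresh stats is exactly the open question above.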

#5 Updated by Brad Hubbard about 3 years ago

  • Pull request ID set to 39980

#6 Updated by Neha Ojha over 2 years ago

  • Priority changed from Urgent to Normal

Haven't seen this recently.

#7 Updated by Laura Flores over 1 year ago

  • Status changed from New to Pending Backport
  • Backport changed from pacific to pacific,quincy

/a/yuriw-2022-08-11_16:46:00-rados-wip-yuri3-testing-2022-08-11-0809-pacific-distro-default-smithi/6968195

2022-08-11T22:32:05.016 INFO:teuthology.orchestra.run.smithi138.stdout:{"status":"HEALTH_OK","checks":{},"mutes":[]}
2022-08-11T22:32:05.017 INFO:tasks.ceph.ceph_manager.ceph:wait_until_healthy done
2022-08-11T22:32:05.017 INFO:teuthology.run_tasks:Running task exec...
2022-08-11T22:32:05.029 INFO:teuthology.task.exec:Executing custom commands...
2022-08-11T22:32:05.030 INFO:teuthology.task.exec:Running commands on role client.0 host ubuntu@smithi138.front.sepia.ceph.com
2022-08-11T22:32:05.030 DEBUG:teuthology.orchestra.run.smithi138:> sudo TESTDIR=/home/ubuntu/cephtest bash -c ceph_test_lazy_omap_stats
2022-08-11T22:32:05.500 INFO:teuthology.orchestra.run.smithi138.stdout:pool 'lazy_omap_test_pool' created
2022-08-11T22:32:05.504 INFO:teuthology.orchestra.run.smithi138.stdout:Created payload with 2000 keys of 445 bytes each. Total size in bytes = 890000
2022-08-11T22:32:05.504 INFO:teuthology.orchestra.run.smithi138.stdout:Waiting for active+clean
2022-08-11T22:32:05.757 INFO:teuthology.orchestra.run.smithi138.stdout:.
2022-08-11T22:32:06.544 INFO:teuthology.orchestra.run.smithi138.stdout:Wrote 2000 omap keys of 445 bytes to the f7c525bd-bd86-48cb-8ed1-4e673df56515 object
2022-08-11T22:32:06.544 INFO:teuthology.orchestra.run.smithi138.stdout:Scrubbing
2022-08-12T10:22:16.704 DEBUG:teuthology.exit:Got signal 15; running 1 handler...
2022-08-12T10:22:16.743 DEBUG:teuthology.task.console_log:Killing console logger for smithi138
2022-08-12T10:22:16.745 DEBUG:teuthology.exit:Finished running handlers

#8 Updated by Backport Bot over 1 year ago

  • Copied to Backport #57208: pacific: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added

#9 Updated by Backport Bot over 1 year ago

  • Copied to Backport #57209: quincy: lazy_omap_stats_test: "ceph osd deep-scrub all" hangs added

#10 Updated by Backport Bot over 1 year ago

  • Tags set to backport_processed

#11 Updated by Laura Flores about 1 year ago

  • Tags set to test-failure
  • Backport changed from pacific,quincy to pacific,quincy,reef

#12 Updated by Laura Flores about 1 year ago

/a/yuriw-2023-03-10_22:46:37-rados-reef-distro-default-smithi/7203287

#13 Updated by Brad Hubbard about 1 year ago

Laura Flores wrote:

/a/yuriw-2023-03-10_22:46:37-rados-reef-distro-default-smithi/7203287

This one is different and I'm looking at it in https://tracker.ceph.com/issues/59058

Also working on submitting a backport for pacific so this issue won't affect that branch.

#14 Updated by Laura Flores 7 months ago

/a/yuriw-2023-08-21_23:10:07-rados-pacific-release-distro-default-smithi/7375579

#16 Updated by Konstantin Shalygin 3 months ago

  • Status changed from Pending Backport to Resolved
  • % Done changed from 0 to 100
