Bug #24587

closed

librados api aio tests race condition

Added by Josh Durgin almost 6 years ago. Updated over 5 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source: Q/A
Tags:
Backport: mimic, luminous
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS):
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Seen in a jewel integration branch with no OSD changes:

http://pulpito.ceph.com/yuriw-2018-06-12_22:32:43-rados-wip-yuri4-testing-2018-06-12-2037-jewel-distro-basic-smithi/2660182/

2018-06-13T04:58:42.608 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: [ RUN      ] LibRadosAio.SimpleStatPP
2018-06-13T04:58:42.608 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: test/librados/aio.cc:1209: Failure
2018-06-13T04:58:42.608 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: Value of: my_completion2->get_return_value()
2018-06-13T04:58:42.608 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio:   Actual: -2
2018-06-13T04:58:42.609 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: Expected: 0
2018-06-13T04:58:42.609 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: [  FAILED  ] LibRadosAio.SimpleStatPP (2307 ms)
2018-06-13T04:58:42.610 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: [ RUN      ] LibRadosAio.SimpleStatNS
2018-06-13T04:58:42.610 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: test/librados/aio.cc:1256: Failure
2018-06-13T04:58:42.610 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: Value of: rados_aio_get_return_value(my_completion2)
2018-06-13T04:58:42.612 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio:   Actual: -2
2018-06-13T04:58:42.612 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: Expected: 0
2018-06-13T04:58:42.617 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: [ RUN      ] LibRadosAioEC.RoundTripPP2
2018-06-13T04:58:42.617 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: test/librados/aio.cc:2119: Failure
2018-06-13T04:58:42.617 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: Value of: my_completion2->get_return_value()
2018-06-13T04:58:42.618 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio:   Actual: -2
2018-06-13T04:58:42.618 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: Expected: (int)sizeof(buf)
2018-06-13T04:58:42.618 INFO:tasks.workunit.client.0.smithi092.stdout:                  api_aio: Which is: 128

Related issues 2 (0 open, 2 closed)

Copied to RADOS - Backport #36646: luminous: librados api aio tests race condition (Resolved, Nathan Cutler)
Copied to RADOS - Backport #36647: mimic: librados api aio tests race condition (Resolved, Nathan Cutler)
Actions #1

Updated by Josh Durgin almost 6 years ago

http://pulpito.ceph.com/yuriw-2018-06-13_14:55:30-rados-wip-yuri4-testing-2018-06-12-2037-jewel-distro-basic-smithi/2662751/

2018-06-13T21:06:03.574 INFO:tasks.workunit.client.0.smithi118.stdout:                  api_aio: [ RUN      ] LibRadosAioEC.RoundTrip2
2018-06-13T21:06:03.574 INFO:tasks.workunit.client.0.smithi118.stdout:                  api_aio: test/librados/aio.cc:2047: Failure
2018-06-13T21:06:03.574 INFO:tasks.workunit.client.0.smithi118.stdout:                  api_aio: Value of: rados_aio_get_return_value(my_completion2)
2018-06-13T21:06:03.574 INFO:tasks.workunit.client.0.smithi118.stdout:                  api_aio:   Actual: -2
2018-06-13T21:06:03.575 INFO:tasks.workunit.client.0.smithi118.stdout:                  api_aio: Expected: (int)sizeof(buf)
2018-06-13T21:06:03.575 INFO:tasks.workunit.client.0.smithi118.stdout:                  api_aio: Which is: 128
Actions #2

Updated by Josh Durgin almost 6 years ago

  • Subject changed from jewel: librados api test failures to librados api aio tests race condition
  • Assignee set to Josh Durgin

Good news: this is just a bug in the tests. They submit a write and then a read without waiting for the write to complete, so when the OSD full thrashing (newly backported to jewel) blocks the write, the read can be reordered ahead of it. The reordered operation then finds no object yet and returns -2 (-ENOENT), matching the failures above.

So this is not a blocker for jewel. Will fix up the tests in master and backport.
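
To make the reordering concrete, here is a minimal single-threaded sketch of the race. It is a toy model, not the librados API: `ToyCluster`, `aio_write`, `wait_for_writes`, `set_full` and `write_then_read` are all hypothetical names invented for illustration. It assumes that a write issued while the cluster is marked full stays pending, that reads are served immediately, and that waiting on the write lasts until the full flag is cleared. In the real tests the wait would presumably be a call such as `rados_aio_wait_for_complete()` or `AioCompletion::wait_for_complete()` on the write's completion before the read is issued.

```cpp
#include <cassert>
#include <cerrno>
#include <deque>
#include <map>
#include <string>
#include <utility>

// Toy, single-threaded stand-in for a cluster. NOT the librados API;
// every name in this block is hypothetical.
class ToyCluster {
 public:
  // Queue a write. While the cluster is "full" the write stays pending,
  // which models the OSD blocking it.
  void aio_write(const std::string& oid, const std::string& data) {
    pending_writes_.emplace_back(oid, data);
    if (!full_) apply_pending();
  }

  // Reads are assumed not to be blocked by the full flag, so they are
  // served right away. Returns bytes read, or -ENOENT if the object
  // does not exist yet.
  int read(const std::string& oid, std::string* out) const {
    assert(out != nullptr);
    auto it = objects_.find(oid);
    if (it == objects_.end()) return -ENOENT;
    *out = it->second;
    return static_cast<int>(out->size());
  }

  // Models waiting on the write completion: the thrasher eventually
  // clears the full flag and every pending write is applied.
  void wait_for_writes() {
    full_ = false;
    apply_pending();
  }

  void set_full(bool full) { full_ = full; }

 private:
  void apply_pending() {
    for (const auto& w : pending_writes_) objects_[w.first] = w.second;
    pending_writes_.clear();
  }

  bool full_ = false;
  std::deque<std::pair<std::string, std::string>> pending_writes_;
  std::map<std::string, std::string> objects_;
};

// Issue a 128-byte write followed by a read, optionally waiting for the
// write to complete first. Returns the read's return value.
int write_then_read(bool wait_for_write) {
  ToyCluster cluster;
  cluster.set_full(true);  // the thrasher has marked the cluster full
  cluster.aio_write("foo", std::string(128, 'x'));
  if (wait_for_write) cluster.wait_for_writes();
  std::string buf;
  return cluster.read("foo", &buf);
}
```

Under this model, `write_then_read(false)` returns `-ENOENT` (-2 on Linux), the value seen in the failing assertions, while `write_then_read(true)` returns 128, the buffer size the EC round-trip tests expect.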

Actions #3

Updated by Josh Durgin over 5 years ago

  • Status changed from New to Fix Under Review
Actions #4

Updated by Josh Durgin over 5 years ago

  • Backport set to mimic, luminous
Actions #5

Updated by Sage Weil over 5 years ago

  • Status changed from Fix Under Review to Pending Backport
Actions #6

Updated by Patrick Donnelly over 5 years ago

  • Copied to Backport #36646: luminous: librados api aio tests race condition added
Actions #7

Updated by Patrick Donnelly over 5 years ago

  • Copied to Backport #36647: mimic: librados api aio tests race condition added
Actions #8

Updated by Nathan Cutler over 5 years ago

  • Status changed from Pending Backport to Resolved