Bug #20616

closed

pre-luminous: aio_read returns erroneous data when rados_osd_op_timeout is set but not reach

Added by Mehdi Abaakouk almost 7 years ago. Updated over 6 years ago.

Status: Resolved
Priority: Urgent
Assignee:
Category: Correctness/Safety
Target version: -
% Done: 0%
Source:
Tags:
Backport: jewel
Regression: No
Severity: 1 - critical
Reviewed:
Affected Versions:
ceph-qa-suite:
Component(RADOS): librados
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Hi,

In Gnocchi, we use the python-rados API, and we recently encountered some data corruption when "rados_osd_op_timeout" is set.
After digging, we found that aio_read() doesn't return the expected data and doesn't return any error.

The issue on the Gnocchi side: https://github.com/gnocchixyz/gnocchi/pull/190
This has been worked around by using read() instead of aio_read().

The Ceph version was 10.2.7, but I can reproduce it on many other versions.

I have attached a script to reproduce it. Its actual output is:

no timeout read(): 'my fancy blob' : True
with timeout read(): 'my fancy blob' : True
no timeout aio_read(): 'my fancy blob' (length or errno: 13): True
with timeout aio_read(): 'exc_traceback' (length or errno: 13): False

The last line shows that aio_read doesn't return the expected blob.
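For readers without the attachment, here is a minimal sketch of the kind of check described above. It is not the attached script: the pool name "rbd", the object name, the conffile path and the timeout value "5" are all assumptions, and compare_reads() is a hypothetical helper. It only relies on python-rados calls I believe exist (Rados(conffile=..., conf=...), open_ioctx, write_full, read, aio_read, wait_for_complete).

```python
import threading


def compare_reads(ioctx, name, expected):
    """Return (sync_ok, aio_ok): does each read path give back `expected`?"""
    length = len(expected)

    # Synchronous path, which the report says returns the right bytes.
    sync_ok = ioctx.read(name, length) == expected

    # Asynchronous path: keep whatever buffer is handed to the callback.
    result = {}
    done = threading.Event()

    def on_complete(completion, data):
        result["data"] = data
        done.set()

    completion = ioctx.aio_read(name, length, 0, on_complete)
    completion.wait_for_complete()
    done.wait(timeout=10)  # the callback may fire from another thread
    aio_ok = result.get("data") == expected
    return sync_ok, aio_ok


def main():
    import rados  # python-rados; needs a reachable test cluster

    blob = b"my fancy blob"
    for timeout in (None, "5"):  # "5" seconds is an assumed value
        conf = {} if timeout is None else {"rados_osd_op_timeout": timeout}
        cluster = rados.Rados(conffile="/etc/ceph/ceph.conf", conf=conf)
        cluster.connect()
        try:
            ioctx = cluster.open_ioctx("rbd")  # assumed pool name
            try:
                ioctx.write_full("aio-read-check", blob)
                print(timeout, compare_reads(ioctx, "aio-read-check", blob))
            finally:
                ioctx.close()
        finally:
            cluster.shutdown()
```

Calling main() against a test cluster should print one (sync_ok, aio_ok) pair per configuration. On an affected version I would expect the pair for the timeout case to have aio_ok false, matching the last line of the output above.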


Files

ceph_aio_read_timeout_bug.py (1.63 KB), Mehdi Abaakouk, 07/13/2017 01:42 PM

Related issues 1 (0 open, 1 closed)

Copied to RADOS - Backport #21308: jewel: pre-luminous: aio_read returns erroneous data when rados_osd_op_timeout is set but not reach (Resolved, Kefu Chai)
