Bug #40990

Luminous 12.2.12 Client Returning 'bad crc in data 0'

Added by JuanJose Galvez almost 5 years ago. Updated almost 5 years ago.

Status:
New
Priority:
Normal
Assignee:
-
Category:
-
Target version:
-
% Done:
0%

Source:
Community (user)
Tags:
Backport:
Regression:
No
Severity:
3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)

I've been working with an application that uses librados and came across an issue where the client app gets stuck in an infinite loop, repeatedly trying to access the same rbd object. Running the rados command with --debug_ms 1 gives a hint at the cause: 'bad crc in data 0'.
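For readers unfamiliar with the message, a 'bad crc in data' error generally means that a checksum computed over the received payload did not match the checksum sent along with it. The sketch below illustrates that kind of check using standard CRC-32C (Castagnoli). It is an illustration only: the function names are hypothetical, and Ceph's messenger uses its own CRC-32C seeding and framing conventions, which are not reproduced here.

```python
def crc32c(data: bytes) -> int:
    """Standard CRC-32C (Castagnoli), computed bit by bit.

    Uses the reflected polynomial 0x82F63B78 with the usual
    initial value and final XOR of 0xFFFFFFFF. This is NOT
    Ceph's exact variant; it only illustrates the idea.
    """
    crc = 0xFFFFFFFF
    for byte in data:
        crc ^= byte
        for _ in range(8):
            # Shift right; fold in the polynomial when the low bit is set.
            crc = (crc >> 1) ^ (0x82F63B78 if crc & 1 else 0)
    return crc ^ 0xFFFFFFFF


def verify_payload(payload: bytes, expected_crc: int) -> bool:
    """Hypothetical receiver-side check: recompute and compare."""
    return crc32c(payload) == expected_crc


if __name__ == "__main__":
    payload = b"object data"
    sent_crc = crc32c(payload)

    # Simulate a single flipped bit somewhere between sender and receiver.
    corrupted = bytes([payload[0] ^ 0x01]) + payload[1:]

    print(verify_payload(payload, sent_crc))    # intact payload passes
    print(verify_payload(corrupted, sent_crc))  # corrupted payload fails
```

A receiver that sees a mismatch like this cannot trust the payload, so the data has to be requested again. If every resend produces the same mismatch, the result would be a retry loop of the kind described here.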

Please see the output of the rados get command attempting to retrieve an object from the cluster.

https://gist.github.com/jgalvez/7e7ce318f7c63ca6fa263d9e8855aee3

The OSD-side logging for this issue is here:

https://gist.github.com/jgalvez/ae2dda7ee32ec5727167f913943f99ba

The repeated attempts to retrieve the data end up saturating the 10Gb bond on the primary OSD host. Restarting the OSD and waiting for it to rejoin (and take back primary) before trying to access the object again seems to resolve the issue, at least temporarily.

Actions #1

Updated by JuanJose Galvez almost 5 years ago

Also, after the OSD was restarted (and took over primary again), I ran the rados get again.

https://gist.github.com/jgalvez/a0090079674d7d7242a462f7a884b299
