Bug #40990
Luminous 12.2.12 Client Returning 'bad crc in data 0'
Description
ceph version 12.2.12 (1436006594665279fe734b4c15d7e08c13ebd777) luminous (stable)
I've been working with an application that uses librados and came across an issue where the client app gets stuck in an infinite loop, repeatedly trying to access the same rbd object. Running the rados command with --debug_ms 1 gives a hint at the cause: 'bad crc in data 0'.
Please see the output of the rados get command attempting to retrieve an object from the cluster.
https://gist.github.com/jgalvez/7e7ce318f7c63ca6fa263d9e8855aee3
The OSD-side logging for this issue is here:
https://gist.github.com/jgalvez/ae2dda7ee32ec5727167f913943f99ba
The repeated attempts to retrieve the data end up saturating the 10Gb bond on the primary OSD host. Restarting the OSD and waiting for it to rejoin (and take back primary) before trying to access the object seems to resolve the issue, at least temporarily.
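For readers unfamiliar with the 'bad crc in data' message: as I understand it, the receiver computes a CRC over the data payload and compares it with the value carried alongside the message, and a mismatch causes the message to be rejected and resent. The sketch below illustrates that kind of check using standard CRC-32C (Castagnoli). It is only an illustration: the function names are hypothetical, and Ceph's own crc32c seeding and wire format differ from this textbook variant, so the values here will not match what appears in the logs.

```python
# Illustrative only: textbook CRC-32C, not Ceph's exact wire computation.

def _make_table():
    """Build the 256-entry lookup table for reflected CRC-32C."""
    poly = 0x82F63B78  # reflected Castagnoli polynomial
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ poly if crc & 1 else crc >> 1
        table.append(crc)
    return table


_TABLE = _make_table()


def crc32c(data: bytes) -> int:
    """Return the standard CRC-32C of data (init and final XOR 0xFFFFFFFF)."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = _TABLE[(crc ^ byte) & 0xFF] ^ (crc >> 8)
    return crc ^ 0xFFFFFFFF


def payload_is_intact(payload: bytes, expected_crc: int) -> bool:
    """Hypothetical receiver-side check: recompute and compare."""
    return crc32c(payload) == expected_crc


# Sender side: checksum computed over the original payload.
original = b"object data as sent by the OSD"
sent_crc = crc32c(original)

# Receiver side: one flipped bit in transit or in memory breaks the match.
corrupted = bytes([original[0] ^ 0x01]) + original[1:]

intact_ok = payload_is_intact(original, sent_crc)
corrupted_ok = payload_is_intact(corrupted, sent_crc)
```

If the payload fails this kind of check every time the same object is read, each retry fails in the same way, which would be consistent with the endless re-reads and the saturated link described above.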
History
#1 Updated by JuanJose Galvez over 4 years ago
Also, after the OSD was restarted (and took over primary again), I ran the rados get again.
https://gist.github.com/jgalvez/a0090079674d7d7242a462f7a884b299