Bug #42889

closed

[msgr] continuous CRC mismatch under Ubuntu 18.04 causing spinning connection resets

Added by Jason Dillaman over 4 years ago. Updated over 4 years ago.

Status: Resolved
Priority: High
Assignee: Jason Dillaman
Target version: -
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity:
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

http://qa-proxy.ceph.com/teuthology/jdillaman-2019-11-19_13:54:56-rbd-wip-42598-distro-basic-smithi/4523756/teuthology.log

I looked back through historical RBD suite runs on the master branch: for at least the past few months, the "dynamic_features_no_cache" test has been failing regularly, and only under Ubuntu 18.04. I caught it in the act in the run above and raised the objecter and ms log levels [1], which showed the client spinning in a connect/resend/disconnect loop against OSD 7. I then raised the ms log level on OSD 7 [2] and found it complaining of a CRC mismatch upon receipt of the first message:

2019-11-19T21:28:42.638+0000 7f7e444c6700 20 --2- [v2:172.21.15.25:6826/11408,v1:172.21.15.25:6827/11408] >> 172.21.15.87:0/402634445 conn(0x55bb6e815b00 0x55bb74086b00 crc :-1 s=THROTTLE_DONE pgs=10769522 cs=0 l=1 rx=0 tx=0).handle_read_frame_epilogue_main r=0
2019-11-19T21:28:42.638+0000 7f7e444c6700 20 --2- [v2:172.21.15.25:6826/11408,v1:172.21.15.25:6827/11408] >> 172.21.15.87:0/402634445 conn(0x55bb6e815b00 0x55bb74086b00 crc :-1 s=THROTTLE_DONE pgs=10769522 cs=0 l=1 rx=0 tx=0).handle_read_frame_epilogue_main message integrity check success:  expected_crc=1380803411 calculated_crc=1380803411
2019-11-19T21:28:42.638+0000 7f7e444c6700 20 --2- [v2:172.21.15.25:6826/11408,v1:172.21.15.25:6827/11408] >> 172.21.15.87:0/402634445 conn(0x55bb6e815b00 0x55bb74086b00 crc :-1 s=THROTTLE_DONE pgs=10769522 cs=0 l=1 rx=0 tx=0).handle_read_frame_epilogue_main message integrity check success:  expected_crc=354238021 calculated_crc=354238021
2019-11-19T21:28:42.638+0000 7f7e444c6700 20 --2- [v2:172.21.15.25:6826/11408,v1:172.21.15.25:6827/11408] >> 172.21.15.87:0/402634445 conn(0x55bb6e815b00 0x55bb74086b00 crc :-1 s=THROTTLE_DONE pgs=10769522 cs=0 l=1 rx=0 tx=0).handle_read_frame_epilogue_main message integrity check success:  expected_crc=4294967295 calculated_crc=4294967295
2019-11-19T21:28:42.638+0000 7f7e444c6700  5 --2- [v2:172.21.15.25:6826/11408,v1:172.21.15.25:6827/11408] >> 172.21.15.87:0/402634445 conn(0x55bb6e815b00 0x55bb74086b00 crc :-1 s=THROTTLE_DONE pgs=10769522 cs=0 l=1 rx=0 tx=0).handle_read_frame_epilogue_main message integrity check failed:  expected_crc=1969711020 calculated_crc=3757546011
2019-11-19T21:28:42.638+0000 7f7e444c6700 10 --2- [v2:172.21.15.25:6826/11408,v1:172.21.15.25:6827/11408] >> 172.21.15.87:0/402634445 conn(0x55bb6e815b00 0x55bb74086b00 crc :-1 s=THROTTLE_DONE pgs=10769522 cs=0 l=1 rx=0 tx=0)._fault
2019-11-19T21:28:42.638+0000 7f7e444c6700  2 --2- [v2:172.21.15.25:6826/11408,v1:172.21.15.25:6827/11408] >> 172.21.15.87:0/402634445 conn(0x55bb6e815b00 0x55bb74086b00 crc :-1 s=THROTTLE_DONE pgs=10769522 cs=0 l=1 rx=0 tx=0)._fault on lossy channel, failing

[1] http://qa-proxy.ceph.com/teuthology/jdillaman-2019-11-19_13:54:56-rbd-wip-42598-distro-basic-smithi/4523756/remote/smithi087/log/ceph-client.0.10972.log.gz
[2] http://qa-proxy.ceph.com/teuthology/jdillaman-2019-11-19_13:54:56-rbd-wip-42598-distro-basic-smithi/4523756/remote/smithi025/log/ceph-osd.7.log.gz
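
For anyone triaging similar runs, the following is a minimal sketch of how the OSD log could be scanned for these integrity-check failures. It assumes the log lines have exactly the shape quoted above; the regular expression and the function name find_crc_mismatches are illustrative and not part of Ceph, and other Ceph versions may word the message differently.

```python
import re

# Pattern derived only from the log lines quoted in this report (an assumption:
# other Ceph releases may format the message differently).
CRC_LINE = re.compile(
    r"conn\((?P<conn>0x[0-9a-f]+) .*?"
    r"message integrity check (?P<result>success|failed):\s+"
    r"expected_crc=(?P<expected>\d+) calculated_crc=(?P<calculated>\d+)"
)


def find_crc_mismatches(lines):
    """Return (connection, expected_crc, calculated_crc) for each failed check."""
    mismatches = []
    for line in lines:
        match = CRC_LINE.search(line)
        if match is None:
            continue  # not an integrity-check line
        expected = int(match.group("expected"))
        calculated = int(match.group("calculated"))
        # Treat either an explicit "failed" or differing values as a mismatch.
        if match.group("result") == "failed" or expected != calculated:
            mismatches.append((match.group("conn"), expected, calculated))
    return mismatches
```

The function accepts any iterable of strings, so a decompressed log file object can be passed in directly. Counting the returned tuples per connection gives a quick indication of whether a client is stuck in the reset loop described above.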
