Bug #193

protocol error after control-c

Added by Sage Weil almost 14 years ago. Updated over 13 years ago.

Status: Resolved
Priority: High
Assignee: -
Category: -
Target version:
% Done: 0%
Source:
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Crash signature (v1):
Crash signature (v2):

Description

Saw this on wido's machine:

Jun 11 01:41:00 ceph-client kernel: [209231.250158] ceph: wait_request tid 8030 canceled/timed out
Jun 11 01:41:00 ceph-client kernel: [209231.251033] ceph: wait_request tid 8031 canceled/timed out
Jun 11 01:41:00 ceph-client kernel: [209231.251157] ceph: wait_request tid 8032 canceled/timed out
Jun 11 01:41:00 ceph-client kernel: [209231.256443] ceph: try_read bad con->in_tag = 62
Jun 11 01:41:00 ceph-client kernel: [209231.256557] ceph: osd4 192.168.6.215:6800 protocol error, garbage tag

after I hit control-c reading a very large file.

History

#1 Updated by Yehuda Sadeh almost 14 years ago

This was on the rbd branch, does it also happen on the unstable branch? The wait_for_completion_killable() might have fixed that.

#2 Updated by Sage Weil almost 14 years ago

  • Project changed from Ceph to Linux kernel client

#3 Updated by Sage Weil almost 14 years ago

Yehuda Sadeh wrote:

This was on the rbd branch, does it also happen on the unstable branch? The wait_for_completion_killable() might have fixed that.

I'm not sure the wait_for_* change would have changed it, only made it trigger more often. Let's see if it's reproducible on wido's cluster, though. (It didn't do it every time.) And then check if unstable is similarly affected. I couldn't trigger it on uml, fwiw.

#4 Updated by Sage Weil almost 14 years ago

  • Target version set to v2.6.35

#5 Updated by Sage Weil over 13 years ago

  • Priority changed from Normal to High

#6 Updated by Sage Weil over 13 years ago

  • Status changed from New to Resolved

I think this was caused by the message revocation bug fixed by commit:ed98adad3d87594c55347824e85137d1829c9e70, #252. It was setting the skip bytes out kvec_is_msg even if it wasn't the message being revoked, which would definitely screw up the parsing of the incoming data stream.

And I can't reproduce it anymore, so closing this one out!
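
For readers unfamiliar with why stray skip bytes would produce a "garbage tag" error, here is a toy model of the failure mode described above. It is only a sketch, not the kernel messenger code: the frame layout, the TAG_MSG value, and the helper names frame, parse, and write_stream are all made up for illustration. It assumes a stream of tag-prefixed frames and a sender that pads with zero bytes when it revokes a message.

```python
TAG_MSG = 7  # hypothetical tag value marking the start of a message frame


def frame(payload: bytes) -> bytes:
    """Toy frame layout: 1-byte tag, 1-byte payload length, payload."""
    return bytes([TAG_MSG, len(payload)]) + payload


def parse(stream: bytes) -> list:
    """Split a byte stream into payloads; fail on an unknown tag."""
    payloads, i = [], 0
    while i < len(stream):
        tag = stream[i]
        if tag != TAG_MSG:
            raise ValueError(f"protocol error, garbage tag {tag}")
        length = stream[i + 1]
        payloads.append(stream[i + 2:i + 2 + length])
        i += 2 + length
    return payloads


def write_stream(in_flight: bytes, sent: int, check_in_flight: bool) -> bytes:
    """Model a sender that has written `sent` bytes of the `in_flight` frame
    when some other, still-queued message is revoked.

    With check_in_flight=True the sender notices that the revoked message is
    not the one on the wire and arms no skip. With check_in_flight=False
    (the toy version of the bug) it arms a skip as if the in-flight frame
    itself had been revoked.
    """
    wire = bytearray(in_flight[:sent])
    skip = 0 if check_in_flight else len(in_flight) - sent
    wire += in_flight[sent:]   # the in-flight frame still completes
    wire += bytes(skip)        # stray zero padding from the bogus skip
    wire += frame(b"next")     # a later, unrelated message
    return bytes(wire)


good = write_stream(frame(b"request-A"), sent=4, check_in_flight=True)
bad = write_stream(frame(b"request-A"), sent=4, check_in_flight=False)

print(parse(good))
try:
    parse(bad)
except ValueError as err:
    print(err)
```

In this model the receiver parses the first frame correctly in both cases; with the bogus skip it then reads a padding byte where it expects the next tag and gives up, which is the same class of desynchronization as the "bad con->in_tag" message in the log.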
