Bug #2573

closed

libceph: many "socket closed" messages

Added by Alex Elder almost 12 years ago. Updated over 11 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development

Description

While trying to reproduce a null pointer problem in the client
messenger code, I was running xfstests #049 over RBD devices.

In the process, I would easily get 10-15 messages like this on the
console of the RBD client for each iteration of test 049:

[  884.475733] libceph: osd1 10.214.131.33:6800 socket closed

I think this frequent socket closing was in fact what led to
the conditions under which the other problems I was chasing
could occur.
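
To put a number on "10-15 messages per iteration", the console output can be tallied per OSD. The following is only a rough sketch: the regular expression is inferred from the single log line quoted above, and the function name `count_socket_closed` is made up for illustration.

```python
import re
from collections import Counter

# Pattern inferred from the one message quoted in this report:
#   [  884.475733] libceph: osd1 10.214.131.33:6800 socket closed
# Other kernel versions may format the line differently (assumption).
SOCKET_CLOSED = re.compile(
    r"\[\s*(?P<ts>\d+\.\d+)\]\s+libceph:\s+(?P<osd>osd\d+)\s+"
    r"(?P<addr>\S+)\s+socket closed"
)


def count_socket_closed(lines):
    """Count "socket closed" messages per (osd, address) pair."""
    counts = Counter()
    for line in lines:
        match = SOCKET_CLOSED.search(line)
        if match:
            counts[(match.group("osd"), match.group("addr"))] += 1
    return counts


# The only real sample available is the line from this report.
sample = ["[  884.475733] libceph: osd1 10.214.131.33:6800 socket closed"]
for (osd, addr), n in count_socket_closed(sample).items():
    print(f"{osd} {addr}: {n}")
```

Feeding it the console log captured around each run of test 049 would show whether the closes are spread across all OSDs or concentrated on a few.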

But stepping back from that, I don't believe there's any reason
those sockets should be closing, certainly not at that frequency.
These channels use TCP connections, so there should be no data
loss. I think the point of these ceph connections is to have
the ceph-level connection survive loss of the underlying TCP
connection, but that by no means suggests we should be suffering
TCP connection losses frequently. Only some sort of hardware event
should lead to that, it seems to me.

I believe these messages indicate that the other end of the socket
connection is closing the socket. And it's possible that the
server side of these connections is doing this for a reason. If
that's the case, I question that aspect of the design. And if it's
not the case, then we should find out what's going on.
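
For reference, this is how a close by the other end typically shows up on a stream socket: a read returns zero bytes. The snippet below is a userspace illustration only, not libceph code; it uses a local `socket.socketpair()` rather than a real TCP connection to an OSD, and the function name `peer_close_is_visible` is made up.

```python
import socket


def peer_close_is_visible():
    """Close one end of a connected stream pair and read from the other.

    Returns True when recv() yields zero bytes, which is how an orderly
    close by the peer is reported on a stream socket.
    """
    local, remote = socket.socketpair()
    try:
        remote.close()            # the "other end" closes its socket
        data = local.recv(4096)   # returns b"" once the peer has closed
        return data == b""
    finally:
        local.close()


print(peer_close_is_visible())
```

If the OSD side is closing these connections deliberately (for example on some kind of timeout, which is only a guess here), the client would see this condition each time.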
