Bug #2377

watch is lost if tcp timeout reached on the connection

Added by Yehuda Sadeh almost 12 years ago. Updated almost 12 years ago.

Status: Resolved
Priority: Normal
Assignee: -
Category: -
Target version: -
% Done: 0%
Source: Development
Tags:
Backport:
Regression: No
Severity: 3 - minor
Reviewed:
Affected Versions:
ceph-qa-suite:
Pull request ID:
Crash signature (v1):
Crash signature (v2):

Description

Once the TCP timeout is reached on a connection that has a watch registered, we don't try to reconnect immediately (should we even have a timeout on that connection?). A second issue is that once we do reconnect, we don't try to re-register the watch. This can be reproduced as follows:

scratchtoolpp --ms-tcp-read-timeout=10
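
The second issue above (no re-registration after reconnect) can be illustrated with a minimal sketch. This is not Ceph or librados code: `FakeTransport`, `WatchSession`, and `on_transport_reset` are hypothetical names invented for illustration. The sketch assumes the client keeps its own table of active watches and replays them whenever the underlying connection is replaced.

```python
from typing import Callable, Dict, List, Tuple


class FakeTransport:
    """Hypothetical stand-in for one TCP connection; records what was sent."""

    def __init__(self) -> None:
        self.sent: List[Tuple[str, str, int]] = []
        self.open = True

    def send(self, op: str, oid: str, cookie: int) -> None:
        if not self.open:
            raise ConnectionError("transport is closed")
        self.sent.append((op, oid, cookie))


class WatchSession:
    """Hypothetical logical session that outlives any single transport."""

    def __init__(self, connect: Callable[[], FakeTransport]) -> None:
        self._connect = connect
        self._transport = connect()
        self._watches: Dict[int, str] = {}  # cookie -> object id
        self._next_cookie = 1

    def watch(self, oid: str) -> int:
        """Record the watch locally, then register it over the transport."""
        cookie = self._next_cookie
        self._next_cookie += 1
        self._watches[cookie] = oid
        self._transport.send("watch", oid, cookie)
        return cookie

    def on_transport_reset(self) -> None:
        """Reconnect right away and replay every recorded watch."""
        self._transport = self._connect()
        for cookie, oid in sorted(self._watches.items()):
            self._transport.send("watch", oid, cookie)
```

In this sketch the watch table lives in the session rather than in the connection, so a dropped connection (for example after a read timeout) loses no watch state on the client side. Whether the real fix took this shape is not stated in this report.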

History

#1 Updated by Greg Farnum almost 12 years ago

Anything that relies on the underlying TCP connection sticking around forever is going to be a problem for us as we scale up. It's possible that SimpleMessenger bugs are part of this issue, and we may even need to work on the interface to support watches well (let me know), but we need to be able to support upwards of several thousand logical client connections per server, and that means they can't require an active TCP connection.

#2 Updated by Yehuda Sadeh almost 12 years ago

  • Status changed from New to Resolved
