Bug #7888
Closed
msgr: keepalive is insufficient
Description
The current keepalive behavior relies on writes triggering a TCP timeout/error, which does not actually happen in many cases (like ifdown eth0).
Instead, we need something like a request/reply exchange to guarantee liveness.
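A rough sketch of what a request/reply liveness check could look like, for illustration only. This is not the wip-7888 code: the class name KeepaliveTracker, its methods, and the 30-second timeout are all hypothetical. The idea is that the sender stamps each keepalive, the peer echoes the stamp back, and the connection is treated as dead once no ack has been seen for longer than the timeout, regardless of whether writes are still succeeding locally.

```python
import time
from typing import Callable


class KeepaliveTracker:
    """Hypothetical helper: judge liveness by acked keepalives, not write errors."""

    def __init__(self, timeout: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout
        self.clock = clock
        # Treat connection setup as the first proof that the peer is alive.
        self.last_ack = clock()

    def make_keepalive(self) -> float:
        """Payload for a keepalive request: the time it was sent."""
        return self.clock()

    def handle_ack(self, echoed_stamp: float) -> None:
        """The peer echoes the request's stamp; remember the newest one seen."""
        self.last_ack = max(self.last_ack, echoed_stamp)

    def is_dead(self) -> bool:
        """True once no keepalive has been acked within the timeout."""
        return self.clock() - self.last_ack > self.timeout


if __name__ == "__main__":
    # Simulated clock so the example runs instantly.
    now = [0.0]
    tracker = KeepaliveTracker(timeout=30.0, clock=lambda: now[0])

    now[0] = 20.0
    stamp = tracker.make_keepalive()   # request sent at t=20
    now[0] = 21.0
    tracker.handle_ack(stamp)          # ack arrives at t=21

    now[0] = 45.0
    print(tracker.is_dead())           # still within 30s of the acked stamp
    now[0] = 51.0
    print(tracker.is_dead())           # no ack for more than 30s
```

With something like this, a case such as ifdown eth0 is caught because the acks stop arriving, even though the local writes never return an error. The caller would then mark the connection down and reconnect.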
Updated by Sage Weil about 10 years ago
- Status changed from In Progress to Fix Under Review
Updated by Sage Weil about 10 years ago
wip-7888 handles this for MonClient. We can do the same with Objecter, but this is less critical because we will find out via the osdmap if the OSDs are really down.
There is a bit of a concern about this whole approach, though: if the server isn't reading data because it has hit its memory throttle, the client may time out and reconnect, resending a bunch of the same messages, making the memory pressure even worse. It really is better if this can be handled a bit lower down in the protocol layer.
Alternatively, the Messenger throttle could be redone so that it is cooperative and never stops reading data off the socket. But then we end up having to implement all the same flow control that TCP is already giving us... meh!
Updated by Sage Weil about 10 years ago
- Status changed from Fix Under Review to Pending Backport
Updated by Sage Weil about 10 years ago
- Status changed from Pending Backport to Resolved
Updated by Greg Farnum about 5 years ago
- Project changed from Ceph to Messengers
- Category deleted (msgr)