Feature #48402

closed

multisite option to enable keepalive

Added by Dieter Roels over 3 years ago. Updated 17 days ago.

Status:
Resolved
Priority:
Normal
Assignee:
Or Friedmann
Target version:
-
% Done:

0%

Source:
Community (user)
Tags:
Backport:
pacific
Reviewed:
Affected Versions:
Pull request ID:

Description

We have a multisite setup with firewalls between the two sites. The firewalls silently drop all connections that have been idle for longer than one hour. /proc/sys/net/ipv4/tcp_keepalive_time is configured correctly, but Ceph does not appear to enable keepalive on its sockets at all, so connections are dropped frequently.

[adminuser@node1 ~]$ netstat -no | grep :443
tcp 0 0 10.10.11.11:443 10.10.12.13:54666 ESTABLISHED off (0.00/0/0)
tcp 0 0 10.10.11.11:6838 10.10.11.12:44344 ESTABLISHED off (0.00/0/0)
tcp 0 0 10.10.11.11:443 10.10.12.12:38604 ESTABLISHED off (0.00/0/0)
tcp 0 0 10.10.11.11:54960 10.10.12.11:443 ESTABLISHED off (0.00/0/0)
tcp 0 0 10.10.11.11:34454 10.10.12.13:443 ESTABLISHED off (0.00/0/0)
...
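In the output above, the timer column reads "off" for every connection, meaning no keepalive timer is armed on those sockets. As a rough illustration, the same check can be scripted. This is a minimal sketch in Python that assumes the column layout shown above; the function name is hypothetical, and the third sample line (with a "keepalive" timer) is made up for contrast and is not taken from the affected host.

```python
def established_without_timer(netstat_output):
    """Return (local, remote) pairs for ESTABLISHED TCP connections
    whose netstat timer column is 'off' (no keepalive timer armed)."""
    result = []
    for line in netstat_output.splitlines():
        fields = line.split()
        # Expected layout: proto recv-q send-q local remote state timer (values)
        if len(fields) < 7 or fields[0] not in ("tcp", "tcp6"):
            continue
        local, remote, state, timer = fields[3], fields[4], fields[5], fields[6]
        if state == "ESTABLISHED" and timer == "off":
            result.append((local, remote))
    return result


# First two lines are from the report; the last line is a hypothetical example.
sample = """\
tcp 0 0 10.10.11.11:443 10.10.12.13:54666 ESTABLISHED off (0.00/0/0)
tcp 0 0 10.10.11.11:54960 10.10.12.11:443 ESTABLISHED off (0.00/0/0)
tcp 0 0 10.10.11.11:22 10.10.11.50:50022 ESTABLISHED keepalive (7012.31/0/0)
"""

unprotected = established_without_timer(sample)
for local, remote in unprotected:
    print(f"{local} -> {remote}: no keepalive timer")
```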

It would be nice to be able to enable keepalive for the beast frontend, so that at least multisite replication uses keepalive on its TCP connections.
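For background, the tcp_keepalive_time sysctl only applies to sockets that have opted in with SO_KEEPALIVE, which is why setting it alone has no effect here. The sketch below shows that opt-in at the socket level in Python. It is an illustration of the mechanism only, not the beast frontend's or Ceph's actual code; the function name and the timing values are assumptions chosen so the first probe fires well inside the firewall's one-hour idle limit.

```python
import socket


def enable_keepalive(sock, idle=600, interval=60, count=5):
    """Opt a TCP socket in to keepalive probes.

    idle, interval and count are illustrative values (seconds, seconds,
    number of probes); they are not defaults taken from Ceph.
    """
    # Without SO_KEEPALIVE the kernel never sends probes,
    # regardless of net.ipv4.tcp_keepalive_time.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # Per-socket overrides of the system-wide sysctls; these constants
    # are platform-specific, so only use the ones that exist.
    if hasattr(socket, "TCP_KEEPIDLE"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPIDLE, idle)
    if hasattr(socket, "TCP_KEEPINTVL"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPINTVL, interval)
    if hasattr(socket, "TCP_KEEPCNT"):
        sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_KEEPCNT, count)
    return sock


sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
enable_keepalive(sock)
# Non-zero means keepalive is enabled on this socket.
keepalive_on = sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE)
sock.close()
```

Once a socket with this option set has connected, `netstat -no` would be expected to show a keepalive timer for it instead of "off".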
