Documentation #52825
haproxy causes high number of connection resets (Closed)
Description
I have discovered an interesting haproxy behaviour that causes a high number of connection resets on rgw hosts when HAProxy is used for ingress. I posted about it to the mailing list earlier (and will reference it again here) and have since found out that this behaviour is normal: HAProxy generates a lot of TCP connection resets, which in turn produce RGW/beast log messages (at least on debug level). After some research this turned out to be static noise that confused me while I was trying to debug problems with monitors going in and out of leader elections.
I want to leave this here to collect my findings and make it easier for future users to find this information.
summary
If you deploy RGW behind haproxy/keepalived, as cephadm does for the ingress service, you will see
- a high number of TCP connection resets, and
- RGW/beast logging ERROR: client_io->complete_request() returned Connection reset by peer.
Don't panic, it turns out that this behaviour is expected.
My suggestion is to switch off ingress temporarily until you have finished debugging.
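With cephadm this can be done non-destructively: the service spec stays in place and only the daemons are stopped. A sketch, assuming the service name from my setup below (ingress.rgw.default):

# stop the haproxy/keepalived daemons of the ingress service while debugging
ceph orch stop ingress.rgw.default

# ... debug against the RGW daemons on :8000 directly, without the health-check noise ...

# bring the ingress daemons back afterwards
ceph orch start ingress.rgw.default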
my setting
I have a v16.2.6 cluster with 6 nodes (osd-1..osd-6) running with cephadm. The cluster was originally installed as Nautilus with ceph-ansible, then migrated to v15 (Octopus) and finally to v16.2.6 with cephadm.
For this issue, the following daemons are relevant.
# ceph orch ls
NAME                 PORTS                  RUNNING  REFRESHED  AGE  PLACEMENT
ingress.rgw.default  172.16.62.26:443,1967  12/12    96s ago    2d   count:6
rgw.default          ?:8000                 6/6      96s ago    2d   count-per-host:1;label:rgw
This generates haproxy/keepalived configuration shown at the bottom.
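For reference, a spec along these lines (applied with ceph orch apply -i) should produce that layout. This is a sketch reconstructed from the generated configuration shown at the bottom, not my original spec; the virtual IP, ports and placement are taken from those configs:

service_type: ingress
service_id: rgw.default
placement:
  count: 6
spec:
  backend_service: rgw.default
  virtual_ip: 172.16.62.26/19
  frontend_port: 443
  monitor_port: 1967
  ssl_cert: |
    <certificate and key, REDACTED>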
Now to the part that confused me. I see a high number of connection resets, apparent in netstat output.
# netstat -s | grep -A 10 ^Tcp:
Tcp:
    2726076 active connections openings
    1713993 passive connection openings
    124711 failed connection attempts
    1648262 connection resets received    # <-- this is too high
    14086 connections established
    4238192699 segments received
    24207750548 segments send out
    32408995 segments retransmited
    0 bad segments received.
    1015230 resets sent
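To see how fast that counter grows, a plain watch on the same netstat output is enough (nothing ceph-specific here):

# print the TCP reset counters every 2 seconds
watch -n 2 "netstat -s | grep -E 'connection resets received|resets sent'"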
I see the following in the rgw logs (this repeats over and over every second or so).
# journalctl -f -u ceph-55633ec3-6c0c-4a02-990c-0f87e0f7a01f@rgw.default.osd-1.xqrjwp.service
-- Logs begin at Sun 2021-10-03 10:49:24 CEST. --
Oct 06 07:54:43 osd-1 bash[90888]: debug 2021-10-06T05:54:43.875+0000 7fd701142700 1 ====== starting new request req=0x7fd8306b6620 =====
Oct 06 07:54:43 osd-1 bash[90888]: debug 2021-10-06T05:54:43.876+0000 7fd7e4308700 1 ====== req done req=0x7fd8306b6620 op status=0 http_status=200 latency=0.001000019s ======
Oct 06 07:54:43 osd-1 bash[90888]: debug 2021-10-06T05:54:43.876+0000 7fd7e4308700 1 beast: 0x7fd8306b6620: 172.16.62.12 - anonymous [06/Oct/2021:05:54:43.875 +0000] "HEAD / HTTP/1.0" 200 5 - - - latency=0.001000019s
Oct 06 07:54:44 osd-1 bash[90888]: debug 2021-10-06T05:54:44.427+0000 7fd703146700 1 ====== starting new request req=0x7fd8306b6620 =====
Oct 06 07:54:44 osd-1 bash[90888]: debug 2021-10-06T05:54:44.427+0000 7fd760a01700 1 ====== req done req=0x7fd8306b6620 op status=0 http_status=200 latency=0.000000000s ======
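If these per-request lines are too noisy while debugging, the rgw debug level can be lowered per daemon. A sketch, assuming the daemon name from above; this only quietens the log, it does not change the resets themselves:

# lower (and later restore) the rgw log level for one daemon
ceph config set client.rgw.default.osd-1.xqrjwp debug_rgw 0/0
ceph config rm client.rgw.default.osd-1.xqrjwp debug_rgw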
It looks to me like the following fragment of the haproxy configuration is responsible for this behaviour.
backend backend
    option httpchk HEAD / HTTP/1.0    # <-- here
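To confirm that the resets really come from these health checks, watching for RST segments on the RGW port works well; a sketch (interface and port taken from my setup), where the source addresses should be the haproxy hosts:

# show TCP RST segments on the RGW frontend port
tcpdump -nn -i bond0 'tcp port 8000 and (tcp[tcpflags] & tcp-rst != 0)'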
I did a lot of digging and finally ended up at this thread:
these sequences of packets including a RST are completely normal. This is the way that haproxy uses to do health checks efficiently. As soon as haproxy has discovered that the endpoint is up, there is no point in wasting any further resources at either end. It turns out that using TCP RST is the most efficient way for kernels at both ends of the connection to finish their conversation and free up those resources.
[...]
Again, the RST are completely normal. Don’t panic :wink:
- https://discourse.haproxy.org/t/connection-reset-seen-every-2-sec-haproxy/2156/5
This is how I came up with my summary of "keep calm and keep on ignoring".
full configuration
==> /var/lib/ceph/55633ec3-6c0c-4a02-990c-0f87e0f7a01f/haproxy.rgw.default.osd-1.urpnuu/haproxy/haproxy.cfg <==
# This file is generated by cephadm.
global
    log          127.0.0.1 local2
    chroot       /var/lib/haproxy
    pidfile      /var/lib/haproxy/haproxy.pid
    maxconn      8000
    daemon
    stats socket /var/lib/haproxy/stats

defaults
    mode                    http
    log                     global
    option                  httplog
    option                  dontlognull
    option                  http-server-close
    option forwardfor       except 127.0.0.0/8
    option                  redispatch
    retries                 3
    timeout queue           20s
    timeout connect         5s
    timeout http-request    1s
    timeout http-keep-alive 5s
    timeout client          1s
    timeout server          1s
    timeout check           5s
    maxconn                 8000

frontend stats
    mode http
    bind *:1967
    stats enable
    stats uri /stats
    stats refresh 10s
    stats auth admin:<REDACTED>
    http-request use-service prometheus-exporter if { path /metrics }
    monitor-uri /health

frontend frontend
    bind *:443 ssl crt /var/lib/haproxy/haproxy.pem
    default_backend backend

backend backend
    option forwardfor
    balance static-rr
    option httpchk HEAD / HTTP/1.0
    server rgw.default.osd-1.xqrjwp 172.16.62.10:8000 check weight 100
    server rgw.default.osd-2.lopjij 172.16.62.11:8000 check weight 100
    server rgw.default.osd-3.plbqka 172.16.62.12:8000 check weight 100
    server rgw.default.osd-4.jvkhen 172.16.62.13:8000 check weight 100
    server rgw.default.osd-5.hjxnrb 172.16.62.30:8000 check weight 100
    server rgw.default.osd-6.bdrxdd 172.16.62.31:8000 check weight 100

==> /var/lib/ceph/55633ec3-6c0c-4a02-990c-0f87e0f7a01f/keepalived.rgw.default.osd-1.vrjiew/keepalived.conf <==
# This file is generated by cephadm.
vrrp_script check_backend {
    script "/usr/bin/curl http://localhost:1967/health"
    weight -20
    interval 2
    rise 2
    fall 2
}

vrrp_instance VI_0 {
  state MASTER
  priority 100
  interface bond0
  virtual_router_id 51
  advert_int 1
  authentication {
      auth_type PASS
      auth_pass <REDACTED>
  }
  unicast_src_ip 172.16.62.10
  unicast_peer {
    172.16.62.11
    172.16.62.12
    172.16.62.13
    172.16.62.30
    172.16.62.31
  }
  virtual_ipaddress {
    172.16.62.26/19 dev bond0
  }
  track_script {
      check_backend
  }
}
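As a side note: the check frequency, and with it the rate of RSTs, is governed by haproxy's per-server inter parameter (default 2s). Slowing the checks down would look roughly like the line below, but cephadm regenerates haproxy.cfg, so treat this purely as a sketch of the haproxy syntax, not as a cephadm option:

    server rgw.default.osd-1.xqrjwp 172.16.62.10:8000 check inter 10s weight 100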
Updated by Sebastian Wagner about 2 years ago
- Project changed from ceph-deploy to Orchestrator
Updated by Sebastian Wagner about 2 years ago
You were pretty successful in hiding this issue from the cephadm developers by putting it into the ceph-deploy project :-)
Updated by Redouane Kachach Elhichou almost 2 years ago
Thanks for the detailed doc. It seems like this is normal haproxy behavior and no change is needed at the cephadm level. Closing.
Updated by Redouane Kachach Elhichou almost 2 years ago
- Status changed from New to Closed