Description
Detailed Description of the Problem
A significant TLS performance degradation is observed when migrating from OpenShift 4.12 to 4.17.
- HAProxy versions involved:
  - OpenShift 4.12: HAProxy 2.2.24
  - OpenShift 4.17: HAProxy 2.8.10
- TLS key: rsa:4096.
- Performance testing tool: vegeta.
- HAProxy access logs show: SSL handshake failure.
- TLS termination occurs at the edge (frontend). No re-encryption between HAProxy and backend servers.
Vegeta runs:
$ echo "GET https://hello-world.com/" | vegeta attack -max-workers 5000 -rate 0 -keepalive=true --duration=3m -insecure | tee results.bin | vegeta report | head
Requests [total, rate, throughput] 303611, 1686.65, 970.91
Duration [total, attack, wait] 3m3s, 3m0s, 3.383s
Latencies [min, mean, 50, 90, 95, 99, max] 1.472ms, 2.898s, 4.13s, 5.226s, 5.403s, 5.843s, 21.245s
Bytes In [total, mean] 9615078, 31.67
Bytes Out [total, mean] 0, 0.00
Success [ratio] 58.65%
Status Codes [code:count] 0:125554 200:178057
Error Set:
Get "https://hello-world.com/": EOF
$ echo "GET https://hello-world.com/" | vegeta attack -max-workers 5000 -rate 0 -keepalive=true --duration=3m -insecure | tee results.bin | vegeta report | head
Requests [total, rate, throughput] 209729, 1164.72, 908.89
Duration [total, attack, wait] 3m3s, 3m0s, 2.976s
Latencies [min, mean, 50, 90, 95, 99, max] 1.157ms, 4.27s, 5.086s, 6.367s, 6.787s, 7.464s, 23.773s
Bytes In [total, mean] 8983872, 42.84
Bytes Out [total, mean] 0, 0.00
Success [ratio] 79.33%
Status Codes [code:count] 0:43361 200:166368
Error Set:
Get "https://hello-world.com/": EOF
Get "https://hello-world.com/": dial tcp 0.0.0.0:0->10.0.50.224:443: connect: connection refused
$ echo "GET https://hello-world.com/" | vegeta attack -max-workers 5000 -rate 0 -keepalive=true --duration=3m -insecure | tee results.bin | vegeta report | head
Requests [total, rate, throughput] 328537, 1825.15, 941.73
Duration [total, attack, wait] 3m3s, 3m0s, 3.429s
Latencies [min, mean, 50, 90, 95, 99, max] 1.23ms, 2.584s, 2.658s, 5.051s, 5.259s, 5.647s, 29.6s
Bytes In [total, mean] 9328284, 28.39
Bytes Out [total, mean] 0, 0.00
Success [ratio] 52.58%
Status Codes [code:count] 0:155791 200:172746
Error Set:
Get "https://hello-world.com/": EOF
Get "https://hello-world.com/": dial tcp 0.0.0.0:0->10.0.50.224:443: connect: connection refused
HAProxy 2.8.10 access logs:
haproxy[31]: 10.0.11.73:59679 [17/Sep/2025:07:46:55.416] fe_sni~ be_edge_http:test:myedge/pod:web-server-6fd7fc6444-w9ln5:service-unsecure:http:10.128.2.17:8080 0/0/37/25/62 200 253 - - --NI 25842/24728/51/51/0 0/0 "GET / HTTP/1.1"
haproxy[31]: 10.0.11.73:36407 [17/Sep/2025:07:46:55.416] fe_sni~ be_edge_http:test:myedge/pod:web-server-6fd7fc6444-w9ln5:service-unsecure:http:10.128.2.17:8080 0/0/37/25/62 200 253 - - --NI 25841/24727/50/50/0 0/0 "GET / HTTP/1.1"
haproxy[31]: 10.0.11.73:60273 [17/Sep/2025:07:46:55.416] fe_sni~ be_edge_http:test:myedge/pod:web-server-6fd7fc6444-w9ln5:service-unsecure:http:10.128.2.17:8080 0/0/37/25/62 200 253 - - --NI 25841/24727/49/49/0 0/0 "GET / HTTP/1.1"
haproxy[31]: 10.0.11.73:43247 [17/Sep/2025:07:46:25.507] fe_sni/1: SSL handshake failure
haproxy[31]: 10.0.11.73:57489 [17/Sep/2025:07:46:25.499] fe_sni/1: SSL handshake failure
haproxy[31]: 10.0.11.73:59035 [17/Sep/2025:07:46:54.457] public_ssl be_sni/fe_sni 1/32/1047 0 sD 25853/1120/1063/1063/0 0/0
haproxy[31]: 10.0.11.73:40352 [17/Sep/2025:07:46:54.490] public_ssl be_sni/fe_sni 1/0/1014 0 sD 25852/1119/1062/1062/0 0/0
haproxy[31]: 10.0.11.73:38208 [17/Sep/2025:07:46:54.490] public_ssl be_sni/fe_sni 1/0/1014 0 sD 25851/1118/1061/1061/0 0/0
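For readers less familiar with the TCP log format, the public_ssl lines above can be decoded with a small script. This is only a sketch: the field layout follows HAProxy's default tcplog format, the termination-state tables list just a few of the documented codes, and the names parse_tcplog, TCPLOG_RE, FIRST_CHAR and SECOND_CHAR are made up for this illustration.

```python
import re

# Default "option tcplog" layout:
# client [accept_date] frontend backend/server Tw/Tc/Tt bytes_read state
# actconn/feconn/beconn/srv_conn/retries srv_queue/backend_queue
TCPLOG_RE = re.compile(
    r"(?P<client>\S+) \[(?P<accept_date>[^\]]+)\] (?P<frontend>\S+) "
    r"(?P<backend>[^/\s]+)/(?P<server>\S+) "
    r"(?P<tw>-?\d+)/(?P<tc>-?\d+)/(?P<tt>\+?\d+) (?P<bytes_read>\+?\d+) "
    r"(?P<state>\S\S) "
    r"(?P<actconn>\d+)/(?P<feconn>\d+)/(?P<beconn>\d+)/(?P<srv_conn>\d+)/(?P<retries>\+?\d+) "
    r"(?P<srv_queue>\d+)/(?P<backend_queue>\d+)"
)

INT_FIELDS = ("tw", "tc", "tt", "bytes_read", "actconn", "feconn",
              "beconn", "srv_conn", "retries", "srv_queue", "backend_queue")

# Partial tables of termination-state letters (a subset of the documented ones).
FIRST_CHAR = {
    "s": "server-side timeout expired",
    "c": "client-side timeout expired",
    "-": "normal completion",
}
SECOND_CHAR = {
    "C": "waiting for the connection to the server",
    "D": "session was in the DATA phase",
    "-": "normal completion",
}


def parse_tcplog(line):
    """Return a dict of decoded fields, or None if the line is not a tcplog entry."""
    match = TCPLOG_RE.search(line)
    if match is None:
        return None
    record = match.groupdict()
    for key in INT_FIELDS:
        record[key] = int(record[key])
    first, second = record["state"]
    record["state_meaning"] = (
        FIRST_CHAR.get(first, "other, see the HAProxy docs"),
        SECOND_CHAR.get(second, "other, see the HAProxy docs"),
    )
    return record


if __name__ == "__main__":
    sample = ("haproxy[31]: 10.0.11.73:59035 [17/Sep/2025:07:46:54.457] "
              "public_ssl be_sni/fe_sni 1/32/1047 0 sD 25853/1120/1063/1063/0 0/0")
    print(parse_tcplog(sample))
```

For the sample line, the state sD decodes to a server-side timeout during the DATA phase, where the "server" is the internal fe_sni unix socket, and bytes_read is 0.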
The HAProxy process shows excessive CPU usage (~350% on a 4-core node).

show tasks shows that almost all running tasks belong to the ssl_sock_io_cb function:
Running tasks: 24406 (4 threads)
function places % lat_tot lat_avg
ssl_sock_io_cb 24285 99.5 - -
sc_conn_io_cb 70 0.2 - -
h1_io_cb 39 0.1 - -
process_stream 8 0.0 - -
h1_timeout_task 4 0.0 - -
Running tasks: 24369 (4 threads)
function places % lat_tot lat_avg
ssl_sock_io_cb 23941 98.2 - -
sc_conn_io_cb 224 0.9 - -
process_stream 88 0.3 - -
h1_io_cb 83 0.3 - -
h1_timeout_task 16 0.0 - -
session_expire_embryonic 8 0.0 - -
accept_queue_process 4 0.0 - -
other 3 0.0 - -
xprt_handshake_io_cb 2 0.0 - -
Running tasks: 24185 (4 threads)
function places % lat_tot lat_avg
ssl_sock_io_cb 23884 98.7 - -
sc_conn_io_cb 165 0.6 - -
h1_io_cb 69 0.2 - -
process_stream 62 0.2 - -
h1_timeout_task 4 0.0 - -
xprt_handshake_io_cb 1 0.0 - -
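To compare snapshots like these over time, the show tasks output can be reduced to a per-function share. The following is a minimal sketch that assumes the text layout shown above; parse_show_tasks and share are hypothetical helper names, and SAMPLE is the first snapshot copied from this report.

```python
def parse_show_tasks(text):
    """Split "show tasks" output into snapshots of {function: places}."""
    snapshots = []
    current = None
    for line in text.splitlines():
        parts = line.split()
        if line.startswith("Running tasks:"):
            # e.g. "Running tasks: 24406 (4 threads)"
            current = {
                "total": int(parts[2]),
                "threads": int(parts[3].lstrip("(")),
                "functions": {},
            }
            snapshots.append(current)
        elif current is not None and len(parts) >= 2 and parts[1].isdigit():
            # e.g. "ssl_sock_io_cb 24285 99.5 - -"; the header row is skipped
            # because its second column ("places") is not a number.
            current["functions"][parts[0]] = int(parts[1])
    return snapshots


def share(snapshot, function):
    """Percentage of running tasks attributed to the given function."""
    if snapshot["total"] == 0:
        return 0.0
    return 100.0 * snapshot["functions"].get(function, 0) / snapshot["total"]


SAMPLE = """\
Running tasks: 24406 (4 threads)
function places % lat_tot lat_avg
ssl_sock_io_cb 24285 99.5 - -
sc_conn_io_cb 70 0.2 - -
h1_io_cb 39 0.1 - -
process_stream 8 0.0 - -
h1_timeout_task 4 0.0 - -
"""

if __name__ == "__main__":
    for snap in parse_show_tasks(SAMPLE):
        print(snap["total"], round(share(snap, "ssl_sock_io_cb"), 1))
```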
We think the following issues describe similar problems:
- Performance regression for TLS / Re-encrypt traffic post 2.4-dev11 #1914
- SSL Certificate SHA512 #1988
As with #1914, we narrowed the problem down to commit 4c48edba4f45bb78f41af7d79d3c176710fe6a90, which first appeared in the 2.4-dev12 release. Reverting the commit results in:
- 100% success rate (no SSL handshake failures)
- Improved request rate (~11K vs ~2K)
- Reduced CPU usage
Vegeta runs with 4c48edba4f45bb78f41af7d79d3c176710fe6a90 reverted:
$ echo "GET https://hello-world.com/" | vegeta attack -max-workers 5000 -rate 0 -keepalive=true --duration=3m -insecure | tee results.bin | vegeta report | head
Requests [total, rate, throughput] 2054995, 11414.31, 11405.44
Duration [total, attack, wait] 3m0s, 3m0s, 140.122ms
Latencies [min, mean, 50, 90, 95, 99, max] 1.898ms, 304.647ms, 284.05ms, 380.766ms, 411.503ms, 473.042ms, 12.395s
Bytes In [total, mean] 110969730, 54.00
Bytes Out [total, mean] 0, 0.00
Success [ratio] 100.00%
Status Codes [code:count] 200:2054995
Error Set:
$ echo "GET https://hello-world.com/" | vegeta attack -max-workers 5000 -rate 0 -keepalive=true --duration=3m -insecure | tee results.bin | vegeta report | head
Requests [total, rate, throughput] 2047009, 11372.09, 11368.99
Duration [total, attack, wait] 3m0s, 3m0s, 49.098ms
Latencies [min, mean, 50, 90, 95, 99, max] 674.292µs, 287.529ms, 262.527ms, 366.743ms, 398.454ms, 475.019ms, 14.126s
Bytes In [total, mean] 110538486, 54.00
Bytes Out [total, mean] 0, 0.00
Success [ratio] 100.00%
Status Codes [code:count] 200:2047009
Error Set:
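The before/after numbers can be cross-checked directly from the vegeta text reports. The sketch below assumes the report layout shown in the runs above and treats status code 0 as a request that received no HTTP response; parse_vegeta_report and success_ratio are illustrative names, and BEFORE and AFTER are the first run of each set, trimmed to the relevant lines.

```python
def parse_vegeta_report(text):
    """Extract request totals, rates and status-code counts from a text report."""
    report = {}
    for line in text.splitlines():
        label, sep, values = line.partition("]")
        if not sep:
            continue  # not a "Name [fields] values" line, e.g. an Error Set entry
        name = label.split("[")[0].strip()
        values = values.strip()
        if name == "Requests":
            total, rate, throughput = (v.strip() for v in values.split(","))
            report["total"] = int(total)
            report["rate"] = float(rate)
            report["throughput"] = float(throughput)
        elif name == "Status Codes":
            report["codes"] = {
                int(code): int(count)
                for code, count in (pair.split(":") for pair in values.split())
            }
    return report


def success_ratio(report):
    """Percentage of requests that returned HTTP 200."""
    return 100.0 * report["codes"].get(200, 0) / report["total"]


BEFORE = """\
Requests [total, rate, throughput] 303611, 1686.65, 970.91
Status Codes [code:count] 0:125554 200:178057
"""

AFTER = """\
Requests [total, rate, throughput] 2054995, 11414.31, 11405.44
Status Codes [code:count] 200:2054995
"""

before = parse_vegeta_report(BEFORE)
after = parse_vegeta_report(AFTER)

if __name__ == "__main__":
    print("success before: %.2f%%" % success_ratio(before))
    print("success after:  %.2f%%" % success_ratio(after))
    print("throughput gain: %.1fx" % (after["throughput"] / before["throughput"]))
```

Comparing throughput (successful responses per second) rather than the raw request rate avoids counting the failed handshakes as work done.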
show tasks (the number of tasks is noticeably lower, though):
Running tasks: 5005 (4 threads)
function places % lat_tot lat_avg
ssl_sock_io_cb 4819 96.2 - -
h1_io_cb 186 3.7 - -
Running tasks: 0 (4 threads)
function places % lat_tot lat_avg
Running tasks: 1 (4 threads)
function places % lat_tot lat_avg
h1_io_cb 1 100.0 - -
Running tasks: 1 (4 threads)
function places % lat_tot lat_avg
h1_io_cb 1 100.0 - -
Running tasks: 2 (4 threads)
function places % lat_tot lat_avg
ssl_sock_io_cb 2 100.0 - -
Running tasks: 3 (4 threads)
function places % lat_tot lat_avg
sc_conn_io_cb 2 66.6 - -
process_stream 1 33.3 - -
Running tasks: 3 (4 threads)
function places % lat_tot lat_avg
sc_conn_io_cb 2 66.6 - -
process_stream 1 33.3 - -
Running tasks: 2 (4 threads)
function places % lat_tot lat_avg
h1_io_cb 1 50.0 - -
ssl_sock_io_cb 1 50.0 - -
Expected Behavior
No SSL handshake failures, and similar request and success rates, when a 4K RSA key is used after migrating from HAProxy 2.2.z to 2.4 or later.
Steps to Reproduce the Behavior
- Use HAProxy 2.8.10 or any version starting from 2.4.0.
- Use a 4K RSA key for the server certificate.
- Terminate TLS on the frontend, don't use TLS between HAProxy and server.
- Number of threads: 4.
- Use vegeta to test the performance (5000 workers, maximum hit rate).
- Check the success rate from vegeta and access logs in HAProxy.
Do you have any idea what may have caused this?
Reverting 4c48edba4f45bb78f41af7d79d3c176710fe6a90 on HAProxy 2.8.10 helps to get TLS performance similar to 2.2.
Do you have an idea how to solve the issue?
No response
What is your configuration?
global
no strict-limits
maxconn 50000
nbthread 4
daemon
log /var/lib/rsyslog/rsyslog.sock len 1024 local1 info
log-send-hostname
ca-base /etc/ssl
crt-base /etc/ssl
stats socket /var/lib/haproxy/run/haproxy.sock mode 600 level admin expose-fd listeners
stats timeout 2m
tune.maxrewrite 8192
tune.bufsize 32768
ssl-default-bind-options ssl-min-ver TLSv1.2
tune.ssl.default-dh-param 2048
ssl-default-bind-ciphers REDACTED
ssl-default-bind-ciphersuites REDACTED
defaults
maxconn 50000
option httplog
log global
errorfile 503 /var/lib/haproxy/conf/error-page-503.http
errorfile 404 /var/lib/haproxy/conf/error-page-404.http
timeout connect 5s
timeout client 30s
timeout client-fin 1s
timeout server 30s
timeout server-fin 1s
timeout http-request 10s
timeout http-keep-alive 300s
timeout tunnel 1h
frontend public_ssl
option tcplog
bind :443 accept-proxy
tcp-request inspect-delay 5s
tcp-request content accept if { req_ssl_hello_type 1 }
acl sni req.ssl_sni -m found
acl sni_passthrough req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_sni_passthrough.map) -m found
use_backend %[req.ssl_sni,lower,map_reg(/var/lib/haproxy/conf/os_tcp_be.map)] if sni sni_passthrough
use_backend be_sni if sni
default_backend be_no_sni
backend be_sni
server fe_sni unix@/var/lib/haproxy/run/haproxy-sni.sock weight 1 send-proxy
frontend fe_sni
bind unix@/var/lib/haproxy/run/haproxy-sni.sock ssl crt /var/lib/haproxy/router/certs/default.pem crt-list /var/lib/haproxy/conf/cert_config.map accept-proxy no-alpn
mode http
option idle-close-on-response
http-request deny if { hdr_len(content-length) 0 }
http-request del-header Proxy
http-request set-header Host %[req.hdr(Host),lower]
use_backend %[base,map_reg(/var/lib/haproxy/conf/os_edge_reencrypt_be.map)]
default_backend openshift_default
backend be_edge_http:test:myedge
mode http
option redispatch
option forwardfor
balance random
timeout check 5000ms
http-request add-header X-Forwarded-Host %[req.hdr(host)]
http-request add-header X-Forwarded-Port %[dst_port]
http-request add-header X-Forwarded-Proto http if !{ ssl_fc }
http-request add-header X-Forwarded-Proto https if { ssl_fc }
http-request add-header X-Forwarded-Proto-Version h2 if { ssl_fc_alpn -i h2 }
http-request add-header Forwarded for=%[src];host=%[req.hdr(host)];proto=%[req.hdr(X-Forwarded-Proto)]
cookie 851b234ae52ef7709dfb5e5fadb97e8c insert indirect nocache httponly secure attr SameSite=None
server pod:web-server-6fd7fc6444-w9ln5:service-unsecure:http:10.128.2.17:8080 10.128.2.17:8080 cookie d8f71950c0e59754be3f646cea5d4fce weight 1
Output of haproxy -vv
$ haproxy -vv
HAProxy version 2.8.10-f28885f 2024/06/14 - https://haproxy.org/
Status: long-term supported branch - will stop receiving fixes around Q2 2028.
Known bugs: http://www.haproxy.org/bugs/bugs-2.8.10.html
Running on: Linux 5.14.0-427.86.1.el9_4.x86_64 #1 SMP PREEMPT_DYNAMIC Wed Aug 27 07:03:43 EDT 2025 x86_64
Build options :
TARGET = linux-glibc
CPU = generic
CC = cc
CFLAGS = -O2 -g -Wall -Wextra -Wundef -Wdeclaration-after-statement -Wfatal-errors -Wtype-limits -Wshift-negative-value -Wshift-overflow=2 -Wduplicated-cond -Wnull-dereference -fwrapv -Wno-address-of-packed-member -Wno-unused-label -Wno-sign-compare -Wno-unused-parameter -Wno-clobbered -Wno-missing-field-initializers -Wno-cast-function-type -Wno-string-plus-int -Wno-atomic-alignment
OPTIONS = USE_LINUX_TPROXY=1 USE_CRYPT_H=1 USE_GETADDRINFO=1 USE_OPENSSL=1 USE_ZLIB=1 USE_PCRE=1
DEBUG = -DDEBUG_STRICT -DDEBUG_MEMORY_POOLS
Feature list : -51DEGREES +ACCEPT4 +BACKTRACE -CLOSEFROM +CPU_AFFINITY +CRYPT_H -DEVICEATLAS +DL -ENGINE +EPOLL -EVPORTS +GETADDRINFO -KQUEUE -LIBATOMIC +LIBCRYPT +LINUX_CAP +LINUX_SPLICE +LINUX_TPROXY -LUA -MATH -MEMORY_PROFILING +NETFILTER +NS -OBSOLETE_LINKER +OPENSSL -OPENSSL_WOLFSSL -OT +PCRE -PCRE2 -PCRE2_JIT -PCRE_JIT +POLL +PRCTL -PROCCTL -PROMEX -PTHREAD_EMULATION -QUIC -QUIC_OPENSSL_COMPAT +RT +SHM_OPEN -SLZ +SSL -STATIC_PCRE -STATIC_PCRE2 -SYSTEMD +TFO +THREAD +THREAD_DUMP +TPROXY -WURFL +ZLIB
Default settings :
bufsize = 16384, maxrewrite = 1024, maxpollevents = 200
Built with multi-threading support (MAX_TGROUPS=16, MAX_THREADS=256, default=4).
Built with OpenSSL version : OpenSSL 3.0.7 1 Nov 2022
Running on OpenSSL version : OpenSSL 3.0.7 1 Nov 2022
OpenSSL library supports TLS extensions : yes
OpenSSL library supports SNI : yes
OpenSSL library supports : TLSv1.0 TLSv1.1 TLSv1.2 TLSv1.3
OpenSSL providers loaded : default
Built with network namespace support.
Built with zlib version : 1.2.11
Running on zlib version : 1.2.11
Compression algorithms supported : identity("identity"), deflate("deflate"), raw-deflate("deflate"), gzip("gzip")
Built with transparent proxy support using: IP_TRANSPARENT IPV6_TRANSPARENT IP_FREEBIND
Built with PCRE version : 8.44 2020-02-12
Running on PCRE version : 8.44 2020-02-12
PCRE library supports JIT : no (USE_PCRE_JIT not set)
Encrypted password support via crypt(3): yes
Built with gcc compiler version 11.4.1 20231218 (Red Hat 11.4.1-3)
Available polling systems :
epoll : pref=300, test result OK
poll : pref=200, test result OK
select : pref=150, test result OK
Total: 3 (3 usable), will use epoll.
Available multiplexer protocols :
(protocols marked as <default> cannot be specified using 'proto' keyword)
h2 : mode=HTTP side=FE|BE mux=H2 flags=HTX|HOL_RISK|NO_UPG
fcgi : mode=HTTP side=BE mux=FCGI flags=HTX|HOL_RISK|NO_UPG
h1 : mode=HTTP side=FE|BE mux=H1 flags=HTX|NO_UPG
<default> : mode=HTTP side=FE|BE mux=H1 flags=HTX
none : mode=TCP side=FE|BE mux=PASS flags=NO_UPG
<default> : mode=TCP side=FE|BE mux=PASS flags=
Available services : none
Available filters :
[BWLIM] bwlim-in
[BWLIM] bwlim-out
[CACHE] cache
[COMP] compression
[FCGI] fcgi-app
[SPOE] spoe
[TRACE] trace
Last Outputs and Backtraces
Additional Information
vegeta tests with CPU monitoring were run against 2.3.21 and 2.4.0, both compiled with OpenSSL 1.1. 2.4.0 shows CPU trends similar to 2.8.10 and gets better TLS performance when 4c48edba4f45bb78f41af7d79d3c176710fe6a90 is reverted. 2.3.21 does not show TLS performance problems with a 4K RSA key certificate. That is how we narrowed the regression down to 2.4-dev12.
