
[BUG] Allocated ports on load balancer (using SNAT) are not released when the connection is closed #4697

Open
daconstenla opened this issue Dec 11, 2024 · 0 comments
Describe the bug

Since around November 18th (the exact date varies across our 3 environments), we have noticed that short-lived connections to a publicly exposed Postgres Flexible Server leave SNAT ports allocated on the load balancer (named kubernetes, handling the aksOutboundRule) after they are closed. After multiple iterations, and before the idle timeout elapses, the number of available ports drops to 0, effectively dropping any new connection that requires a new port allocation.

To Reproduce

For us the problem occurs when using runatlantis/atlantis to run Terraform that configures Postgres Flexible Server via cyrilgdn/terraform-provider-postgresql.

As an example, in a single database we configure:

  • 21 postgresql_role
  • 21 postgresql_schema
  • 21 postgresql_grant
  • 63 postgresql_grant_role

That means roughly 126 connections that are opened and closed (no connection pooling) against the same host and port, which leads to many allocated ports on the load balancer.

Note

The setup predates the problem; nothing notable in our configuration changed around the detection date.

Expected behavior

We'd expect that a connection that is properly opened and closed does not leave a port allocated.

Screenshots

Last month of allocated SNAT ports on the public load balancer kubernetes across 3 different environments:
[Image: allocated SNAT ports over the last month, one graph per environment]

Environment (please complete the following information):

  • az cli 2.67.0
  • kubectl Version v1.30.6
  • Kubernetes Server Version: v1.30.6 and v1.30.5
  • network configuration: Azure CNI Node Subnet
  • Network policy: Calico
  • istio 1.24.1 (full mesh and ambient mode)
  • IPv4 only
  • single public IP per load balancer
  • default allocated outbound ports (per instance)

Additional context

We have mitigated the problem by:

  • increasing the allocated outbound ports (per instance)
  • attaching more IPs to the load balancer(s)
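Both mitigations amount to the same arithmetic: each public IP provides a fixed pool of SNAT ports, divided among nodes. A hedged back-of-the-envelope sketch follows, using figures from Azure's Standard Load Balancer documentation (64,000 SNAT ports per public IP; default automatic allocation shrinks as the cluster grows). The tier table is an assumption based on those docs and should be verified against the current version:

```python
# Rough capacity math behind the two mitigations. Figures are taken from
# Azure SLB docs (assumption: 64,000 usable SNAT ports per public IP and
# the documented default per-node allocation tiers).
PORTS_PER_IP = 64_000

def default_ports_per_node(node_count: int) -> int:
    # Documented default allocation tiers (assumption: unchanged in docs).
    tiers = [(50, 1024), (100, 512), (200, 256),
             (400, 128), (800, 64), (1000, 32)]
    for max_nodes, ports in tiers:
        if node_count <= max_nodes:
            return ports
    raise ValueError("more than 1000 nodes")

def max_nodes_supported(ip_count: int, ports_per_node: int) -> int:
    # Mitigation 1 (more ports per node) trades off against node count;
    # mitigation 2 (more IPs) raises the total pool being divided.
    return (ip_count * PORTS_PER_IP) // ports_per_node

print(default_ports_per_node(10))     # small cluster default allocation
print(max_nodes_supported(1, 1024))   # single IP, default allocation
print(max_nodes_supported(2, 4096))   # 2 IPs, raised per-node allocation
```

This is only a sizing sketch; it does not model per-flow reuse or the idle-timeout behavior that the bug concerns.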

We have verified that connections are opened and closed, following the troubleshooting guide at https://learn.microsoft.com/en-us/troubleshoot/azure/azure-kubernetes/connectivity/snat-port-exhaustion:

N.con  state    binary           ip
325    connect  terraform-provi  <IP-flexible-server>
325    close    terraform-provi  <IP-flexible-server>

We also used netstat in the atlantis pod to verify that connections are opened and closed.
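The counting in the table above can be sketched as follows. This is a hypothetical reconstruction: the sample events and IP are placeholders, and real input would come from `ss`/netstat or conntrack output rather than a hard-coded list; it only shows the tally per (state, binary, destination IP) that the troubleshoot doc's approach produces:

```python
# Tally connection events per (state, binary, destination IP), mirroring
# the N.con/state/binary/ip table. Sample data below is hypothetical.
from collections import Counter

sample_events = [
    # (state, binary, destination ip) - one tuple per observed event
    ("connect", "terraform-provi", "10.0.0.5"),
    ("close",   "terraform-provi", "10.0.0.5"),
    ("connect", "terraform-provi", "10.0.0.5"),
    ("close",   "terraform-provi", "10.0.0.5"),
]

counts = Counter(sample_events)
for (state, binary, ip), n in sorted(counts.items()):
    print(f"{n:<6} {state:<8} {binary:<16} {ip}")
```

Equal connect and close counts for the same binary and destination confirm that the application side is closing its connections, which points the leak at the SNAT layer.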
