[BUG] IP address leak if adding EIP fails #4922

Closed
cruickshankpg opened this issue Jan 13, 2025 · 4 comments
Labels
bug (Something isn't working), eip, ipam

Comments

@cruickshankpg

Kube-OVN Version

v1.12.22

Kubernetes Version

v1.28.6

Operation-system/Kernel Version

"Ubuntu 22.04.5 LTS" 6.8.0-47-generic

Description

When an EIP cannot be added to a NAT gateway, an IP address is still allocated for it. If the EIP is then deleted before the NAT gateway comes up, that allocation is leaked in IPAM. Restarting the kube-ovn-controller leader clears the leak.

I0110 16:34:03.854233       7 vpc_nat_gw_eip.go:208] handle add iptables eip poison-eip
I0110 16:34:03.854278       7 ipam.go:60] allocate v4 10.146.46.215, v6 , mac 72:b4:28:77:2c:b0 for poison-eip from subnet ovn-vpc-external-network
E0110 16:34:03.854395       7 vpc_nat_gw_eip.go:425] pod.apps "doesn-not-exist" not found
E0110 16:34:03.854405       7 vpc_nat_gw_eip.go:247] failed to create eip 'poison-eip' in pod, pod.apps "doesn-not-exist" not found
E0110 16:34:03.854470       7 vpc_nat_gw_eip.go:110] error syncing 'poison-eip': pod.apps "doesn-not-exist" not found, requeuing

The IP address is added to v4usingIPrange in the external subnet's status and removed from v4availableIPrange, but v4usingIPs is not updated.

  v4availableIPrange: 10.146.46.204-10.146.46.211,10.146.46.217-10.146.46.254
  v4availableIPs: 48
  v4usingIPrange: 10.146.46.212-10.146.46.216
  v4usingIPs: 3
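
The mismatch in the status above can be checked by counting the addresses each range string covers. The following is a minimal sketch, not part of kube-ovn: countIPv4 is a hypothetical helper that assumes the comma-separated "start-end" format shown in the subnet status.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net/netip"
	"strings"
)

// countIPv4 is a hypothetical helper (not a kube-ovn function). It returns
// how many addresses a comma-separated list of IPv4 ranges ("a-b,c-d") or
// single addresses covers.
func countIPv4(ranges string) (int, error) {
	total := 0
	for _, part := range strings.Split(ranges, ",") {
		part = strings.TrimSpace(part)
		if part == "" {
			continue
		}
		startStr, endStr, found := strings.Cut(part, "-")
		if !found {
			endStr = startStr // a single address counts as a range of one
		}
		start, err := netip.ParseAddr(startStr)
		if err != nil {
			return 0, err
		}
		end, err := netip.ParseAddr(endStr)
		if err != nil {
			return 0, err
		}
		if !start.Is4() || !end.Is4() {
			return 0, fmt.Errorf("range %q is not IPv4", part)
		}
		s, e := start.As4(), end.As4()
		lo := binary.BigEndian.Uint32(s[:])
		hi := binary.BigEndian.Uint32(e[:])
		if hi < lo {
			return 0, fmt.Errorf("range %q ends before it starts", part)
		}
		total += int(hi-lo) + 1
	}
	return total, nil
}

func main() {
	using, err := countIPv4("10.146.46.212-10.146.46.216")
	if err != nil {
		panic(err)
	}
	available, err := countIPv4("10.146.46.204-10.146.46.211,10.146.46.217-10.146.46.254")
	if err != nil {
		panic(err)
	}
	fmt.Printf("v4usingIPrange covers %d addresses (status reports v4usingIPs: 3)\n", using)
	fmt.Printf("v4availableIPrange covers %d addresses (status reports v4availableIPs: 48)\n", available)
}
```

For the status shown, the using range covers 5 addresses and the available ranges cover 46, while the counters report 3 and 48. In this snapshot the ranges and the counters disagree by two addresses in each direction.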

Steps To Reproduce

Create an eip

cat <<EOF | kubectl apply -f -
apiVersion: kubeovn.io/v1
kind: IptablesEIP
metadata:
  name: poison-eip
spec:
  natGwDp: does-not-exist
  externalSubnet: ovn-vpc-external-network
EOF

Wait a moment for it to get programmed

Delete the eip

kubectl delete eip poison-eip

Inspect the subnet status

Current Behavior

EIP IP allocation leaked

Expected Behavior

EIP IP allocation not leaked

cruickshankpg added the bug label Jan 13, 2025

dosubot bot commented Jan 13, 2025

Hey @cruickshankpg, I'm here to assist you with this issue. Let's work together to get this sorted out.

The issue you're encountering seems to be related to the IP allocation process in kube-ovn when an EIP fails to be added to a non-existent NAT gateway. The IP is allocated but not properly deallocated when the EIP is deleted, leading to a leak in the IPAM system.

In kube-ovn, the function handleAddIptablesEip is responsible for handling the addition of an EIP. If the EIP creation fails, such as when the specified NAT gateway does not exist, the error is logged, and the process is requeued. However, it seems that the IP allocation is not being rolled back or cleaned up in this failure scenario, which results in the IP leak you observed.

Restarting the kube-ovn-controller appears to resolve the leak, likely because it triggers a reconciliation process that corrects the IPAM state. To address this issue more permanently, the error handling in the EIP creation process should include logic to release the allocated IP if the EIP cannot be successfully created.

To resolve this, you might consider implementing or suggesting a change in the kube-ovn codebase to ensure that IPs are deallocated when an EIP creation fails. This could involve adding cleanup logic in the error handling paths of the handleAddIptablesEip function or related functions responsible for managing EIP lifecycle events.


@cruickshankpg
Author

I think we should just call c.ipam.ReleaseAddressByPod(key, cachedEip.Spec.ExternalSubnet) if an error occurs after the IP address has been acquired in handleAddIptablesEip.
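
The release-on-error idea can be sketched in isolation. This is only an illustration under stated assumptions, not the kube-ovn implementation: fakeIPAM, Allocate, Release, addEip and failProgram are hypothetical names, and Release stands in for the c.ipam.ReleaseAddressByPod call mentioned above.

```go
package main

import (
	"errors"
	"fmt"
)

// errGatewayNotFound mimics the "not found" failure from the logs.
var errGatewayNotFound = errors.New("nat gateway pod not found")

// fakeIPAM is a hypothetical in-memory stand-in for the controller's IPAM.
type fakeIPAM struct {
	free []string          // addresses that can still be handed out
	used map[string]string // allocation key -> address
}

// Allocate hands out the next free address, or the existing one for key.
func (i *fakeIPAM) Allocate(key string) (string, error) {
	if ip, ok := i.used[key]; ok {
		return ip, nil
	}
	if len(i.free) == 0 {
		return "", errors.New("no free addresses")
	}
	ip := i.free[0]
	i.free = i.free[1:]
	i.used[key] = ip
	return ip, nil
}

// Release returns the address held by key to the free list, if any.
func (i *fakeIPAM) Release(key string) {
	if ip, ok := i.used[key]; ok {
		delete(i.used, key)
		i.free = append(i.free, ip)
	}
}

// addEip allocates an address, then runs program (the step that can fail).
// The deferred function releases the address whenever an error is returned
// after allocation, so a failed attempt leaves no allocation behind.
func addEip(ipam *fakeIPAM, key string, program func(ip string) error) (err error) {
	ip, err := ipam.Allocate(key)
	if err != nil {
		return err
	}
	defer func() {
		if err != nil {
			ipam.Release(key)
		}
	}()
	if err = program(ip); err != nil {
		return fmt.Errorf("failed to create eip %q: %w", key, err)
	}
	return nil
}

// failProgram always fails, like an EIP that targets a missing gateway.
func failProgram(string) error { return errGatewayNotFound }

func main() {
	ipam := &fakeIPAM{free: []string{"10.146.46.215"}, used: map[string]string{}}
	err := addEip(ipam, "poison-eip", failProgram)
	fmt.Println("error:", err)
	fmt.Println("allocations left after failure:", len(ipam.used))
}
```

One consequence of this design is that a requeued attempt allocates again from scratch, so the EIP may receive a different address on retry than it was given on the failed attempt.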

@zbb88888
Collaborator

@cruickshankpg Thanks for the info. How about making sure the VPC NAT gateway pod is ready before creating the EIP?
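
The readiness-first ordering suggested here could look roughly like the sketch below. It is illustrative only: counterIPAM, addEipWhenReady and the ready map are hypothetical and do not reflect how kube-ovn looks up the gateway pod.

```go
package main

import (
	"errors"
	"fmt"
)

// errGatewayNotReady tells the caller to requeue; nothing was allocated.
var errGatewayNotReady = errors.New("nat gateway not ready")

// counterIPAM is a hypothetical stand-in that only counts allocations.
type counterIPAM struct{ allocated int }

// Allocate records one more outstanding allocation.
func (c *counterIPAM) Allocate() { c.allocated++ }

// addEipWhenReady checks the gateway precondition before touching IPAM,
// so a missing or unready gateway never causes an allocation.
func addEipWhenReady(ipam *counterIPAM, gateway string, ready map[string]bool) error {
	if !ready[gateway] {
		return fmt.Errorf("gateway %q: %w", gateway, errGatewayNotReady)
	}
	ipam.Allocate()
	return nil
}

func main() {
	ipam := &counterIPAM{}
	err := addEipWhenReady(ipam, "does-not-exist", map[string]bool{})
	fmt.Println("error:", err)
	fmt.Println("allocations:", ipam.allocated)
}
```

With this ordering there is nothing to roll back when the gateway is missing, although failures that occur after allocation would still need their own cleanup.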

@zbb88888
Collaborator

@cruickshankpg This is already fixed in release-1.12; please try the latest. Thanks!
