Advanced networking pods getting wrong IPs #533
Comments
I've confirmed this same issue when creating an advanced networking cluster using the Azure CLI 2.0.41.
Just ran into this issue and similar symptoms when using the CLI to create the cluster. Was support able to triage the root cause of this issue @mbrancato?
@sukrit007 This is an issue related to the pod-cidr and network-plugin values. Make sure you set … The Terraform issue here is being handled by hashicorp/terraform-provider-azurerm#1434.
Thanks @mbrancato, that worked like a charm!
Closing the issue since it appears to be resolved. @mbrancato, please re-open if this is not the case.
We've identified an issue where, when using advanced networking with a custom VNet, pods are assigned the wrong IPs. This causes a number of problems in the cluster when deploying apps.
This is likely deployment-specific: we have deployed with the Portal and seen it work correctly, but with Terraform we're seeing these issues. We're currently working through a support request to identify the cause.
This seems related to other issues, or at least presents the same symptoms (apart from the IP issue). For example, we see several pods in kube-system in crash loops, and we can't see logs from pods. Having seen others reference scaling down to one node (#232 (comment)), I tried something similar, and it works as a workaround: I drained all but one node, and things worked. I then uncordoned the other nodes to deploy apps. We can see logs now; however, I'm confident that if azureproxy, tunnelfront, and others move to other nodes, things will start failing again.
This definitely appears to be a deployment issue using Terraform. I wanted to at least document this here, but I'm unsure whether the issue is in the Terraform provider or in something that has changed in the AKS APIs, etc. Thoughts?
To see what's happening: with advanced networking and a custom VNet, we see pods on the 10.244.x.x network instead of on the VNet subnet.
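A quick way to confirm this symptom is to check each pod IP against the subnet the pods are expected to use. The following is only a minimal sketch: the subnet `10.1.0.0/16` and the sample pod IPs are hypothetical placeholders, not values from this cluster, and the function name `pods_outside_subnet` is made up for illustration. In practice the IP list would come from whatever tool you use to list pod addresses.

```python
import ipaddress


def pods_outside_subnet(pod_ips, subnet_cidr):
    """Return the pod IPs that do not belong to the given subnet."""
    subnet = ipaddress.ip_network(subnet_cidr)
    return [ip for ip in pod_ips if ipaddress.ip_address(ip) not in subnet]


# Hypothetical values for illustration only; substitute your own
# VNet subnet and the pod IPs reported by your cluster.
VNET_SUBNET = "10.1.0.0/16"
pod_ips = ["10.1.0.12", "10.244.1.5", "10.244.2.7"]

wrong = pods_outside_subnet(pod_ips, VNET_SUBNET)
print(wrong)
```

Any address returned by the function lies outside the expected subnet, which matches the symptom described here of pods landing on 10.244.x.x.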
References:
#2
#232
#56