This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

[Azure Firewall + add-pod-identity] watching problem #467

Closed
ferantivero opened this issue Jan 2, 2020 · 6 comments · Fixed by #488
Labels
bug Something isn't working

Comments

@ferantivero
Contributor

ferantivero commented Jan 2, 2020

Describe the bug

Symptom

When egress traffic in an AKS cluster is restricted using Azure Firewall, AAD Pod Identity starts failing while trying to list and watch resources on the Kubernetes API server.

Azure Firewall logs

The following is the only denied HTTPS request:

"msg":"HTTPS request from 10.10.0.4:45140. Action: Deny. Reason: SNI TLS extension was missing."

Related UserVoice suggestion: https://feedback.azure.com/forums/217313-networking/suggestions/38623357-disable-sni-tls-extension-check-on-azure-firewall

nmi pod

kubectl get po nmi-szsqt -o wide

NAME        READY   STATUS    RESTARTS   AGE     IP          NODE                                NOMINATED NODE   READINESS GATES
nmi-szsqt   1/1     Running   0          3h51m   10.10.0.4   aks-agentpool-84518513-vmss000000   <none>           <none>

kubernetes svc

kubectl get svc kubernetes

NAME         TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
kubernetes   ClusterIP   10.2.0.1     <none>        443/TCP   13d

nmi logs

E0102 19:32:10.284772       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:10.285493       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:10.286265       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:21.287400       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:21.287678       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:21.288921       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:32.289224       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:32.291420       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:32.292354       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:43.291544       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:43.293173       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:43.294368       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:54.293602       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:54.295845       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:32:54.297587       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:05.296130       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:05.299047       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:05.300861       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:16.298310       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:16.300803       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:16.302125       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:27.300256       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:27.304225       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:27.305508       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:32.576552       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: EOF
E0102 19:33:32.576630       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: EOF
E0102 19:33:32.576695       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: EOF
E0102 19:33:43.578621       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:43.579527       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:43.580454       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: net/http: TLS handshake timeout
E0102 19:33:52.364517       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzureAssignedIdentity: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azureassignedidentities?limit=500&resourceVersion=0: EOF
E0102 19:33:52.364594       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.Pod: Get https://10.2.0.1:443/api/v1/pods?fieldSelector=spec.nodeName%3Daks-agentpool-84518513-vmss000000&limit=500&resourceVersion=0: EOF
E0102 19:33:52.364623       1 reflector.go:205] pkg/mod/k8s.io/[email protected]/tools/cache/reflector.go:99: Failed to list *v1.AzurePodIdentityException: Get https://10.2.0.1:443/apis/aadpodidentity.k8s.io/v1/azurepodidentityexceptions?limit=500&resourceVersion=0: EOF

Workarounds

Although it isn't valid for our (production) configuration, here is a temporary workaround:

  1. Allow outbound traffic from the AKS subnet CIDR to the Kubernetes API server public IP address on port 443.

Note: the problem with this workaround is that, as confirmed here, these IPs are not persistent.

  2. Another possible workaround is to use private clusters (https://docs.microsoft.com/en-us/azure/aks/private-clusters), but that feature is currently in preview.

Steps To Reproduce

  1. Create an AKS cluster with Azure advanced networking.
  2. Restrict egress traffic following the instructions here, plus rt.services.visualstudio.com:443, allowing only your AKS cluster subnet CIDR as the source.
  3. Install aad-pod-identity.

Expected behavior

No errors should appear in the logs.

AAD Pod Identity version

mcr.microsoft.com/k8s/aad-pod-identity/nmi:1.5.4
mcr.microsoft.com/k8s/aad-pod-identity/mic:1.5.4

Kubernetes version

Client Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.0", GitCommit:"e8462b5b5dc2584fdcd18e6bcfe9f1e4d970a529", GitTreeState:"clean", BuildDate:"2019-06-19T16:40:16Z", GoVersion:"go1.12.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"15", GitVersion:"v1.15.5", GitCommit:"2640ac46d96791a135961425127d9c2f7e184924", GitTreeState:"clean", BuildDate:"2019-11-14T04:58:54Z", GoVersion:"go1.12.10", Compiler:"gc", Platform:"linux/amd64"}

Additional context

EDIT: Found this upstream issue, in case it helps: kubernetes/client-go#173

EDIT1: A bit more context on the Azure Firewall SNI Deny log entry: since I'm not adding the AKS API server public IP to my Network Rules, no network rule matches, so the traffic proceeds to the Application Rules. Under that rule collection (FQDN-based), the SNI extension could be required.
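
One plausible reading of the "SNI TLS extension was missing" entry: RFC 6066 does not permit IP literals in the `server_name` extension, and the nmi logs show the client dialing `https://10.2.0.1:443`, so a compliant TLS client would be expected to omit SNI on that connection. The Python sketch below only illustrates that rule. The helper name `sni_for_host` and the example FQDN are hypothetical and are not part of aad-pod-identity or any library.

```python
import ipaddress
from typing import Optional


def sni_for_host(host: str) -> Optional[str]:
    """Return the name a TLS client could send as SNI, or None.

    Hypothetical helper for illustration: RFC 6066 forbids literal IPv4
    and IPv6 addresses in server_name, so dialing an IP means no SNI.
    """
    try:
        ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal, so it is treated as a DNS name eligible for SNI.
        return host
    # An IP literal: the client is expected to omit the extension.
    return None


if __name__ == "__main__":
    # ClusterIP of the kubernetes service from the output above.
    print(sni_for_host("10.2.0.1"))
    # Hypothetical API server FQDN, used only as an example.
    print(sni_for_host("example.hcp.westus.azmk8s.io"))
```

Under this assumption, a request to the ClusterIP carries no server name for the firewall's FQDN-based application rules to match, while a request addressed by FQDN does.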

@ferantivero ferantivero added the bug Something isn't working label Jan 2, 2020
@ferantivero ferantivero changed the title [Azure Firewall + add-pod-identity] watch problem [Azure Firewall + add-pod-identity] watching problem Jan 3, 2020
@sbkg0002

We experience the exact same thing. Looking for a production solution.

@pdebruin

Both egress lockdown and AAD Pod Identity are key for enterprise scenarios, and they should work together. /cc @ritazh

@aramase
Member

aramase commented Jan 10, 2020

Thank you for opening the issue. We will investigate this and update the issue.

@sbkg0002

Hi @aramase thanks. If you need additional information on the issue, please let us know.

@aramase
Member

aramase commented Jan 22, 2020

@sbkg0002 @ferantivero I was able to reproduce the issue. There are two options to mitigate it:

  1. Deploy pod-identity in the kube-system namespace. This was addressed earlier for the kube-system namespace, where components are automatically injected with KUBERNETES_SERVICE_HOST=<your-fqdn-prefix>.hcp.<region>.azmk8s.io
  2. If deploying pod-identity in any other namespace, add the following env var to the mic and nmi deployments:
        env:
          - name: KUBERNETES_SERVICE_HOST
            value: <your-fqdn-prefix>.hcp.<region>.azmk8s.io

This should resolve the issue. Let me know if this helps. I'll update the docs soon.
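
To illustrate why the override helps: in-cluster Kubernetes clients typically derive the API server address from the `KUBERNETES_SERVICE_HOST` and `KUBERNETES_SERVICE_PORT` environment variables. The Python sketch below mirrors that logic under that assumption. The function name `in_cluster_api_url` and the example FQDN are made up for illustration, and this is not the code that mic or nmi actually runs.

```python
from typing import Mapping


def in_cluster_api_url(env: Mapping[str, str]) -> str:
    """Build an API server URL from the in-cluster environment variables.

    Hypothetical helper: joins KUBERNETES_SERVICE_HOST and
    KUBERNETES_SERVICE_PORT into an https URL.
    """
    host = env["KUBERNETES_SERVICE_HOST"]
    port = env["KUBERNETES_SERVICE_PORT"]
    if ":" in host:
        # IPv6 literals must be bracketed inside a URL.
        host = f"[{host}]"
    return f"https://{host}:{port}"


# Default injection: the ClusterIP of the kubernetes service.
default_env = {
    "KUBERNETES_SERVICE_HOST": "10.2.0.1",
    "KUBERNETES_SERVICE_PORT": "443",
}

# With the override from this comment (the FQDN here is a placeholder).
overridden_env = {
    "KUBERNETES_SERVICE_HOST": "example.hcp.westus.azmk8s.io",
    "KUBERNETES_SERVICE_PORT": "443",
}

if __name__ == "__main__":
    print(in_cluster_api_url(default_env))
    print(in_cluster_api_url(overridden_env))
```

With the default values the client addresses the API server by IP, which matches the `https://10.2.0.1:443` URLs in the nmi logs. With the override it addresses the server by FQDN, so the TLS handshake can carry a server name.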

@ferantivero
Contributor Author

ferantivero commented Jan 22, 2020

Thanks for sharing this info, @aramase. Glad to see you were able to reproduce this.

Mitigation 1 is the kind of solution I was looking for. I've just created PR #488 with the changes I made to get it working in my environment.
