AKS: Fix dynamic reconfiguration of bridge mode #10383

Merged 1 commit on Feb 28, 2020

tgraf (Member) commented Feb 28, 2020

Commit 0b70117 ("doc: Fix AKS guide regression") re-introduced the
dynamic reconfiguration of the Azure bridge into transparent mode in order to
enable transparent proxy operations. However, it added the reconfiguration
step to the preStop hook instead of the postStart hook. As a result, the
Cilium pod had to restart once before the bridge was reconfigured, which
delayed bootstrapping.

Fixes: 0b70117 ("doc: Fix AKS guide regression")

Signed-off-by: Thomas Graf <[email protected]>
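
For readers unfamiliar with the reconfiguration step described above, here is a minimal sketch of the kind of command a postStart hook could run. It is not the command from this PR: the helper name `reconfigure_bridge_mode`, the `"Mode"` key spelling, and the `bridge`/`transparent` values are all assumptions. Only the path /var/run/azure-vnet.json comes from this conversation, and the demo deliberately works on a scratch file instead.

```shell
#!/bin/sh
# Hypothetical helper, not taken from the PR: rewrite the mode recorded in a
# JSON state file from "bridge" to "transparent". The "Mode" key spelling and
# both values are assumptions.
reconfigure_bridge_mode() {
    file="$1"
    [ -f "$file" ] || return 1
    tmp="$(mktemp)" || return 1
    # Edit into a temp file, then copy back so the original file keeps its
    # ownership and permissions (avoids the non-portable `sed -i`).
    if sed 's/"Mode"[[:space:]]*:[[:space:]]*"bridge"/"Mode": "transparent"/g' "$file" > "$tmp"; then
        cat "$tmp" > "$file"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

# Demo on a scratch file instead of the real /var/run/azure-vnet.json.
demo="$(mktemp)"
printf '{"Network": {"Mode": "bridge"}}\n' > "$demo"
reconfigure_bridge_mode "$demo"
cat "$demo"   # prints {"Network": {"Mode": "transparent"}}
rm -f "$demo"
```

Running a step like this from postStart means it executes right after the container starts, whereas a preStop hook only fires when the container is being terminated, which is why the earlier placement needed one restart to take effect.
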
@tgraf added labels on Feb 28, 2020: kind/bug (This is a bug in the Cilium logic.), release-note/bug (This PR fixes an issue in a previous release of Cilium.), needs-backport/1.6, integration/cloud (Related to integration with cloud environments such as AKS, EKS, GKE, etc.)
@tgraf requested a review from a team on February 28, 2020 13:39
@coveralls

Coverage Status

Coverage increased (+0.02%) to 45.59% when pulling 9fe5e25 on pr/tgraf/aks-guide-followup-fix into 26e3ca0 on master.

tgraf (Member Author) commented Feb 28, 2020

Validation of the fix:

export RESOURCE_GROUP_NAME=group1
export CLUSTER_NAME=aks-test1
export LOCATION=westus
az group create --name $RESOURCE_GROUP_NAME --location $LOCATION
az aks create \
    --resource-group $RESOURCE_GROUP_NAME \
    --name $CLUSTER_NAME \
    --node-count 2 \
    --generate-ssh-keys \
    --network-plugin azure
az aks get-credentials --resource-group $RESOURCE_GROUP_NAME --name $CLUSTER_NAME
kubectl get nodes
NAME                                STATUS   ROLES   AGE     VERSION
aks-nodepool1-41608469-vmss000000   Ready    agent   3m20s   v1.14.8
aks-nodepool1-41608469-vmss000001   Ready    agent   3m18s   v1.14.8
kubectl create namespace cilium
kubectl apply -f chaining.yaml
k -n cilium get pods -o wide --watch
NAME                               READY   STATUS     RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
cilium-6lcnj                       0/1     Init:0/2   0          15s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-node-init-bsqrg             1/1     Running    0          15s   10.240.0.4    aks-nodepool1-41608469-vmss000000   <none>           <none>
cilium-node-init-n7fz7             1/1     Running    0          15s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-operator-855c8b79c5-zv69n   1/1     Running    0          15s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-sq59k                       0/1     Init:0/2   0          15s   10.240.0.4    aks-nodepool1-41608469-vmss000000   <none>           <none>
NAME                               AGE
cilium-sq59k                       35s
cilium-6lcnj                       36s
cilium-sq59k                       37s
cilium-6lcnj                       38s
cilium-sq59k                       39s
cilium-6lcnj                       40s
k -n cilium get pods -o wide --watch
NAME                               READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
cilium-6lcnj                       0/1     Running   0          45s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-node-init-bsqrg             1/1     Running   0          45s   10.240.0.4    aks-nodepool1-41608469-vmss000000   <none>           <none>
cilium-node-init-n7fz7             1/1     Running   0          45s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-operator-855c8b79c5-zv69n   1/1     Running   0          45s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-sq59k                       0/1     Running   0          45s   10.240.0.4    aks-nodepool1-41608469-vmss000000   <none>           <none>
k -n cilium get pods -o wide
NAME                               READY   STATUS    RESTARTS   AGE   IP            NODE                                NOMINATED NODE   READINESS GATES
cilium-6lcnj                       1/1     Running   0          82s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-node-init-bsqrg             1/1     Running   0          82s   10.240.0.4    aks-nodepool1-41608469-vmss000000   <none>           <none>
cilium-node-init-n7fz7             1/1     Running   0          82s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-operator-855c8b79c5-zv69n   1/1     Running   0          82s   10.240.0.35   aks-nodepool1-41608469-vmss000001   <none>           <none>
cilium-sq59k                       1/1     Running   0          82s   10.240.0.4    aks-nodepool1-41608469-vmss000000   <none>           <none>
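
The point of the transcript above is that every pod reaches 1/1 Running with RESTARTS still at 0. A minimal sketch of automating that check follows; the helper name `no_restarts` is hypothetical, and it assumes the default table output with a RESTARTS header column, as shown in the transcript.

```shell
#!/bin/sh
# Hypothetical helper: read `kubectl get pods` table output on stdin and
# exit non-zero if any pod reports a non-zero RESTARTS count.
no_restarts() {
    awk '
        NR == 1 {
            # Locate the RESTARTS column from the header row.
            for (i = 1; i <= NF; i++) if ($i == "RESTARTS") col = i
            if (!col) { print "no RESTARTS column found"; bad = 2; exit }
            next
        }
        $col + 0 != 0 { print $1 " restarted " $col " time(s)"; bad = 1 }
        END { exit bad }
    '
}

# Demo using two rows copied from the transcript above.
no_restarts <<'EOF' && echo "no restarts observed"
NAME           READY   STATUS    RESTARTS   AGE
cilium-6lcnj   1/1     Running   0          82s
cilium-sq59k   1/1     Running   0          82s
EOF
```

Against a live cluster this would be used as `kubectl -n cilium get pods -o wide | no_restarts`.
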

The file /var/run/azure-vnet.json had the correct value once Cilium was up.
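
The comment does not show how the file was inspected. One way such a check could be scripted is sketched below, assuming the file records the mode under a `"Mode"` key and that the expected value is `transparent`. Both are assumptions, as is the helper name `mode_is_transparent`.

```shell
#!/bin/sh
# Hypothetical check, not taken from the PR. Exit status:
#   0 = file records transparent mode and no bridge mode
#   1 = bridge mode still present, or transparent mode not found
#   2 = file not readable
mode_is_transparent() {
    file="$1"
    [ -r "$file" ] || return 2
    # Match the key case-insensitively; its exact spelling is an assumption.
    if grep -Eiq '"mode"[[:space:]]*:[[:space:]]*"bridge"' "$file"; then
        return 1
    fi
    grep -Eiq '"mode"[[:space:]]*:[[:space:]]*"transparent"' "$file"
}

# Demo on a scratch file; on a node the argument would be
# /var/run/azure-vnet.json.
demo="$(mktemp)"
printf '{"Network": {"Mode": "transparent"}}\n' > "$demo"
mode_is_transparent "$demo" && echo "mode is transparent"
rm -f "$demo"
```
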

tgraf (Member Author) commented Feb 28, 2020

test-me-please
