ci: [CNI] Load testing for cilium cni #1871

Merged: 1 commit, Apr 11, 2023

vipul-21
Contributor

Reason for Change:
Add load testing for the Cilium CNI on an AKS cluster.
Two scripts do the testing:

  1. Deployment script: creates around 240 pods on each node, scales down to zero, and repeats this 10 times.
  2. Stress test script: checks that the IPs assigned to the pods match those in the azure_endpoints JSON and the Cilium endpoints. It also restarts systemd-networkd to check that connectivity still works afterwards.
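
A minimal sketch of the two steps described above, not the scripts from this PR. The deployment name, the namespace, and the idea of comparing plain-text IP lists are assumptions for illustration; extracting the IPs from the azure_endpoints JSON and the Cilium endpoint list is left out because those formats are not shown here.

```shell
#!/usr/bin/env bash
set -eu

PODS_PER_NODE=240          # from the description: about 240 pods per node
ITERATIONS=10              # from the description: repeat 10 times
DEPLOYMENT="load-test"     # hypothetical deployment name
NAMESPACE="load-test"      # hypothetical namespace

# Replica count that puts PODS_PER_NODE pods on each of $1 nodes.
replicas_for_nodes() {
  echo $(( $1 * PODS_PER_NODE ))
}

# Scale up, wait for the rollout, scale back to zero, and repeat.
# A real script would likely also wait for the pods to terminate
# before starting the next iteration.
run_scale_cycles() {
  local target i
  target="$(replicas_for_nodes "$1")"
  for i in $(seq 1 "$ITERATIONS"); do
    echo "iteration ${i}: scaling ${DEPLOYMENT} to ${target} replicas"
    kubectl scale deployment "$DEPLOYMENT" -n "$NAMESPACE" --replicas="$target"
    kubectl rollout status deployment "$DEPLOYMENT" -n "$NAMESPACE" --timeout=15m
    kubectl scale deployment "$DEPLOYMENT" -n "$NAMESPACE" --replicas=0
  done
}

# Succeeds only if both files hold the same set of IPs (one per line, any order).
ip_sets_match() {
  [ "$(sort -u "$1")" = "$(sort -u "$2")" ]
}
```

Nothing runs until `run_scale_cycles` is called with a node count, for example `run_scale_cycles 3`. Keeping the IP check as a set comparison means the order in which each source lists its endpoints does not matter.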

Requirements:

Notes:

@vipul-21 vipul-21 added the cni Related to CNI. label Mar 31, 2023
@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch 10 times, most recently from ab789ce to a3ba93c Compare April 4, 2023 16:54
@vipul-21 vipul-21 marked this pull request as ready for review April 4, 2023 17:04
@vipul-21 vipul-21 requested a review from tamilmani1989 April 4, 2023 17:06
@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch 3 times, most recently from 517d1d4 to 594b2bf Compare April 4, 2023 18:38
@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch 5 times, most recently from 95fee90 to 2e26bdb Compare April 5, 2023 16:21
@vipul-21 vipul-21 requested review from rbtr and wedaly April 5, 2023 17:36
@vipul-21 vipul-21 requested a review from tamilmani1989 April 5, 2023 18:38
@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch 2 times, most recently from 39e3de6 to 6f162e1 Compare April 5, 2023 19:08
@tamilmani1989
Member

please make sure you address/resolve other comments as well

wedaly previously approved these changes Apr 6, 2023
make -C ./hack/swift set-kubeconf AZCLI=az CLUSTER=${RESOURCE_GROUP}
make -C ./hack/swift azcfg AZCLI=az REGION=$(LOCATION)
Contributor

don't need these to bring the cluster down

Contributor Author

Will the context be retained even if bringing down the cluster is part of a different stage?

Contributor

you might need the azcfg target, but you definitely don't need the kubeconfig

Contributor Author

I tried removing the set-kubeconf step, and it failed with:
error: cannot delete cluster cilium-vipul-test, not in /home/vsts/.kube/config
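
The error text looks like the message `kubectl config delete-cluster` prints when the named cluster has no entry in the kubeconfig, which would explain why the teardown still needs the kubeconfig. That is an inference; the teardown target itself is not shown in this thread. A rough sketch of a guard that skips the delete when the entry is missing, with hypothetical function names:

```shell
#!/usr/bin/env bash
set -eu

# Reads `kubectl config get-clusters` output on stdin and
# succeeds if the cluster name in $1 appears as a whole line.
cluster_in_kubeconfig() {
  grep -qxF -- "$1"
}

# Remove the kubeconfig entry for cluster $1 only if it exists.
remove_cluster_entry() {
  if kubectl config get-clusters | cluster_in_kubeconfig "$1"; then
    kubectl config delete-cluster "$1"
  else
    echo "cluster $1 not in kubeconfig, nothing to remove"
  fi
}
```

With a guard like this, a teardown stage that starts from a fresh agent would not fail just because the kubeconfig entry was never created there.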

Comment on lines 66 to 72
echo "install cilium CLI"
CILIUM_CLI_VERSION=$(curl -s https://raw.githubusercontent.com/cilium/cilium-cli/master/stable.txt)
CLI_ARCH=amd64
if [ "$(uname -m)" = "aarch64" ]; then CLI_ARCH=arm64; fi
curl -L --fail --remote-name-all https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
sha256sum --check cilium-linux-${CLI_ARCH}.tar.gz.sha256sum
sudo tar xzvfC cilium-linux-${CLI_ARCH}.tar.gz /usr/local/bin
rm cilium-linux-${CLI_ARCH}.tar.gz{,.sha256sum}
Contributor

their script needs to be flexible, but we know what arch and version we want - fix all of these variables so that we get the same thing every time

Contributor Author

You mean hardcode the CLI version and the arch?

Contributor

yes, we should pin those so that the CI is consistent
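
A sketch of what the pinned install could look like, reusing the download and verify steps from the snippet under review. The version string and arch below are example values chosen for illustration, not the ones this PR settled on.

```shell
#!/usr/bin/env bash
set -eu

# Pinned values (assumptions for illustration; substitute the release
# and architecture the CI agents should use).
CILIUM_CLI_VERSION="v0.13.2"
CLI_ARCH="amd64"

# Build the release tarball URL from the pinned values.
cilium_cli_url() {
  echo "https://github.com/cilium/cilium-cli/releases/download/${CILIUM_CLI_VERSION}/cilium-linux-${CLI_ARCH}.tar.gz"
}

# Download, verify the checksum, and install the pinned release.
install_cilium_cli() {
  local url tarball
  url="$(cilium_cli_url)"
  tarball="cilium-linux-${CLI_ARCH}.tar.gz"
  curl -L --fail --remote-name-all "${url}" "${url}.sha256sum"
  sha256sum --check "${tarball}.sha256sum"
  sudo tar xzvfC "${tarball}" /usr/local/bin
  rm "${tarball}" "${tarball}.sha256sum"
}

echo "pinned cilium CLI source: $(cilium_cli_url)"
```

Calling `install_cilium_cli` performs the install. Dropping the `stable.txt` lookup and the `uname -m` branch means every CI run fetches the same artifact.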

@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch 3 times, most recently from 6f5eceb to a9809a4 Compare April 10, 2023 17:47
@rbtr
Contributor

rbtr commented Apr 10, 2023

needs rebase

@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch from a9809a4 to c545c83 Compare April 10, 2023 21:08
containers:
- name: privileged-container
  image: mcr.microsoft.com/dotnet/runtime-deps:6.0
  command: ["/bin/sleep", "3650d"]
Member

can we use mcr.microsoft.com/oss/kubernetes/pause:3.6 here too?

Contributor Author

Will need to check, let me try to use the image and see if that works.

Contributor Author

I tried the image, but it fails because bash is not in PATH:
"bash": executable file not found in $PATH:

echo "trying to get the cilium_endpoints"
kubectl exec -i "$cilium_agent" -n kube-system -- bash -c "cilium endpoint list -o json" > cilium_endpoints.json
sleep 10
done
Member

what timeout is set on the pipeline? If there's a persistent error, this will loop indefinitely, but I'd expect GH actions eventually kills the process?

Contributor Author

The default timeout is 60 mins.
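
One way to avoid depending on the pipeline timeout is to bound the loop itself. The sketch below is a generic retry helper, not code from this PR; `retry` and `fetch_cilium_endpoints` are hypothetical names, and `cilium_agent` is assumed to be set by the surrounding script as in the snippet above.

```shell
#!/usr/bin/env bash
set -eu

# Run a command until it succeeds, giving up after a fixed number of tries.
# Usage: retry MAX_ATTEMPTS DELAY_SECONDS COMMAND [ARGS...]
retry() {
  local max_attempts="$1" delay="$2" attempt=1
  shift 2
  until "$@"; do
    if [ "$attempt" -ge "$max_attempts" ]; then
      echo "giving up after ${attempt} attempts: $*" >&2
      return 1
    fi
    attempt=$((attempt + 1))
    sleep "$delay"
  done
}

# The fetch step from the snippet under review, wrapped so it can be retried.
fetch_cilium_endpoints() {
  kubectl exec -i "$cilium_agent" -n kube-system -- bash -c "cilium endpoint list -o json" > cilium_endpoints.json
}
```

A call such as `retry 30 10 fetch_cilium_endpoints` would stop after 30 attempts spaced 10 seconds apart, so a persistent error fails the step in minutes instead of running until the job is killed.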

wedaly previously approved these changes Apr 10, 2023
@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch from c545c83 to 03608f5 Compare April 10, 2023 22:38
@vipul-21 vipul-21 disabled auto-merge April 10, 2023 22:42
@vipul-21 vipul-21 removed the request for review from tamilmani1989 April 10, 2023 22:43
@vipul-21 vipul-21 enabled auto-merge (squash) April 10, 2023 23:19
@vipul-21 vipul-21 force-pushed the singhvipul/loadtesting branch from 03608f5 to 8c61539 Compare April 11, 2023 16:38
@vipul-21 vipul-21 merged commit 44fb03e into master Apr 11, 2023
@vipul-21 vipul-21 deleted the singhvipul/loadtesting branch April 11, 2023 18:39
rbtr pushed a commit that referenced this pull request Sep 8, 2023
ci:[CNI] Load testing for cilium cni
jpayne3506 pushed a commit that referenced this pull request Sep 11, 2023
ci:[CNI] Load testing for cilium cni
Labels
ci Infra or tooling. cni Related to CNI.
4 participants