-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make sure gcp-auth addon can be enabled on startup #9318
Conversation
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: sharifelgamal The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
Travis tests have failedHey @sharifelgamal, 1st Buildmake test
TravisBuddy Request Identifier: c0eef750-fe95-11ea-803e-573615702161 |
pkg/addons/addons.go
Outdated
for _, a := range toEnableList { | ||
// Run the gcp-auth addon last to make sure everything else is up |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is still a race condition, as addon enablement does not block until the pods are available.
I think you can remove this special case by naming a failurePolicy
for the webhook instead, so that it fails open if unavailable:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For an example test case: /out/minikube start --addons=dashboard --addons=efk --addons=freshpod --addons=istio --addons=istio-provisioner --addons=olm --memory=16384 --driver=hyperkit
That said, it is a race condition, and may not be easy to trigger. Better to just fail open.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think adding the failurePolicy is a good idea just in general, but this code isn't supposed to eliminate the race, just reduce it as much as possible. In my testing it seems to be very effectively at precisely that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that failurePolicy is in place, can this ordering code be safely removed?
The nice thing about failurePolicy is that it also eliminates the race condition when you use minikube addons enable
while another pod is scheduling.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so just in my empirical testing, the race condition still happens fairly often without this code, which makes me think I don't quite understand what's happening here.
- key: name | ||
operator: NotIn | ||
values: | ||
- kube-system |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1 to filtering kube-system out.
Travis tests have failedHey @sharifelgamal, 1st Buildmake test
TravisBuddy Request Identifier: 8eaad860-ff4c-11ea-b444-3dcc0bd866ab |
/ok-to-test |
kvm2 Driver Times for Minikube (PR 9318): 60.1s 59.7s 60.3s Averages Time Per Log
docker Driver Times for Minikube (PR 9318): 30.2s 28.4s 29.0s Averages Time Per Log
|
kvm2 Driver Times for Minikube (PR 9318): 59.3s 59.0s 59.0s Averages Time Per Log
docker Driver Times for Minikube (PR 9318): 28.5s 28.2s 28.3s Averages Time Per Log
|
/ok-to-test |
kvm2 Driver Times for Minikube (PR 9318): 59.4s 61.5s 60.2s Averages Time Per Log
docker Driver Times for Minikube (PR 9318): 29.7s 28.4s 31.6s Averages Time Per Log
|
Travis tests have failedHey @sharifelgamal, 1st Buildmake test
TravisBuddy Request Identifier: ee3cfa40-01c9-11eb-9be6-0d9634a5fbe7 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
optoinally please add docs to addons contributors, if they want their addon to be defered...to be added to this lit
kvm2 Driver Times for Minikube (PR 9318): 89.4s 243.2s 242.6s Averages Time Per Log
docker Driver Times for Minikube (PR 9318): 211.4s 210.5s 210.3s Averages Time Per Log
|
kvm2 Driver Times for Minikube (PR 9318): 240.8s 241.4s 240.2s Averages Time Per Log
docker Driver Times for Minikube (PR 9318): 211.4s 210.8s 210.6s Averages Time Per Log
|
kvm2 Driver Times for Minikube (PR 9318): 61.0s 61.8s 61.8s Averages Time Per Log
docker Driver Times for Minikube (PR 9318): 29.6s 28.7s 29.8s Averages Time Per Log
|
This PR makes sure the gcp-auth addons can be enabled on startup safely by making it the last addon enabled. It also fixes a documentation issue in an error message and makes sure we never apply the webhook to the internal kube-system pods.
Fixes #9208
Fixes #9215