controller: Add a 5s delay before rendering MCs #303
Conversation
To reduce churn if MCs are being created rapidly - both on general principle, and also to reduce our exposure to the current bug that a booting node may fail to find a GC'd MachineConfig: openshift#301
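As a rough illustration of the mechanism (a minimal sketch, not the actual MCO code: the `controller` struct, `enqueueAfter` helper, and `renderDelay` constant are assumed names), delayed enqueueing via the workqueue coalesces a burst of MachineConfig events for the same key into roughly one render pass:

```go
package main

import (
	"fmt"
	"time"

	"k8s.io/client-go/util/workqueue"
)

// renderDelay is the 5s delay this PR introduces (assumed constant name).
const renderDelay = 5 * time.Second

type controller struct {
	queue workqueue.RateLimitingInterface
}

// enqueueAfter (hypothetical helper) delays processing so that rapid
// MachineConfig create events coalesce: AddAfter on an item that is
// already waiting does not enqueue a second copy.
func (c *controller) enqueueAfter(key string) {
	c.queue.AddAfter(key, renderDelay)
}

func main() {
	c := &controller{
		queue: workqueue.NewNamedRateLimitingQueue(
			workqueue.DefaultControllerRateLimiter(), "render"),
	}
	defer c.queue.ShutDown()

	// Three rapid events for the same pool key...
	for i := 0; i < 3; i++ {
		c.enqueueAfter("master")
	}

	// ...surface as a single work item roughly 5s later.
	key, _ := c.queue.Get()
	fmt.Println("rendering for", key) // rendering for master
	c.queue.Done(key)
}
```

The coalescing works because an item already sitting in the queue (or still waiting out its delay) is not added a second time.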
TF failures /retest
Hmm, if I'm reading this right, even if we delay three MC creation events by 5s, we're still regenerating three times, right? Though I guess they'll hash to the same name now at least, so we won't get a generated MC quickly appearing and then getting deleted. LGTM, though I'll defer to folks more familiar with the workqueue API. /approve
See https://godoc.org/k8s.io/client-go/util/workqueue
TF failures again... looks like something's up. Let's retest shortly.
OK, after reading up some more on the workqueue API, I'm more confident this works now. I've also just tested it! /lgtm

Re: Right, I see that at https://github.com/kubernetes/client-go/blob/b831b8de7155117e51afaffeb647007a756ddc92/util/workqueue/queue.go#L114. But this happens at [...] Anyway, this still mitigates the issue fine, since the MCs we're concerned about definitely land within a 5s window. So even on the earliest event fired, we've already got all the MCs. Though of course, closing the race window completely will still require some work.
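A minimal runnable demonstration of the deduplication behavior discussed above (assuming only the documented workqueue semantics): `Add` on an item already in the queue's dirty set is a no-op, so all events landing inside the delay window produce a single sync.

```go
package main

import (
	"fmt"

	"k8s.io/client-go/util/workqueue"
)

func main() {
	// Plain FIFO workqueue; Add on an already-queued item is dropped
	// by the dirty-set check, so bursts collapse to one work item.
	q := workqueue.New()
	defer q.ShutDown()

	for i := 0; i < 3; i++ {
		q.Add("master") // second and third Add are deduplicated
	}
	fmt.Println(q.Len()) // prints 1
}
```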
[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: cgwalters, jlebon

The full list of commands accepted by this bot can be found here. The pull request process is described here.

Details: Needs approval from an approver in each of these files.

Approvers can indicate their approval by writing `/approve` in a comment.
/test e2e-aws |
/retest
/test e2e-aws |
This is like openshift#303 but for the node controller. We really don't need to react *instantly* to start updating and rebooting machines, and having a small delay will help avoid races when MCs are created rapidly.