A bad jsonpatch can knock out the CAPI controller manager #8059
Comments
This issue is currently awaiting triage. If CAPI contributors determine this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
Thx for the report. I'll take a look.
@DanielXiao I was unable to reproduce the issue. I added the variable and the patch, but it didn't panic. Note: I also had to add the following to the KubeadmControlPlaneTemplate:

```yaml
spec:
  template:
    spec:
      kubeadmConfigSpec:
        preKubeadmCommands: ["abc"]
```

Before that I just got an error that it's unable to find the key it's supposed to replace. What does your KubeadmControlPlaneTemplate look like at the moment the patch is applied? (To simplify this, you could move the patch to the beginning of the patches array.)
Okay, the panic occurs because, after one patch sets the array to nil, another patch tries to insert an element at the beginning of the (nil) array. I've opened an issue in the json-patch repo to fix this specific panic: evanphx/json-patch#171
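For illustration only: the exact operations from the original ClusterClass are not shown in this thread, so the paths and values below are assumptions, but a two-operation RFC 6902 sequence of this shape would hit that case:

```yaml
# Hypothetical RFC 6902 patch sequence illustrating the panic described above:
# the first operation leaves preKubeadmCommands as null, the second then tries
# to insert an element at index 0 of that null array.
- op: replace
  path: /spec/template/spec/kubeadmConfigSpec/preKubeadmCommands
  value: null
- op: add
  path: /spec/template/spec/kubeadmConfigSpec/preKubeadmCommands/0
  value: "echo hello"
```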
Opened a PR to catch panics while applying patches: #8067
What steps did you take and what happened:
1. Add a list variable to the ClusterClass.
2. Add an inline patch that consumes the variable (a sketch of both is given after these steps).
3. Omit controlPlanePreKubeadmCommands from the Cluster's variables and create the Cluster. You will see the CAPI controller manager Pod go down.
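A minimal sketch of what steps 1 and 2 could look like; the variable name controlPlanePreKubeadmCommands is taken from step 3, while the schema, selector, and patch path are assumptions and may differ from the original report:

```yaml
apiVersion: cluster.x-k8s.io/v1beta1
kind: ClusterClass
metadata:
  name: example-clusterclass   # hypothetical name
spec:
  # Step 1: an optional list variable
  variables:
    - name: controlPlanePreKubeadmCommands
      required: false
      schema:
        openAPIV3Schema:
          type: array
          items:
            type: string
  # Step 2: an inline patch that writes the variable into the
  # KubeadmControlPlaneTemplate's preKubeadmCommands array
  patches:
    - name: controlPlanePreKubeadmCommands
      definitions:
        - selector:
            apiVersion: controlplane.cluster.x-k8s.io/v1beta1
            kind: KubeadmControlPlaneTemplate
            matchResources:
              controlPlane: true
          jsonPatches:
            - op: replace
              path: /spec/template/spec/kubeadmConfigSpec/preKubeadmCommands
              valueFrom:
                variable: controlPlanePreKubeadmCommands
```

Whether this alone triggers the panic depends on the rest of the patches; per the comments above, the crash needs a later operation that inserts into the array after it has been set to nil.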
The controller manager crashes with a panic. You can delete the CAPI controller manager Pod to recreate it, but the new Pod always hits the same error.
You cannot delete the problematic Cluster object either, since the validating webhook is served by the crashed Pod.
There is no way to recover the CAPI controller manager unless you delete CAPI's ValidatingWebhookConfiguration first.
What did you expect to happen:
The CAPI controller manager Pod should tolerate errors from computing the topology, or there should be guidance for recovery.
Anything else you would like to add:
Environment:
- Kubernetes version (use kubectl version):
- OS (e.g. from /etc/os-release):

/kind bug