This repository has been archived by the owner on Nov 1, 2022. It is now read-only.
Helm operator on AKS randomly deletes releases #1447
When Tiller runs in AKS (Azure's hosted Kubernetes service), it sometimes gets into an inconsistent state, due to a combination of a known AKS networking issue and client-go not handling intermittent network failures gracefully. This causes the Helm operator to occasionally reinstall a release (instead of upgrading it as it should), which then fails because the release already exists, and the Helm operator purges the "failed" release.

A workaround for this (until the AKS team can apply a fix globally) is to set the environment variables mentioned in Azure/AKS#676 on the Helm operator pod.

Comments
#1446 adds:

    helmOperator:
      extraEnvs:
        - name: KUBERNETES_PORT_443_TCP_ADDR
          value: <your-fqdn-prefix>.hcp.<region>.azmk8s.io
        - name: KUBERNETES_PORT
          value: tcp://<your-fqdn-prefix>.hcp.<region>.azmk8s.io:443
        - name: KUBERNETES_PORT_443_TCP
          value: tcp://<your-fqdn-prefix>.hcp.<region>.azmk8s.io:443
        - name: KUBERNETES_SERVICE_HOST
          value: <your-fqdn-prefix>.hcp.<region>.azmk8s.io
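To check that the override actually took effect, here is a small, purely illustrative Go snippet (not part of flux) that prints the API server host client-go's in-cluster config resolves to; it only assumes rest.InClusterConfig from k8s.io/client-go and must be run inside a pod with a service account mounted:

    package main

    import (
        "fmt"

        "k8s.io/client-go/rest"
    )

    func main() {
        // InClusterConfig derives the API server address from the
        // KUBERNETES_SERVICE_HOST / KUBERNETES_SERVICE_PORT environment
        // variables, which is why overriding them on the pod changes the
        // endpoint the Helm operator talks to.
        cfg, err := rest.InClusterConfig()
        if err != nil {
            panic(err) // e.g. when run outside a pod
        }
        // With the workaround applied this should print the cluster FQDN,
        // e.g. https://<your-fqdn-prefix>.hcp.<region>.azmk8s.io:443,
        // rather than the in-cluster service IP.
        fmt.Println(cfg.Host)
    }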
Are there any fixes we can apply within helm-op itself?

Resolved in #1530.
hiddeco pushed a commit to hiddeco/flux that referenced this issue on Nov 20, 2018:

- purge a Helm release only if there is a single revision and that one failed
- prevent Helm release deletion if Kubernetes API connectivity is flaky
- fix fluxcd#1524 fluxcd#1447

(cherry picked from commit 9571c6a)
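To make the first point concrete, here is a minimal Go sketch of that guard; the type and function names are hypothetical stand-ins rather than the actual #1530 code, but the rule is the same: only a release whose entire history is a single failed revision may be purged, so a release that has ever deployed successfully is never deleted because of a transient API error.

    package main

    import "fmt"

    // ReleaseStatus and ReleaseRevision are illustrative stand-ins for the
    // Helm client's release types, not the real API.
    type ReleaseStatus string

    const (
        StatusDeployed ReleaseStatus = "DEPLOYED"
        StatusFailed   ReleaseStatus = "FAILED"
    )

    type ReleaseRevision struct {
        Version int
        Status  ReleaseStatus
    }

    // shouldPurge applies the guard from the commit message: purge only if
    // the release has exactly one revision and that revision failed. Any
    // release with a successful revision in its history is left untouched,
    // so flaky connectivity can no longer trigger deletion of a healthy
    // release.
    func shouldPurge(history []ReleaseRevision) bool {
        if len(history) != 1 {
            return false
        }
        return history[0].Status == StatusFailed
    }

    func main() {
        failedInstall := []ReleaseRevision{{Version: 1, Status: StatusFailed}}
        healthyRelease := []ReleaseRevision{
            {Version: 1, Status: StatusDeployed},
            {Version: 2, Status: StatusFailed},
        }
        fmt.Println(shouldPurge(failedInstall))  // true: the only revision failed
        fmt.Println(shouldPurge(healthyRelease)) // false: a deployed revision exists
    }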