Argo CD synchronization lasts incredibly long #3663
Comments
I'm willing to guess that Argo CD doesn't know how to check the "status" of CRDs, but I'm not sure exactly. This is a common problem with Spinnaker too.
I'm having the same issue with
Same issue when I'm trying to install Istio: Argo CD gets stuck on the CRDs. k8s version 1.16.8
Short version
I recommend closing this issue and (if one does not already exist) opening a new enhancement to examine how to better handle case A) below, where Argo CD crashes, is restarted, or is stopped while a sync operation is in progress. (Another option is to repurpose this issue to handle A), but IMHO a clean slate is better.)
Long version
There are a few different behaviours being described here, which I'll address one at a time:
A) Synchronization takes a 'really long time'
It is currently possible for an Argo CD application's sync operation state to appear to get "stuck" in a running state, which can make it look like it is taking 'a really long time' when in fact no sync operation is taking place. When this happens, Argo CD thinks an operation is in progress (for example, reporting in the web UI that an operation is ongoing) when it is not.
This has the potential to occur any time the Argo CD controller process is prematurely stopped (for example, due to an Argo CD controller crash). (I personally see this 'stuck operation' during Argo CD development, where, while debugging, I kill and restart the Argo CD controller container in the middle of a long-running sync operation.)
This behaviour is due to how an operation's state is stored by Argo CD. It stores it in the Argo CD Application CRD in k8s (backed by etcd):
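As a rough illustration of why such a state is hard to tell apart from a live operation, here is a minimal Python sketch that flags an Application whose stored operation still reports a running phase long after it started. It assumes the Application resource has already been fetched as a plain dictionary and that the operation state lives under `status.operationState` with `phase` and `startedAt` fields. The helper names and the one-hour threshold are hypothetical, and this is a heuristic only: an old start time cannot prove that no sync is actually running.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Mapping


def parse_k8s_time(ts: str) -> datetime:
    """Parse a Kubernetes-style UTC timestamp such as '2020-05-26T10:15:00Z'."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def looks_stuck(
    app: Mapping[str, Any],
    now: datetime,
    max_age: timedelta = timedelta(hours=1),  # hypothetical threshold
) -> bool:
    """Return True if the stored operation state says 'Running' for longer than max_age."""
    state = (app.get("status") or {}).get("operationState") or {}
    if state.get("phase") != "Running":
        return False  # finished, failed, or no operation recorded
    started = state.get("startedAt")
    if not started:
        return False  # nothing to compare against
    return now - parse_k8s_time(started) > max_age


# Example: an operation recorded as running for 17 hours.
app = {"status": {"operationState": {"phase": "Running",
                                     "startedAt": "2020-05-26T00:00:00Z"}}}
now = datetime(2020, 5, 26, 17, 0, 0, tzinfo=timezone.utc)
stuck = looks_stuck(app, now)
```

A check like this only surfaces candidates for manual inspection; whether the operation is really dead still has to be confirmed against the controller.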
The Argo CD controller keeps track of which operation is running, and updates the 'operationState' field as that operation progresses. However, if the Argo CD controller process is restarted, it does not appear to have a way to detect that 'operationState' is no longer valid, and thus the
You can terminate an operation in this state from within the UI, by clicking on
In practice, this shouldn't happen except in rare cases where the controller dies unexpectedly during a sync (and since there are no log files attached to this issue, I'm not sure we can investigate the specific trigger here).
B) 'CustomResourceDefinitions are not ready' for prometheus-operator chart
It turns out this is not an Argo CD issue, but rather due to the behaviour of the prometheus-operator itself. When you look at the difference between what Argo CD expects (desired state) and what it finds (live state), you will see the only difference is this:
Argo CD expects to find the above field in annotations, but the CRD itself does not contain it. Why is that? Well, Argo CD is applying the correct version of the manifest CRD containing this field:
So who is overwriting the "good" desired version of the CRD with the "bad" live version of the CRD? The prometheus-operator deployment itself! Before version v0.39.0 of the prometheus-operator, the operator Helm chart started the operator with the following parameter:
This parameter,
To confirm if this is the issue you are seeing, you can
(notice the 'CRD Updated' message, as it updates the CRDs one by one)
Fortunately, it appears that this parameter is no longer in use in newer versions of the prometheus-operator, so you may have better luck with those versions. In any case, this is not an Argo CD issue, and mechanisms exist in Argo CD to ignore differences like this.
C) Log message:
@AlehB can you confirm that this issue still exists on 1.8?
For large applications, the v1.9 feature to only apply objects which are out of sync will help here.
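To illustrate the idea of touching only the objects that differ, here is a small Python sketch that picks the out-of-sync entries from an Application's reported resource list. It assumes the Application has been fetched as a plain dictionary and that each entry under `status.resources` carries `kind`, `namespace`, `name`, and a `status` of `Synced` or `OutOfSync`. The helper name is hypothetical, and this shows only the selection logic, not how Argo CD itself implements the feature.

```python
from typing import Any, List, Mapping, Tuple


def out_of_sync_resources(app: Mapping[str, Any]) -> List[Tuple[str, str, str]]:
    """Return (kind, namespace, name) for every resource reported as OutOfSync."""
    resources = (app.get("status") or {}).get("resources") or []
    return [
        (r.get("kind", ""), r.get("namespace", ""), r.get("name", ""))
        for r in resources
        if r.get("status") == "OutOfSync"
    ]


# Example: two resources, only one of which differs from the desired state.
app = {"status": {"resources": [
    {"kind": "Deployment", "namespace": "monitoring", "name": "operator", "status": "Synced"},
    {"kind": "CustomResourceDefinition", "name": "prometheuses.monitoring.coreos.com",
     "status": "OutOfSync"},
]}}
pending = out_of_sync_resources(app)
```

For a chart with hundreds of objects, restricting the apply step to a list like `pending` avoids re-applying everything that is already in the desired state.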
This should have been resolved with the introduction of Server-Side Apply. Feel free to re-open if that is not the case.
Describe the bug
Hello team,
We are trying to install the prometheus-operator Helm chart (https://github.com/helm/charts/tree/master/stable/prometheus-operator) in our Kubernetes cluster with Argo CD.
We encountered two problems:
After the chart was added in the Argo CD dashboard and a manual sync was started, it took about an hour for Argo CD to even begin syncing (that is, to begin creating Kubernetes resources).
Synchronization lasts incredibly long. It has already lasted about 17 hours and the application is still not fully launched (no events are available in the Argo CD dashboard, and there are no errors).
We've tried it several times with different Helm chart versions.
For other Helm charts (very small ones), our Argo CD installation works fine.
Is there any option to speed up the start of the application? As it stands, Argo CD looks like an unsuitable option for Helm charts like this one.
To Reproduce
Expected behavior
Argo CD begins to create kubernetes resources immediately
It takes a reasonable amount of time to get everything ready
Screenshots
Sync status at the moment
As an example, CustomResourceDefinitions are not ready
Version