feat: add ignoreResourceUpdates to reduce controller CPU usage (#13534) #13912
crenshaw-dev merged 38 commits into argoproj:master
Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
Codecov Report
Patch coverage:

Coverage diff, master vs. #13912:

| | master | #13912 | +/- |
|---|---|---|---|
| Coverage | 49.61% | 49.61% | -0.01% |
| Files | 256 | 257 | +1 |
| Lines | 43829 | 44146 | +317 |
| Hits | 21744 | 21901 | +157 |
| Misses | 19948 | 20091 | +143 |
| Partials | 2137 | 2154 | +17 |
|
This PR looks fantastic. It looks like it could fix the application controller's high CPU usage issues.
|
Hi @agaudreault-jive |
|
@jaideepr97 If your question is why there are two different settings and why `ignoreDifferences` could not be reused to skip the reconcile as well: in our case, `ignoreDifferences` has more configuration than is necessary for the reconcile optimization. It is also hard, if not impossible, to know what everyone has configured. Having two separate configurations prevents conflicts. However,
…/argo-cd into reduce-object-reconcile
|
Agreed, this PR is amazing. We are seeing 100% CPU usage endlessly, with the application-controller monitoring our 2k pods. This seems to be due to metric changes on the HPAs.
|
@donkeyx are you in a position to run this branch internally and monitor the effects? I'd be happy to help cherry pick these changes to whatever version you're running now. |
@crenshaw-dev we are currently running |
|
@donkeyx for HPA, we had the same issue and I used the following configs to make sure it works with v1 too. You might still see a couple of reconciles due to the ReplicaSet/Pods/Deployments updates, but no more from the HPA. |
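To illustrate why ignoring fields on the HPA stops those reconciles, here is a minimal sketch of the idea in Python. It is not Argo CD's implementation (which is written in Go and may differ in detail): the function names `remove_pointer`, `resource_hash`, and `should_refresh` are hypothetical, and the `/status` pointer is an assumed example rule, not the configuration referred to in the comment above. The sketch drops the ignored fields, hashes what remains, and refreshes only when the hash changes.

```python
import copy
import hashlib
import json


def remove_pointer(obj, pointer):
    """Delete the value addressed by a JSON pointer (RFC 6901), if present.

    Simplification for this sketch: only nested dicts are traversed,
    list indices are not supported.
    """
    # Split "/a/b" into tokens and undo the "~1" (slash) and "~0" (tilde) escapes.
    tokens = [t.replace("~1", "/").replace("~0", "~") for t in pointer.split("/")[1:]]
    if not tokens:
        return  # an empty pointer addresses the whole document: leave it alone
    node = obj
    for token in tokens[:-1]:
        if not isinstance(node, dict) or token not in node:
            return  # path is absent, so there is nothing to ignore
        node = node[token]
    if isinstance(node, dict):
        node.pop(tokens[-1], None)


def resource_hash(resource, ignored_pointers):
    """Hash a resource manifest after dropping the ignored fields."""
    trimmed = copy.deepcopy(resource)  # never mutate the caller's object
    for pointer in ignored_pointers:
        remove_pointer(trimmed, pointer)
    # Sorted keys give a stable serialization, so equal content hashes equally.
    canonical = json.dumps(trimmed, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def should_refresh(old, new, ignored_pointers):
    """Return True when the update changes anything outside the ignored fields."""
    return resource_hash(old, ignored_pointers) != resource_hash(new, ignored_pointers)


# Hypothetical HPA objects that differ only in their status.
hpa_before = {
    "kind": "HorizontalPodAutoscaler",
    "spec": {"maxReplicas": 10},
    "status": {"currentReplicas": 3},
}
hpa_after = {
    "kind": "HorizontalPodAutoscaler",
    "spec": {"maxReplicas": 10},
    "status": {"currentReplicas": 4},
}

print(should_refresh(hpa_before, hpa_after, ["/status"]))  # False: only ignored fields changed
print(should_refresh(hpa_before, hpa_after, []))           # True: nothing is ignored
```

A change to `spec` would still produce a different hash, so only the noisy fields are filtered out.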
|
One "glitch" that I am seeing occurs when a ReplicaSet is scaled down (by the HPA). When a pod is set to terminating, its health turns to "Progressing", and the Application health also changes to "Progressing". However, when the pods disappear from the UI, the Application status is not updated and stays "Progressing". I expect the The app status is the following, but no resources in
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com> Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
Co-authored-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
|
@agaudreault-jive thanks to you and your company for this significant contribution to improving performance |
|
I added an additional section to my config: Then I restarted image: quay.io/argoproj/argocd:v2.8.0-rc1
|
Actually, after setting On the other hand, when |
|
@everythings-gonna-be-alright #14304 until this is merged and cherry-picked, you can use "debug" logs while |
…oproj#13534) (argoproj#13912)

* feat: ignore watched resource update
* add doc and CLI
* update doc index
* add command
* codegen
* revert formatting
* do not skip on health change
* update doc
* update logging to use context
* fix typos. local build broken...
* change after review
* manifestHash to string
* more doc
* example for argoproj Application
* add unit test for ignored logs
* codegen
* Update docs/operator-manual/reconcile.md
* move hash and set log to debug
* Update util/settings/settings.go
* Update util/settings/settings.go
* feature flag
* fix
* less aggressive managedFields ignore rule
* Update docs/operator-manual/reconcile.md
* use local settings
* latest settings
* safety first
* since it's behind a feature flag, go aggressive on overrides

Signed-off-by: Alexandre Gaudreault <alexandre.gaudreault@logmein.com>
Signed-off-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
Co-authored-by: Michael Crenshaw <350466+crenshaw-dev@users.noreply.github.com>
|
Just to be sure, argocd is taking |
|
@ebuildy correct. You can find a few examples at https://argo-cd.readthedocs.io/en/stable/operator-manual/reconcile
Closes #13534 #6108 #13614 #8471 #8100 #7406 #9014 #9819
Changes:
- `ignoreResourceUpdates` global configuration to ignore fields before hashing resources.
- `ignoreDifferencesOnResourceUpdates` config to automatically apply `ignoreDifferences` to `ignoreResourceUpdates`.
- `Refreshing app %s for change in cluster of object %s of type %s/%s` log changed from debug to info to help gather statistics and configure `ignoreResourceUpdates`. The query `msg="Refreshing app*for change*" | rex field=msg "Refreshing app (?<application>\S+) for change in cluster of object (?<resource>\S+) of type (?<type>\S+)" | stats count by application resource type | sort -count` can be used.

This was a result of adding the following config
During business hours, after optimization.
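For anyone without a log search tool at hand, the statistics query from the change list can be approximated offline. The following is a rough Python sketch, not part of the PR: `count_refreshes` is a hypothetical helper, the sample lines are invented for illustration, and it assumes each input line already contains the bare log message. The named groups use Python's `(?P<name>...)` syntax in place of the `(?<name>...)` form used in the query.

```python
import re
from collections import Counter

# Same pattern as the query, adapted to Python's named-group syntax.
REFRESH_RE = re.compile(
    r"Refreshing app (?P<application>\S+) for change in cluster of object "
    r"(?P<resource>\S+) of type (?P<type>\S+)"
)


def count_refreshes(lines):
    """Count refresh log messages per (application, resource, type), most frequent first."""
    counts = Counter()
    for line in lines:
        match = REFRESH_RE.search(line)
        if match:  # lines that are not refresh messages are skipped
            key = (match["application"], match["resource"], match["type"])
            counts[key] += 1
    return counts.most_common()


# Invented sample messages, for illustration only.
sample = [
    "Refreshing app guestbook for change in cluster of object web of type autoscaling/v2/HorizontalPodAutoscaler",
    "Refreshing app guestbook for change in cluster of object web of type autoscaling/v2/HorizontalPodAutoscaler",
    "Refreshing app guestbook for change in cluster of object web-7d9f of type apps/v1/ReplicaSet",
    "Some unrelated log line",
]

for (application, resource, kind), count in count_refreshes(sample):
    print(count, application, resource, kind)
```

The resource types at the top of the output are the natural candidates for an `ignoreResourceUpdates` rule.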

