fix(appset): Fix infinite loop caused by not ignoring statuses updates#19676
fix(appset): Fix infinite loop caused by not ignoring statuses updates#19676joliveirinha wants to merge 1 commit intoargoproj:masterfrom
Conversation
❌ Preview Environment deleted from BunnyshellAvailable commands (reply to this comment):
|
❌ Preview Environment deleted from BunnyshellAvailable commands (reply to this comment):
|
This commit fixes a bug where the appset controller updating the status of the resource itself, was causing it be notified for a new reconcile loop. This happened, because, altough the individual resource (app) statuses were not changing, the order of the list was, as such, resulting in an updated appset resource. The fix, adds a GenerationPredicate to the Watch on the appSet to guarantee that if no actual change to the AppSet change was done, then it can be ignored. Signed-off-by: João Oliveirinha <joliveirinha@gmail.com>
a1c0196 to
70c5e4b
Compare
Codecov ReportAll modified and coverable lines are covered by tests ✅
Additional details and impacted files@@ Coverage Diff @@
## master #19676 +/- ##
=========================================
Coverage ? 55.91%
=========================================
Files ? 316
Lines ? 43794
Branches ? 0
=========================================
Hits ? 24487
Misses ? 16762
Partials ? 2545 ☔ View full report in Codecov by Sentry. |
|
Thank you for the detailed description and the PR, @joliveirinha. |
gdsoumya
left a comment
There was a problem hiding this comment.
change looks good, a test would ensure we don't regress
todaywasawesome
left a comment
There was a problem hiding this comment.
Add tests and it looks 🔥
|
Sorry for the late response. Only had time to look at this now. It doesn't seem straight forward to implement a regression test for this. It seems the only possibility is to use envtest instead. I maybe missing something since this is my first time looking at this. |
|
Another option is to ensure order of the |
Hmm I have mixed feelings about that.
For 2) the current fix here proposed, gives all the flexibility for the implementation to decide what is the best requeue time. If we go via the direction you are suggesting, then a status update that changes the actual status content, will always trigger right away the notification from the runtime, which means you have less flexibility. You may argue, that it will only happen once because in the next iteration the status is probably the same, still.... To summarize, what I suggest, since the fix is small and relatively obvious, is for us to merge this PR and eventually when the project has integrations tests with the envtest, then it can cover this predicate and the rest that already exist to validate the runtime scenarios. Also, because this PR most probably also fix other related issues related to the progressive sync. |
|
But if you are using SCM Provider and someone creates a new repo, then it should reconcile if a new application is added? |
Yes. The fix doesn't change that. normal requeueing based on the configured requeue time will still happen. |
agaudreault
left a comment
There was a problem hiding this comment.
There are many cases where the behavior of the appset controller may depend on status updates adding back the object to the queue. One that comes to my mind is the progressive delivery. There will be less update events, so less calls to shouldRequeueApplicationSet. What are the implication of that? hard to know.
These kinds of scenario are usually coverer by the E2E test suite.
alexmt
left a comment
There was a problem hiding this comment.
I agree with @agaudreault . Argo CD application controller logic relies on reconciliation after changing the status. It is not ideal, but I'm afraid the AppicationSet controller is the same. We might open a can of worms by disabling reconciliation on status change.
I would implement this suggestion instead.
This commit fixes a bug where the appset controller updating the status of the resource itself, was causing it be notified for a new reconcile loop.
This happened, because, altough the individual resource (app) statuses were not changing, the order of the list was, as such, resulting in an updated appset resource.
The fix, adds a GenerationPredicate to the Watch on the appSet to guarantee that if no actual change to the AppSet change was done, then it can be ignored.
Fixes [#19675]
Checklist: