feat(controller): Permit using newer revision when retrying failed sync (#11494)#23038
feat(controller): Permit using newer revision when retrying failed sync (#11494)#23038agaudreault merged 34 commits intoargoproj:masterfrom
Conversation
❌ Preview Environment deleted from BunnyshellAvailable commands (reply to this comment):
|
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## master #23038 +/- ##
==========================================
+ Coverage 60.32% 60.36% +0.03%
==========================================
Files 350 350
Lines 60032 60061 +29
==========================================
+ Hits 36217 36254 +37
+ Misses 20901 20895 -6
+ Partials 2914 2912 -2 ☔ View full report in Codecov by Sentry. |
|
@jannfis, I understand you had a PoC doing something similar, I would appreciate your review. Thanks! |
agaudreault
left a comment
There was a problem hiding this comment.
As discussed in contributor meeting, only auto-sync should have this behaviour. #11494 (comment)
|
If a sync fails, is it possible to differentiate between a retryable error (eg: nodes, resources, api server not available) and a non retryable error (eg: a manifest is not according to the schema, image does not exist etc). And the sync operation will retry only for a retryable error and for the other scenario, sync would just fail as it does not make sense to keep retrying if the error is say manifest not matching the schema. |
Interesting point. I have few observations:
Agreed that manifests not matching the schema make no sense retrying with same commit. BUT it is exactly the thing I would like Argo CD to retry with new commit ASAP. |
|
On the contributors meeting it was agreed that updating the commit sha (as originally implemented here) can be confusing for users as it does not indicate the fact a sync operation with HEAD~1 has failed. There was a consensus it is better to find out a way to get the current retry fail clearly, and let/force the new one to kick in. |
b33513d to
eb10df5
Compare
|
Thanks for the pointers, @agaudreault. I have changed the approach to let the current attempt fail, and let the new one to kick in, but I am struggling to get things to work. The phase never gets to Succeeded in the very last step. Also, the UI seems to be in an odd state. The "SYNC STATUS" reports success on last (fixed) commit, but "LAST SYNC" reports fail on an earlier commit.
Hitting [Sync] in E2E UI gets the app go completely green, but for some reason, it does not happen on its own. |
0e6c345 to
3259682
Compare
c267687 to
e7a8caa
Compare
Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Signed-off-by: Oliver Gondža <ogondza@gmail.com>
Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Signed-off-by: Oliver Gondža <ogondza@gmail.com>
Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Signed-off-by: Oliver Gondža <ogondza@gmail.com>
These are the use cases / business requirements to test with this change. The UI sync behaviour should also populate the refresh option and correct sources. The rollback experience is different and should never try to refresh. The behaviour of the UI vs CLI for the same operation should be consistent with how retries are configured for each. To make sure the operation can be refreshed correctly, you must make sure that the sources are properly set in the operation object. |
Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Signed-off-by: Oliver Gondža <ogondza@gmail.com>
Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Signed-off-by: Oliver Gondža <ogondza@gmail.com>
Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Signed-off-by: Oliver Gondža <ogondza@gmail.com>
2ba691b to
dd2e7ca
Compare
Signed-off-by: Oliver Gondža <ogondza@gmail.com>
Signed-off-by: Oliver Gondža <ogondza@gmail.com>
|
@agaudreault, I have added the tests (and prod code fix) for explicit CLI sync. Though, the (CLI) rollback seems to be a no-brainer: |
eb627ec to
8317692
Compare
Signed-off-by: Oliver Gondža <ogondza@gmail.com>
…494-agaudreault Signed-off-by: Alexandre Gaudreault <alexandre_gaudreault@intuit.com>
Signed-off-by: Oliver Gondža <ogondza@gmail.com>
…494-agaudreault Signed-off-by: Alexandre Gaudreault <alexandre_gaudreault@intuit.com>
cover sync/rollback scenarios
|
Thanks, @agaudreault. Your help was priceless! |
…nc (argoproj#11494) (argoproj#23038) Signed-off-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Signed-off-by: Oliver Gondža <ogondza@gmail.com> Signed-off-by: Alexandre Gaudreault <alexandre_gaudreault@intuit.com> Co-authored-by: Zadkiel AHARONIAN <hello@zadkiel.fr> Co-authored-by: Alexandre Gaudreault <alexandre_gaudreault@intuit.com>

When sync fails due to incorrect manifest declarations, this permits fixes to be deployed via future commits.
This is controlled by a new boolean Application CRD field
syncPolicy.retry.refreshor via the--sync-retry-refreshflag.Closes #11494
Related to #6055
Discussed at https://www.youtube.com/watch?v=baIX9Bk6f5w&t=1173s
Kudos to @aslafy-z and @Sayrus for doing most of the heavy lifting here (#15603).
Checklist: