fix: re-resolve symbolic target revisions on sync retry #26925

Open

rohansood10 wants to merge 1 commit into argoproj:master from rohansood10:fix/26530-retry-stale-revision

Conversation


@rohansood10 rohansood10 commented Mar 19, 2026

Fixes #26530

When a sync with targetRevision: HEAD fails and retries, the retry reuses the originally-resolved commit SHA instead of re-resolving HEAD. If HEAD has moved (which it usually has), the retry applies manifests from the old commit — potentially rolling back the cluster to a stale state.

The fix clears the resolved revision on retry when the target is a symbolic ref (HEAD, branch name, tag). Literal SHAs are left alone since they're already pinned. Uses the existing git.IsCommitSHA() helper to distinguish.
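To make the distinction concrete, here is a minimal standalone sketch of the kind of check involved. `isCommitSHA` paraphrases the behavior of Argo CD's `git.IsCommitSHA()` helper (a full 40-character hex string); the `shouldReResolve` wrapper and its name are illustrative, not the actual controller code.

```go
package main

import (
	"fmt"
	"regexp"
)

// isCommitSHA paraphrases git.IsCommitSHA: true only for a full
// 40-character hex commit SHA, which is already pinned.
var commitSHARegex = regexp.MustCompile(`^[0-9A-Fa-f]{40}$`)

func isCommitSHA(rev string) bool {
	return commitSHARegex.MatchString(rev)
}

// shouldReResolve (illustrative name) decides whether a retry must
// re-resolve the target revision: symbolic refs (HEAD, branch names,
// tags) can move, so they are re-resolved; literal SHAs are left alone.
func shouldReResolve(targetRevision string) bool {
	return !isCommitSHA(targetRevision)
}

func main() {
	for _, rev := range []string{
		"HEAD",
		"main",
		"v2.9.0",
		"aabe852aabe852aabe852aabe852aabe852aabe8", // 40 hex chars: pinned
	} {
		fmt.Printf("%-42s re-resolve on retry: %v\n", rev, shouldReResolve(rev))
	}
}
```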

Relates to #23547

When a sync operation fails and retries, symbolic refs like HEAD or
branch names were not re-resolved, causing the retry to use the
originally-resolved commit SHA which could be months old. This led
to silent rollbacks of cluster resources to stale states.

The retry logic only cleared the resolved revision when Retry.Refresh
was explicitly set to true. For the common case where Retry.Refresh is
false (the default), the stale SHA persisted across retries.

Fix by also clearing the resolved revision on retry when the target
revision is a symbolic ref (not a 40-character hex SHA), forcing
re-resolution to the latest commit. Literal SHA target revisions are
preserved since they always point to the same commit.

Fixes argoproj#26530

Signed-off-by: Rohan Sood <56945243+rohansood10@users.noreply.github.com>
@rohansood10 rohansood10 requested a review from a team as a code owner March 19, 2026 21:02


codecov bot commented Mar 19, 2026

Codecov Report

❌ Patch coverage is 90.00000% with 2 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.17%. Comparing base (aabe852) to head (9610088).

Files with missing lines | Patch % | Lines
controller/appcontroller.go | 90.00% | 1 Missing and 1 partial ⚠️
@@           Coverage Diff           @@
##           master   #26925   +/-   ##
=======================================
  Coverage   63.16%   63.17%           
=======================================
  Files         414      414           
  Lines       56461    56480   +19     
=======================================
+ Hits        35664    35681   +17     
- Misses      17427    17430    +3     
+ Partials     3370     3369    -1     


@olivergondza
Contributor

@rohansood10, when implementing the Retry.Refresh, the community sentiment was not to change the default behavior, and only resolve those when explicitly configured: #11494 (comment).

Correct me if I am wrong, but this change practically performs the refresh regardless of the option's value, right?

@rohansood10
Author

Good question - and you're right to flag this. The change here is intentionally scoped narrower than toggling Retry.Refresh globally.

What this does: when the target revision is a symbolic ref (HEAD, branch name, tag) rather than a literal SHA, it clears the resolved revision on retry so it gets re-resolved. It does NOT change the Retry.Refresh flag itself, so the full app refresh behavior (CompareWithLatest, etc.) is still opt-in.

The distinction is: re-resolving a symbolic ref to its current SHA is a lightweight git operation, while Retry.Refresh triggers a full comparison cycle with manifest generation. The bug is specifically that a stale SHA gets reused when HEAD has moved, which can silently roll back cluster state to a months-old commit.

That said, I see the concern about changing default retry behavior. Would it be cleaner to gate this behind a new sync option like ResolveRefsOnRetry to keep it fully opt-in? Or does the current approach feel proportionate given the risk of silently rolling the cluster back?

cc the discussion at #11494 (comment)
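If the opt-in route floated above were taken, the gate could look roughly like the sketch below. ResolveRefsOnRetry is the hypothetical option from this comment, and the struct and function names are illustrative rather than Argo CD's actual API.

```go
package main

import (
	"fmt"
	"regexp"
)

var commitSHARegex = regexp.MustCompile(`^[0-9A-Fa-f]{40}$`)

// RetryOptions is an illustrative stand-in for the sync retry config;
// ResolveRefsOnRetry is the hypothetical opt-in flag discussed above.
type RetryOptions struct {
	Refresh            bool
	ResolveRefsOnRetry bool
}

// clearResolvedRevision (illustrative) reports whether the stored SHA
// should be dropped before a retry so the target gets re-resolved.
func clearResolvedRevision(opts RetryOptions, targetRevision string) bool {
	if opts.Refresh {
		return true // full refresh already re-resolves the revision
	}
	// Opt-in lightweight path: only symbolic refs, never pinned SHAs.
	return opts.ResolveRefsOnRetry && !commitSHARegex.MatchString(targetRevision)
}

func main() {
	fmt.Println(clearResolvedRevision(RetryOptions{ResolveRefsOnRetry: true}, "HEAD"))
}
```

With this shape, the default (both flags false) keeps today's behavior of reusing the stored SHA, addressing the backward-compatibility concern from #11494.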

@olivergondza
Contributor

The bug is specifically that a stale SHA gets reused when HEAD has moved, which can silently roll back cluster state to a months-old commit.

Can you elaborate on how this is possible? I do not see the connection with the behavior described in #26530.

The distinction is: re-resolving a symbolic ref to its current SHA is a lightweight git operation, while Retry.Refresh triggers a full comparison cycle with manifest generation.

Have you maybe forgotten to push your full change set? I see it does the exact same thing as .Refresh - not setting the flag, but activating the same behavior for symbolic revisions.


Was any AI assistant used to produce the contribution?

@rohansood10
Author

Sure - the scenario from #26530 is:

  1. User creates an Application with targetRevision: HEAD
  2. First sync resolves HEAD to commit abc123, stores it in state.Operation.Sync.Revision
  3. Sync fails (e.g., webhook rejection, hook timeout)
  4. Meanwhile, 50+ commits are pushed to the repo
  5. On retry, if Retry.Refresh is false (the default), the retry reuses abc123 from step 2 instead of re-resolving HEAD
  6. The manifests from the months-old abc123 get applied, silently rolling back the cluster

The issue reporters saw this in production with HA mode where retries can be delayed. The fix specifically targets symbolic refs - literal SHAs are preserved since they're already pinned.

Regarding the opt-in concern - you're right that this changes default behavior for symbolic refs. Would gating it behind Retry.Refresh being set make more sense? The tradeoff is that the data corruption risk remains for the default case.

@olivergondza
Contributor

olivergondza commented Mar 23, 2026

Hmm. I am still not convinced this is the culprit.

  • If step no. 6 downgrades the app when it finally succeeds in applying abc123, what had upgraded the app manifests beyond abc123 while the Argo app was struggling to do so for - as you said - months?
  • From the issue it is not clear (to me at least) that there was an ongoing retry loop for the entire time.

Would gating it behind Retry.Refresh being set make more sense?

See my question above. How is this different from what the flag does already (except that you skip SHAs)?

@rohansood10
Author

Yeah fair question. Looking at the issue again, the reporters describe it as happening in HA setups with auto-sync + retry, but the exact sequence of how the cluster ended up ahead of the retried revision isn't fully clear from the logs they shared.

You're right that if the sync was failing continuously, nothing newer would have been deployed through ArgoCD, so the retry applying the original commit wouldn't technically be a rollback. The scenario where it becomes dangerous is if something else deployed a newer version in the meantime (manual kubectl apply, another sync that succeeded, etc.) and then the retry fires with the old SHA.

I'm less certain now that this is the right fix for #26530 specifically. Might be worth getting more detail from the reporters about what exactly happened in their case. Happy to close this if the consensus is that Retry.Refresh already covers the legitimate use case.



Development

Successfully merging this pull request may close these issues.

Argo CD sometimes applies very old Git commit during sync retry despite targetRevision=HEAD
