Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce forceAt and resetAt annotations #823

Merged
merged 2 commits into from
Dec 1, 2023
Merged

Conversation

hiddeco
Copy link
Member

@hiddeco hiddeco commented Nov 29, 2023

Implementation of #366 (comment)

This introduces two new annotations:

  • reconcile.fluxcd.io/resetAt: to reset the failure counts for a
    HelmRelease object.
  • reconcile.fluxcd.io/forceAt: to allow a one-off Helm install or
    upgrade when the controller would otherwise do nothing (e.g. due to
    being out of retries, in-sync, in a failed state, etc.)

Both annotations require the reconcile.fluxcd.io/requestedAt
annotation to be set at the same time, with the same token value.

Example

TOKEN="$(date +%s)"; \
kubectl annotate --field-manager=flux-client-side-apply --overwrite helmrelease/<name> \
"reconcile.fluxcd.io/requestedAt=$TOKEN" \
"reconcile.fluxcd.io/forceAt=$TOKEN"

Fixes #267
Fixes #366
Fixes #454
Fixes #297

@hiddeco hiddeco added the enhancement New feature or request label Nov 29, 2023
@hiddeco hiddeco force-pushed the reset-force-annotations branch 2 times, most recently from 3dde8c2 to 2a5a56c Compare November 29, 2023 13:36
@stefanprodan
Copy link
Member

Please open an issue in flux2 repo for adding the --force and --reset flags to flux reconcile hr command.

@hiddeco hiddeco force-pushed the reset-force-annotations branch 3 times, most recently from 6ace8c2 to 38aebfc Compare November 29, 2023 14:43
@hiddeco hiddeco marked this pull request as ready for review November 29, 2023 14:43
// If the force annotation is set, we can attempt to upgrade the release
// without any further checks.
if forceRequested {
log.Info(msgWithReason("forcing upgrade for failed release", "force requested through annotation"))
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a question here around user desire: one option is to do what I implemented here, which is to force a move forward.

The other option would be to ignore the fact that retries are exhausted, and continue with the remediation attempt.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After thinking about this some more, and discussing it with @darkowlzz, I think the current behavior is what should be expected.

Given if someone wants to continue from the point where it runs into "retries exhausted", they can simply use the reset annotations. If force would do the same, it would bear no value to someone who has "unlimited retries" (retries: -1) configured, and this feature would be void.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussed this in a meeting, and i share this exact opinion.

Comment on lines +300 to +304
if forceRequested {
log.Info(msgWithReason("forcing upgrade for in-sync release", "force requested through annotation"))
return NewUpgrade(r.configFactory, r.eventRecorder), nil
}
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same question around user desire here: do they want to force an upgrade, or do they only want to force it forward when it is in some failed state?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same kind of applies here, if you do not want to force an upgrade. Just request a "normal" reconciliation and nothing will happen.

This introduces two new annotations:

- `reconcile.fluxcd.io/resetAt`: to reset the failure counts for a
  `HelmRelease` object.
- `reconcile.fluxcd.io/forceAt`: to allow a one-off Helm install or
  upgrade when the controller would otherwise do nothing (e.g. due to
  being out of retries, in-sync, in a failed state, etc.)

Both annotations require the `reconcile.fluxcd.io/requestedAt`
annotation to be set at the same time, with the same token value.

Signed-off-by: Hidde Beydals <[email protected]>
This makes the controller actually take the
`reconcile.fluxcd.io/forceAt` and `reconcile.fluxcd.io/resetAt` into
account.

For `reconcile.fluxcd.io/resetAt`, this means that the failure counts on
the `HelmRelease` object are reset when the token value of the
annotation equals `reconcile.fluxcd.io/requestedAt`. Allowing the
controller to start over with attempting to install or upgrade the
release until the retries count has been reached again.

For `reconcile.fluxcd.io/forceAt`, this means that a one-off Helm
install or upgrade is allowed to take place even if the object is out of
retries, in a failed state where it should be remediated, or in-sync.

Signed-off-by: Hidde Beydals <[email protected]>
Copy link
Member

@stefanprodan stefanprodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Thanks @hiddeco 🏅

PS. I tested the annotations with various failure conditions and works as expected!

@hiddeco hiddeco merged commit 7f9160c into main Dec 1, 2023
10 checks passed
@hiddeco hiddeco deleted the reset-force-annotations branch December 1, 2023 10:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api API related issues and pull requests enhancement New feature or request
Projects
None yet
3 participants