Fix Teleport update reconciliation on `status` updates #34063
Merged
This pull request addresses an issue where the Teleport operator runs a reconciliation every time it updates the `status` subresource. This led to an infinite loop, causing millions of reconciliations per minute.

When an error occurs, such as invalid role properties, the operator updates the status and returns an error, which should trigger a rescheduled reconciliation with exponential backoff. The problem arises because the operator did not require a resource generation change before reconciling, so each update to the `status` field immediately triggered a new reconciliation.

This pull request modifies the operator to ignore subresource updates and only trigger a reconciliation when the resource generation changes.

Special thanks to @strideynet for confirming my hypothesis and giving the solution!

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
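To make the mechanism concrete, here is a minimal, dependency-free sketch of a generation-based update filter. It is not the operator's actual code: the `object` type and `generationChanged` function are hypothetical names used only for illustration. It assumes the usual Kubernetes behavior that `metadata.generation` is incremented on spec changes but not on writes to the `status` subresource. In controller-runtime, a filter of this kind is available as `predicate.GenerationChangedPredicate`.

```go
package main

import "fmt"

// object is a hypothetical stand-in for a Kubernetes custom resource.
// Generation models metadata.generation, which is assumed to change on
// spec edits but not on status subresource writes.
type object struct {
	Generation int64
	Spec       string
	Status     string
}

// generationChanged reports whether an update event should trigger a
// reconciliation: only when the generation differs between old and new.
func generationChanged(oldObj, newObj object) bool {
	return oldObj.Generation != newObj.Generation
}

func main() {
	before := object{Generation: 1, Spec: "role-v1"}

	// The operator records a failure in status; generation is unchanged.
	statusOnly := before
	statusOnly.Status = "Failed: invalid role properties"

	// A user edits the spec; generation is incremented.
	specEdit := before
	specEdit.Generation = 2
	specEdit.Spec = "role-v2"

	fmt.Println("status-only update triggers reconcile:", generationChanged(before, statusOnly))
	fmt.Println("spec update triggers reconcile:", generationChanged(before, specEdit))
}
```

With such a filter in place, a failed reconciliation that writes to `status` no longer re-enqueues itself through the watch, so the retry is left to the work queue's backoff.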
Contributor
The PR changelog entry failed validation: Changelog entry not found in the PR body. Please add a "no-changelog" label to the PR, or changelog lines starting with
marcoandredinis
approved these changes
Oct 31, 2023
strideynet
approved these changes
Oct 31, 2023
Contributor
strideynet
left a comment
Might be nice to move `meta.SetStatusCondition` into `silentUpdateStatus` to reduce the likelihood of this occurring again - otherwise lgtm
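One way to read this suggestion is sketched below, using only the standard library. Everything here is an assumption made for illustration: the `condition`, `resource`, and `statusWriter` types are simplified stand-ins, `setStatusCondition` only loosely mirrors `meta.SetStatusCondition` (replace by type, else append), and the real signature of `silentUpdateStatus` in the operator is not shown in this conversation.

```go
package main

import "fmt"

// condition is a simplified stand-in for metav1.Condition.
type condition struct {
	Type    string
	Status  string
	Reason  string
	Message string
}

// resource is a hypothetical custom resource holding status conditions.
type resource struct {
	Name       string
	Conditions []condition
}

// setStatusCondition replaces the condition with the same Type or appends it.
func setStatusCondition(conds *[]condition, c condition) {
	for i := range *conds {
		if (*conds)[i].Type == c.Type {
			(*conds)[i] = c
			return
		}
	}
	*conds = append(*conds, c)
}

// statusWriter is a hypothetical interface for persisting only the status.
type statusWriter interface {
	UpdateStatus(r *resource) error
}

// silentUpdateStatus sets the condition and persists the status in one step,
// so a caller cannot write status without recording a condition.
func silentUpdateStatus(w statusWriter, r *resource, c condition) error {
	setStatusCondition(&r.Conditions, c)
	return w.UpdateStatus(r)
}

// memoryWriter counts status writes instead of talking to an API server.
type memoryWriter struct{ writes int }

func (m *memoryWriter) UpdateStatus(r *resource) error {
	m.writes++
	return nil
}

func main() {
	w := &memoryWriter{}
	r := &resource{Name: "example-role"}
	_ = silentUpdateStatus(w, r, condition{Type: "Ready", Status: "False", Reason: "Invalid"})
	_ = silentUpdateStatus(w, r, condition{Type: "Ready", Status: "True", Reason: "Reconciled"})
	fmt.Println(len(r.Conditions), r.Conditions[0].Status, w.writes)
}
```

Folding the condition update into the helper leaves a single code path for status writes, which is the property the review comment is after.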
Contributor
Author
done in c433dc9
strideynet
approved these changes
Nov 2, 2023
hugoShaka
approved these changes
Nov 2, 2023
c433dc9 to
bd9b2db
Compare
tigrato
added a commit
that referenced
this pull request
Nov 3, 2023
* Fix Teleport update reconciliation on `status` updates

  This pull request addresses the issue where the Teleport operator reconciliation runs every time the operator updates the `status` subresource. This continuous reconciliation has led to an infinite loop, causing millions of reconciliations per minute.

  When an error occurs, such as having invalid role properties, the Operator updates the status and returns an error, which should trigger a rescheduled reconciliation with exponential backoff. The problem arises because the operator failed to enforce a resource generation change, resulting in an immediate trigger of a new reconciliation when the `status` field is updated.

  This pull request modifies the operator to avoid updating subresources and only trigger updates when there is a change in resource generation.

  Special thanks to @strideynet for confirming my hypothesis and giving the solution!

  Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>

* return proper status conditions on failures
* enforce condition update on silentUpdateStatus

---------

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
github-merge-queue Bot
pushed a commit
that referenced
this pull request
Nov 3, 2023
* Fix Teleport update reconciliation on `status` updates

  This pull request addresses the issue where the Teleport operator reconciliation runs every time the operator updates the `status` subresource. This continuous reconciliation has led to an infinite loop, causing millions of reconciliations per minute.

  When an error occurs, such as having invalid role properties, the Operator updates the status and returns an error, which should trigger a rescheduled reconciliation with exponential backoff. The problem arises because the operator failed to enforce a resource generation change, resulting in an immediate trigger of a new reconciliation when the `status` field is updated.

  This pull request modifies the operator to avoid updating subresources and only trigger updates when there is a change in resource generation.

  Special thanks to @strideynet for confirming my hypothesis and giving the solution!

* return proper status conditions on failures
* enforce condition update on silentUpdateStatus

---------

Signed-off-by: Tiago Silva <tiago.silva@goteleport.com>
Fixes #34092
Changelog: Skip Teleport Operator reconciliation on `status` updates

Special thanks to @strideynet for confirming my hypothesis and giving the solution!