perf: create (pipeline|task)run timeout checks in background #3302

Merged
merged 1 commit into from
Oct 9, 2020


eddie4941
Contributor

@eddie4941 eddie4941 commented Sep 29, 2020

Both the pipelinerun and taskrun controllers start timeout checks at
startup. They do this by iterating through each namespace and spawning a
goroutine for each pipelinerun/taskrun to run the check in the
background. Although each timeout check runs in its own goroutine, the
iteration through namespaces and the creation of the timeout checks are
done in a blocking manner. This adds significant latency at startup when
the number of namespaces is large and ultimately delays how soon each
controller can start reconciling resources. To speed up startup, this
change moves the iteration through namespaces into the background. The
timeout checks were already carried out in separate goroutines and were
therefore safe to use concurrently, so no extra logic was needed to make
this change.

Changes

Submitter Checklist

These are the criteria that every PR should meet, please check them off as you
review them:

  • Includes tests (if functionality changed/added)
  • Includes docs (if user facing)
  • Commit messages follow commit message best practices
  • Release notes block has been filled in or deleted (only if no user facing changes)

See the contribution guide for more details.

Double check this list of stuff that's easy to miss:

Reviewer Notes

If API changes are included, additive changes must be approved by at least two OWNERS and backwards incompatible changes must be approved by more than 50% of the OWNERS, and they must first be added in a backwards compatible way.

Release Notes

Controller startup time is improved when many namespaces are being managed.

@tekton-robot tekton-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Sep 29, 2020
@linux-foundation-easycla

linux-foundation-easycla bot commented Sep 29, 2020

CLA Check
The committers are authorized under a signed CLA.

@tekton-robot tekton-robot added the size/S Denotes a PR that changes 10-29 lines, ignoring generated files. label Sep 29, 2020
@tekton-robot tekton-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Sep 29, 2020
@tekton-robot
Collaborator

Hi @eddie4941. Thanks for your PR.

I'm waiting for a tektoncd member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@@ -111,7 +111,7 @@ func TestTaskRunCheckTimeouts(t *testing.T) {
}

th.SetCallbackFunc(f)
-th.CheckTimeouts(context.Background(), testNs, c.Kube, c.Pipeline)
+go th.CheckTimeouts(context.Background(), testNs, c.Kube, c.Pipeline)
Contributor Author

The test already uses a polling approach below via wait.PollImmediate, so it's okay not to block here. However, if leaving this call non-blocking is a concern, I'm happy to block until the call is done to preserve the old behavior.

@dibyom
Member

dibyom commented Sep 29, 2020

/ok-to-test
/kind feature

@tekton-robot tekton-robot added kind/feature Categorizes issue or PR as related to a new feature. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Sep 29, 2020
@imjasonh
Member

imjasonh commented Oct 8, 2020

/lgtm

@tekton-robot tekton-robot added the lgtm Indicates that a PR is ready to be merged. label Oct 8, 2020
@tekton-robot
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: sbwsg

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@tekton-robot tekton-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Oct 9, 2020
@tekton-robot tekton-robot merged commit 27c76d2 into tektoncd:master Oct 9, 2020