Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: proxy first request slowness issue #366

Merged
merged 4 commits into from
Oct 26, 2023

Conversation

mfrancisc
Copy link
Contributor

@mfrancisc mfrancisc commented Oct 26, 2023

The slowness was caused by the member cache being populated only with the first request.
I'm moving the the cluster cache initialization before starting the services , so that the first request will have it already populated.

Jira: https://issues.redhat.com/plugins/servlet/samlsso?redirectTo=%2Fbrowse%2FSANDBOX-449

Copy link
Contributor

@xcoulon xcoulon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Just left a comment on the healthcheck change

@@ -84,9 +83,5 @@ func (c *healthCheckerImpl) APIProxyAlive(ctx *gin.Context) bool {
return false
}
defer resp.Body.Close()
_, err = io.ReadAll(resp.Body)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need the line above if we don't read the body at all?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mfrancisc what is the memory consumption issue with reading the body for the health endpoint?

The documentation for http.Get() says:

When err is nil, resp always contains a non-nil resp.Body. Caller should close resp.Body when done reading from it.

So unless there's some reason to do otherwise, we should probably follow the documentation. If there is a performance concern, I found a blog post that mentioned a workaround that we can investigate.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe I am misunderstanding the documentation? I understood it as it's optimal to read the body before closing it but I saw some debate about whether reading the body is actually required. I guess that it's not required but there may be benefits to doing it.

This PR is old but indicates reading the body enables reusing the TCP connection google/go-github#317

We read the body before closing in the auth key manager too: https://github.com/codeready-toolchain/registration-service/blob/9ce1f44f6e48665508583f43e312bfbd215d2d2d/pkg/auth/keymanager.go#L154C1-L163

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless something has changed in latest golang versions it was necessary to read the body before closing it otherwise the file descriptor would not be released. So at least in the past that would cause memory leaks. I remember we investigated such a memory leak a few years ago and the reason was not reading the body before closing it. Not sure if it matters in case of an empty body though. But it was considered a good practice to always read the body before closing it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The memory profile I've ran was showing a big bubble (high memory consumption) caused by the io.ReadAll function (even if the json is small). But this was not part of the slowness fix, just to be clear, so it's not mandatory. I can remove there are concerns.

from the doc https://pkg.go.dev/net/http#Client.Get , it mentions that you must close the resp , but not that is also mandatory to read it. The drawback with not reading it is with the persistent TCP connection that might not be reused:

If the Body is not both read to EOF and closed, the Client's underlying RoundTripper (typically Transport) may not be able to re-use a persistent TCP connection to the server for a subsequent "keep-alive" request.

Copy link
Contributor Author

@mfrancisc mfrancisc Oct 26, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we still need the line above if we don't read the body at all?

@xcoulon yes closing the response body seems mandatory

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

restored, please see: bd401ab

we can address this separately in case we want to have a look at the memory profile as well, since there seems to be other places where we are doing the same thing as @rajivnathan pointed out.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another option we can consider is using a HEAD request instead of a GET request and then I believe there is no response body, which sounds good enough for the health endpoint.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good idea @rajivnathan !

Copy link
Contributor

@rajivnathan rajivnathan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent investigation and fix! 🥇 Just have a question about the health endpoint change.

@@ -84,9 +83,5 @@ func (c *healthCheckerImpl) APIProxyAlive(ctx *gin.Context) bool {
return false
}
defer resp.Body.Close()
_, err = io.ReadAll(resp.Body)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mfrancisc what is the memory consumption issue with reading the body for the health endpoint?

The documentation for http.Get() says:

When err is nil, resp always contains a non-nil resp.Body. Caller should close resp.Body when done reading from it.

So unless there's some reason to do otherwise, we should probably follow the documentation. If there is a performance concern, I found a blog post that mentioned a workaround that we can investigate.

@codecov
Copy link

codecov bot commented Oct 26, 2023

Codecov Report

All modified and coverable lines are covered by tests ✅

Files Coverage Δ
pkg/proxy/proxy.go 85.34% <ø> (-0.16%) ⬇️

📢 Thoughts on this report? Let us know!.

Copy link
Contributor

@alexeykazakov alexeykazakov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Great job with investigating the issue!

@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 0 Code Smells

No Coverage information No Coverage information
1.8% 1.8% Duplication

Copy link
Contributor

@xcoulon xcoulon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work @mfrancisc! 🙌

@openshift-ci
Copy link

openshift-ci bot commented Oct 26, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: alexeykazakov, mfrancisc, rajivnathan, xcoulon

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [alexeykazakov,xcoulon]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@alexeykazakov
Copy link
Contributor

/lgtm

@openshift-ci openshift-ci bot added the lgtm label Oct 26, 2023
@openshift-ci openshift-ci bot merged commit 8e10a70 into codeready-toolchain:master Oct 26, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants