-
Notifications
You must be signed in to change notification settings - Fork 34
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: proxy first request slowness issue #366
fix: proxy first request slowness issue #366
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good! Just left a comment on the healthcheck change
pkg/controller/health_check.go
Outdated
@@ -84,9 +83,5 @@ func (c *healthCheckerImpl) APIProxyAlive(ctx *gin.Context) bool { | |||
return false | |||
} | |||
defer resp.Body.Close() | |||
_, err = io.ReadAll(resp.Body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need the line above if we don't read the body at all?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfrancisc what is the memory consumption issue with reading the body for the health endpoint?
The documentation for http.Get() says:
When err is nil, resp always contains a non-nil resp.Body. Caller should close resp.Body when done reading from it.
So unless there's some reason to do otherwise, we should probably follow the documentation. If there is a performance concern, I found a blog post that mentioned a workaround that we can investigate.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe I am misunderstanding the documentation? I understood it as it's optimal to read the body before closing it but I saw some debate about whether reading the body is actually required. I guess that it's not required but there may be benefits to doing it.
This PR is old but indicates reading the body enables reusing the TCP connection google/go-github#317
We read the body before closing in the auth key manager too: https://github.com/codeready-toolchain/registration-service/blob/9ce1f44f6e48665508583f43e312bfbd215d2d2d/pkg/auth/keymanager.go#L154C1-L163
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unless something has changed in latest golang versions it was necessary to read the body before closing it otherwise the file descriptor would not be released. So at least in the past that would cause memory leaks. I remember we investigated such a memory leak a few years ago and the reason was not reading the body before closing it. Not sure if it matters in case of an empty body though. But it was considered a good practice to always read the body before closing it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The memory profile I've ran was showing a big bubble (high memory consumption) caused by the io.ReadAll
function (even if the json is small). But this was not part of the slowness fix, just to be clear, so it's not mandatory. I can remove there are concerns.
from the doc https://pkg.go.dev/net/http#Client.Get , it mentions that you must close the resp , but not that is also mandatory to read it. The drawback with not reading it is with the persistent TCP connection that might not be reused:
If the Body is not both read to EOF and closed, the Client's underlying RoundTripper (typically Transport) may not be able to re-use a persistent TCP connection to the server for a subsequent "keep-alive" request.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we still need the line above if we don't read the body at all?
@xcoulon yes closing the response body seems mandatory
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
restored, please see: bd401ab
we can address this separately in case we want to have a look at the memory profile as well, since there seems to be other places where we are doing the same thing as @rajivnathan pointed out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another option we can consider is using a HEAD request instead of a GET request and then I believe there is no response body, which sounds good enough for the health endpoint.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good idea @rajivnathan !
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent investigation and fix! 🥇 Just have a question about the health endpoint change.
pkg/controller/health_check.go
Outdated
@@ -84,9 +83,5 @@ func (c *healthCheckerImpl) APIProxyAlive(ctx *gin.Context) bool { | |||
return false | |||
} | |||
defer resp.Body.Close() | |||
_, err = io.ReadAll(resp.Body) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@mfrancisc what is the memory consumption issue with reading the body for the health endpoint?
The documentation for http.Get() says:
When err is nil, resp always contains a non-nil resp.Body. Caller should close resp.Body when done reading from it.
So unless there's some reason to do otherwise, we should probably follow the documentation. If there is a performance concern, I found a blog post that mentioned a workaround that we can investigate.
Codecov ReportAll modified and coverable lines are covered by tests ✅
📢 Thoughts on this report? Let us know!. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! Great job with investigating the issue!
Kudos, SonarCloud Quality Gate passed!
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great work @mfrancisc! 🙌
[APPROVALNOTIFIER] This PR is APPROVED This pull-request has been approved by: alexeykazakov, mfrancisc, rajivnathan, xcoulon The full list of commands accepted by this bot can be found here. The pull request process is described here
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
/lgtm |
The slowness was caused by the member cache being populated only with the first request.
I'm moving the the cluster cache initialization before starting the services , so that the first request will have it already populated.
Jira: https://issues.redhat.com/plugins/servlet/samlsso?redirectTo=%2Fbrowse%2FSANDBOX-449