-
Notifications
You must be signed in to change notification settings - Fork 2.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
determine if there are situations when io.CopyN is needed to reuse underlying TCP connection #1575
Comments
So that I can better understand, what kind of cases are you finding where the And let's say we decide that this is a good idea and want to pursue this... I'm assuming this leads to performance gains... what kind of approximate gains are you anticipating? Finally, my biggest concern is... would this kind of change have any negative effects on existing clients using this package? Thank you, @itsksaurabh ! |
@gmlewis Thank you for reviewing this issue. I would like to answer your questions with the following explanation. The
If the response bodies aren't closed at all, each request will be sent on a separate TCP connection, and neither connection would ever be closed. If the bodies are closed without reading them, each request Dials and Closes its own TCP connection. But if you read and close the body, the second request will probably be sent on the same connection as the first, saving the time that would be used for the TCP and TLS handshakes on the second connection.
One of the good cases where you do not need the
Reading the Body is often more efficient than simply Close()ing if you're making more than one request - especially with TLS connections which are relatively expensive to create. Thanks again! |
Thank you, @itsksaurabh - I understand that part... so now let me explain why I asked what I asked previously... Question # 1:
I asked this because I'm wondering how often this client is closing the body without fully reading it? Is this happening a lot? Only rarely? Only specific endpoints? What made you come to the conclusion to open this issue in the first place? Question # 2:
Have you tested out your PR and find that you are experiencing much better performance? If so, can you give examples? How much performance increase are we talking about here? Questions # 3 and 4:
This is strictly a concern about RISK. We have thousands of users of this repo, and part of my responsibility is to NOT BREAK all the users. I have done that in the past, and believe me, it is no fun at all... yet, when potential performance increases can be made, we like to do that. Rarely will users of this repo come back and say "Wow, that was awesome that you did X", but they can be quite vocal if we broke Y. So I would like you to please take a moment and answer to the best of your ability, questions 1-4 above. Thank you! |
@gmlewis Thank you for your response. I completely understand your concern. So, I will try my best to answer all of your questions. Question # 1:
As mentioned earlier, one of the cases where you do not need the resp.Body is when you only need to use Head. Head doesn't require reading or closing the response body. For example, when a ping/pong service is made with the help of the lib. But things become worse when it comes to broken TCP connections and the system is highly stressed. Referring to the Go docs,
Also, a library user can't drain the Body themselves because Close() is called before passing the Response to the user. Question # 2:
Yes, I have tested the PR before the submission. I have done some tests and the performance is the same or sometimes better. Also, reliability increases with the approach and can reduce system-level errors and usage related to TCP. Please have a look at RFC 2068 that explains how TCP connections affect CPU as well as memory and also load on HTTP servers. Please have a look at the following quick tests that I have done with existing and my version of the lib. Questions # 3 and 4:
I believe there will be no negative effects on existing clients using this package as the overall design is the same and I have tested with it and getting the expected response as well. This approach will only drain the body in the defer func to make the TCP connection reusable resulting in lower fault/error rates, CPU, and memory usage. I think it is a win-win position that can be observed on the systems having high traffic and the problems that I have discussed above will be resolved with this approach. References:
Thank You! |
Thanks for reporting this issue @itsksaurabh. We've had a similar situation in the past, so I wanted to ask some questions to better understand why it has come up again. In the past, PR #317 added similar code to this library to drain the any remaining body for a speed up. Upon investigation, we discovered that there were exactly 0 bytes being read by As the Do you know if there are reproducible cases when the response body isn't read completely by this library? If so, it would be very helpful to learn how many bytes are being left unread:
Have you had a chance to measure and see how many bytes are being left unread in situations where TCP connections are not reused? Or this PR based on a hypothetical situation that you believe may be possible? I see #317 is mentioned as one of the references, but it's not clear to me if this has come up again because of a regression somewhere, or if this is addressing a different problem compared to last time. I agree that the fix (in PR #1576) is safe, but I'm asking these questions to see if we can improve the solution further by fixing it closer to the root of the problem (i.e., if it turns out there's a problem in the standard library or in GitHub API). Another reason to understand what purpose that change is serving is because other people may look at this library as an example of what to do, and we want to be confident we're setting a good example for others. Thanks. |
Thank you, @dmitshur ! Shall we reopen this issue to promote further discussion? |
Sure, let's reopen it until we understand this better. |
Thank you, @dmitshur ! As we can see the official doc says,
So I made the fix to make sure the TCP connection is reused from the Client's end. We already had this issue with some of our projects and this fix indeed solved some of the potential problems that may be faced with this lib.
As mentioned earlier, one of the cases where you do not need the resp.Body is when you only need to use Head. Head doesn't require reading or closing the response body. For example, when a ping/pong service is made with the help of the lib. But things become worse when it comes to broken TCP connections and the system is highly stressed. Also, a library user can't drain the Body themselves because Close() is called before passing the Response to the user.
I will definitely investigate this and will get back soon. I would really appreciate any help or guidance from you for this investigation to understand the situation even better. :) Thank you! |
Thanks for answering. If I understand your reply correctly, you're suggesting there are no known reproducible situations when the response body isn't read completely by this library. That is good to know.
I'm not sure I understand how this applies in this context. When making a HEAD request, the server isn't expected to include a response body, only the response headers. Also, most methods in this package specify the HTTP method internally and it is typically "GET".
One thing that's easy to check is to add some logging whenever the |
Currently, the Github client only closes the response body which can be seen at https://github.com/google/go-github/blob/master/github/github.go#L557 . The resp.Body.Close() makes the connection available for reuse by others using the same http.Client. It doesn’t close the underlying HTTP or TCP connection.
According to the official Go docs:
https://golang.org/pkg/net/http/#Body
My proposal to solve the issue:
The text was updated successfully, but these errors were encountered: