-
Notifications
You must be signed in to change notification settings - Fork 4.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
server: wait to close connection until incoming socket is drained (with timeout) #6977
Conversation
Codecov Report
Additional details and impacted files@@ Coverage Diff @@
## master #6977 +/- ##
==========================================
+ Coverage 82.38% 82.45% +0.07%
==========================================
Files 296 296
Lines 31432 31450 +18
==========================================
+ Hits 25896 25933 +37
+ Misses 4477 4460 -17
+ Partials 1059 1057 -2
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice! LGTM
Can we get a release with this fix please? Keen to try and see if it fixes our troubles. |
It will be in 1.62.0 next week. I'll see if it's easy to backport to 1.61.x and push a 1.61 patch release if so. |
I pushed a patch release with the fix: https://github.com/grpc/grpc-go/releases/tag/v1.61.1 |
Thank you! |
// https://github.com/grpc/grpc-go/issues/5358 | ||
select { | ||
case <-t.readerDone: | ||
case <-time.After(time.Second): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that no one can stop this scheduled timer when the reader is done first (which would probably happen most of the times). I would suggest to change it to a timer
object so it can be stopped manually (and therefore released) when the reader is done first.
In a scenario where this happens with many clients (or the same client) repeatedly wouldn't you just schedule many timers that cannot be stopped until they fire (i.e a temporary memory leak)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True that it isn't stopped, but it is fixed for 1 second, so I don't believe this could be a real problem for anyone.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even when many connections having non IO errors? Do you think it's unlikely or even not possible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the key here is that it's a non-I/O error, which means it would be initiated by the server itself. If the server is terminating already-handshaked connections at a rate of >1 per second, then that seems like a real problem.
That said, I think it's OK to just forbid the use of time.After
in our repo (outside of tests) since it's always a potential concern. I'll add a vet.sh
check for this and change it on master.
Replaces #6957
Fixes #5358 (A modified version of the test in that issue runs for 5+ minutes without failing with these changes. On a broken branch it fails reliably within 30s, and often much faster.)
See also grpc/grpc-java#9566 and golang/net@cd69bc3 / golang/go#18701
This change also adjusts the ping timer after sending the initial GOAWAY from 1 minute to 5 seconds at @ejona86's recommendation.
RELEASE NOTES: