e2e: disable gRPC GOAWAY/too_many_pings keepalive errors in e2e CI#19083
Conversation
Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Review ChecklistHello reviewers! 👋 Please follow this checklist when reviewing this Pull Request. General
Tests
Documentation
New flags
If a workflow is added or modified:
Backward compatibility
|
e2e: disable GOAWAY/too_many_pings errors in e2e CIe2e: disable gRPC GOAWAY/too_many_pings keepalive errors in e2e CI
Co-authored-by: Mohamed Hamza <mhamza@fastmail.com> Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com>
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## main #19083 +/- ##
=======================================
Coverage 69.90% 69.90%
=======================================
Files 1612 1612
Lines 215817 215789 -28
=======================================
- Hits 150865 150849 -16
+ Misses 64952 64940 -12 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
|
@timvaillancourt IMO these logs are not helpful, at all, to the end user. Why don't we disable them entirely in Vitess rather than just in the tests? |
@mattlord unfortunately the gRPC library can't silence the logging unless you disable the protection entirely like we're doing here. I don't recommend we do that in a non-test environment I think the built-in protection still makes sense in a non-e2e use case, because it's possible for the gRPC port to abused by attackers. Arguably this change shouldn't have been necessary if we used the gRPC client by-the-book, but it seems we often launch a single tmclient per-go-package in Anecdotally, the only time I've seen this error/wait as a Vitess user in a real deployment was when writing external automation that hits vtctld APIs, possibly without closing connections properly as well, or perhaps doing something too quickly (the details of the in-gRPC protection aren't too clear to me) |
… CI (vitessio#19083) Signed-off-by: Tim Vaillancourt <tim@timvaillancourt.com> Signed-off-by: Mohamed Hamza <mhamza@fastmail.com>
Description
This PR disables too-many-ping protection in gRPC when running e2e tests with
test.go. This is achieved by settingGRPC_ARG_HTTP2_MAX_PINGS_WITHOUT_DATAto0Context from docs:
and
Today, many e2e tests connect/reconnect to the gRPC server too quickly and hit the
GOAWAY/too_many_pingserror. When this occurs, the victim test has to wait theGRPC_ARG_HTTP2_MIN_RECV_PING_INTERVAL_WITHOUT_DATA_MSdefault of 300000 milliseconds (5 minutes) doing nothing 😱. This protection makes sense in a normal world, but in an e2e test all running on localhost we don't really careWhen tests hit this, we see this error and pause for 5min:
E1218 18:18:06.674435 32993 component.go:44] [transport] Client received GoAway with error code ENHANCE_YOUR_CALM and debug data equal to ASCII "too_many_pings".Docs: https://github.com/grpc/grpc/blob/master/doc/keepalive.md
Related Issue(s)
Checklist
Deployment Notes
AI Disclosure