Manual Backport of Add ServiceResolver RequestTimeout for route timeouts to make TerminatingGateway upstream timeouts configurable into release/1.13.x #16563

Merged
andrewstucki merged 4 commits into release/1.13.x from release-1.13.x-backport-resolver-request-timeout
Mar 7, 2023

@andrewstucki
Contributor

Backport

Manual backport from #16495 to release/1.13.x

The below text is copied from the body of the original PR.


Description

I'm not entirely sure about this approach: it's unclear to me where we generally use ConnectTimeout for ServiceResolvers, and reusing it here might conflate two timeout values that shouldn't be conflated. If so, I can add another timeout value that makes this configurable. Edit: I've added a RequestTimeout parameter for ServiceResolvers to handle the timeout. That said, this change allows for overriding Envoy's default route timeout for requests to upstream clusters in a route configuration.

I tested it manually with a custom Go binary run on the mesh, using a TerminatingGateway and a ServiceResolver that extends the timeout to 30 seconds so that a server that takes 20 seconds to respond doesn't time out. I'll be adding some sort of integration test, but wanted to open this to get eyes on the approach first.

Note that two changes are necessary for this to work with terminating gateways. The first allows the local proxy that a service uses to route traffic through the mesh to configure the timeout for its routes via makeUpstreamRouteForDiscoveryChain. The second has the TerminatingGateway itself configure its routing timeouts via makeNamedDefaultRouteWithLB.

Testing & Reproduction steps

The files I used to manually validate are here. I compiled main.go as timeout-check, copied it to the root of the consul directory, and compiled Consul locally. I then booted up Consul, ran the timeout-check.sh script, and issued a curl to localhost:9877 to kick off a request through the service mesh to the external sleeping service. Without this change I see:

~ curl localhost:9877
upstream request timeout%

With it I see:

~ curl localhost:9877
finished%

PR Checklist

  • updated test coverage
  • external facing docs updated
  • not a security concern

@andrewstucki andrewstucki requested review from a team and analogue March 7, 2023 21:33
@andrewstucki andrewstucki requested a review from a team as a code owner March 7, 2023 21:33
@github-actions github-actions bot added the theme/api (Relating to the HTTP API interface), theme/envoy/xds (Related to Envoy support), and type/docs (Documentation needs to be created/updated/clarified) labels Mar 7, 2023
@andrewstucki andrewstucki merged commit ac184c5 into release/1.13.x Mar 7, 2023
@andrewstucki andrewstucki deleted the release-1.13.x-backport-resolver-request-timeout branch March 7, 2023 21:47