router: defer per try timeout until downstream request is done#6643
snowp merged 6 commits into envoyproxy:master
This defers starting the per try timeout timer until onRequestComplete so that it is never started before the global timeout. As a result, the per try timeout no longer counts the time spent reading the downstream request, which should be the responsibility of the HCM level timeouts.
Signed-off-by: Snow Pettersen <snowp@squareup.com>
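The deferral described above can be illustrated with a small, self-contained sketch. This is not the code from the PR: `FakeTimer`, `SketchRouter`, `onUpstreamTryStarted`, and the simulated millisecond clock are hypothetical names invented for illustration, and only `onRequestComplete` and the flag name (taken from the review discussion) mirror identifiers mentioned in this thread.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using Ms = std::chrono::milliseconds;

// Hypothetical stand-in for a timer: it only records when it was armed.
struct FakeTimer {
  std::optional<Ms> armed_at; // simulated time at which the timer was enabled
  Ms duration{0};

  void enable(Ms now, Ms d) {
    armed_at = now;
    duration = d;
  }
  bool enabled() const { return armed_at.has_value(); }
  // Simulated time at which the timer would fire.
  Ms deadline() const {
    assert(enabled());
    return *armed_at + duration;
  }
};

// Hypothetical, heavily simplified router; not the real Envoy filter.
class SketchRouter {
public:
  SketchRouter(Ms global_timeout, Ms per_try_timeout)
      : global_timeout_(global_timeout), per_try_timeout_(per_try_timeout) {}

  // Called when an upstream try starts, which may happen before the
  // downstream request body has been fully read.
  void onUpstreamTryStarted(Ms now) {
    if (downstream_complete_) {
      per_try_timer_.enable(now, per_try_timeout_);
    } else {
      // Defer: remember that the timer still has to be created.
      create_per_try_timeout_on_request_complete_ = true;
    }
  }

  // Called once the downstream request has been received in full.
  void onRequestComplete(Ms now) {
    downstream_complete_ = true;
    global_timer_.enable(now, global_timeout_);
    if (create_per_try_timeout_on_request_complete_) {
      per_try_timer_.enable(now, per_try_timeout_);
      create_per_try_timeout_on_request_complete_ = false;
    }
  }

  const FakeTimer& globalTimer() const { return global_timer_; }
  const FakeTimer& perTryTimer() const { return per_try_timer_; }

private:
  Ms global_timeout_;
  Ms per_try_timeout_;
  FakeTimer global_timer_;
  FakeTimer per_try_timer_;
  bool downstream_complete_ = false;
  bool create_per_try_timeout_on_request_complete_ = false;
};
```

In this sketch, a try that starts while the downstream is still uploading leaves the per try timer unarmed; both timers are armed at the moment `onRequestComplete` runs, so a slow downstream does not eat into the per try budget.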
source/common/router/router.h
bool attempting_internal_redirect_with_complete_stream_ : 1;
// Tracks whether we deferred a per try timeout because the downstream request
// had not been completed yet.
bool pending_per_try_timeout_ : 1;
Let's put this on the UpstreamRequest instead of on the parent; that will hold up better once we have multiple upstream requests.
source/common/router/router.cc
response_timeout_->enableTimer(timeout_.global_timeout_);
}

if (pending_per_try_timeout_) {
this might be better as a for loop over upstream requests to check their pending_per_try_timeout_ flag
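A rough sketch of what the two suggestions above could look like together: the flag lives on each upstream request, and the parent loops over its upstream requests when the downstream request completes. All types here (`LoopSketchUpstreamRequest`, `LoopSketchFilter`) and the simulated clock are hypothetical; this is an illustration under those assumptions, not the change that was merged.

```cpp
#include <chrono>
#include <memory>
#include <optional>
#include <vector>

using SimMs = std::chrono::milliseconds;

// Hypothetical per-try state; the real UpstreamRequest holds far more.
struct LoopSketchUpstreamRequest {
  // Set when the per try timer could not be created yet because the
  // downstream request had not been completed.
  bool pending_per_try_timeout_ = false;
  // Simulated time at which the per try timer was armed, if it was.
  std::optional<SimMs> per_try_timer_armed_at_;

  void setupPerTryTimeout(SimMs now) { per_try_timer_armed_at_ = now; }
};

// Hypothetical parent filter owning several upstream requests.
struct LoopSketchFilter {
  std::vector<std::unique_ptr<LoopSketchUpstreamRequest>> upstream_requests_;
  bool downstream_end_stream_ = false;

  // Starts a new try; arms the timer now or marks it as pending.
  LoopSketchUpstreamRequest& addUpstreamRequest(SimMs now) {
    auto req = std::make_unique<LoopSketchUpstreamRequest>();
    if (downstream_end_stream_) {
      req->setupPerTryTimeout(now);
    } else {
      req->pending_per_try_timeout_ = true;
    }
    upstream_requests_.push_back(std::move(req));
    return *upstream_requests_.back();
  }

  // Arms every timer that was deferred while the downstream was incomplete.
  void onRequestComplete(SimMs now) {
    downstream_end_stream_ = true;
    for (auto& req : upstream_requests_) {
      if (req->pending_per_try_timeout_) {
        req->setupPerTryTimeout(now);
        req->pending_per_try_timeout_ = false;
      }
    }
  }
};
```

Keeping the flag per upstream request means each try tracks its own deferred timer, which is the property the first review comment is after once more than one upstream request can be in flight.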
Signed-off-by: Snow Pettersen <snowp@squareup.com>
mpuncel left a comment
looks good! thanks for taking this
/retest
🔨 rebuilding
mattklein123 left a comment
Thanks, this looks good. This is a subtle change, but I think it's probably worth adding a release note as well?
/wait
source/common/router/router.h
bool encode_trailers_ : 1;
// Tracks whether we deferred a per try timeout because the downstream request
// had not been completed yet.
bool pending_per_try_timeout_ : 1;
nit: I would probably name this something like create_per_try_timeout_on_request_complete_ to differentiate it from waiting for the per try timeout to elapse, which is what the current name sounds like to me.
Signed-off-by: Snow Pettersen <snowp@squareup.com>
mattklein123 left a comment
LGTM with small typos
/wait
docs/root/intro/version_history.rst
* redis: add support for zpopmax and zpopmin commands.
* router: added ability to control retry back-off intervals via :ref:`retry policy <envoy_api_msg_route.RetryPolicy.RetryBackOff>`.
* router: per try timeouts will no longer start before the downstream request has been received
  in full by the router. This ensure that the per try timeout does not account for slow
docs/root/intro/version_history.rst
* router: added ability to control retry back-off intervals via :ref:`retry policy <envoy_api_msg_route.RetryPolicy.RetryBackOff>`.
* router: per try timeouts will no longer start before the downstream request has been received
  in full by the router. This ensure that the per try timeout does not account for slow
  downstreams and that will not not start before the global timeout.
/retest
🔨 rebuilding
* master:
  thread: remove ThreadFactorySingleton (envoyproxy#6658)
  router: support offseting downstream provided grpc timeout (envoyproxy#6628)
  router: defer per try timeout until downstream request is done (envoyproxy#6643)
  update bazel readme for clang-format-8 on mac (envoyproxy#6660)
  Implement some TODOs in quic_endian_impl.h (envoyproxy#6644)
  docs: add aspell to mac dependencies to fix check format script (envoyproxy#6661)
  config: fix delta xDS's use of (un)subscribe fields, more explicit protocol spec (envoyproxy#6545)
Signed-off-by: Michael Puncel <mpuncel@squareup.com>
This defers starting the per try timeout timer until onRequestComplete
to ensure that it is not started before the global timeout. This ensures
that the per try timeout will not take into account the time spent
reading the downstream, which should be the responsibility of the HCM level
timeouts.
Signed-off-by: Snow Pettersen <snowp@squareup.com>
Risk Level: Medium, changes to router logic
Testing: Added new UT, updated existing ones
Docs Changes: n/a
Release Notes: n/a
Fixes #6624