config: Decouple HTTP and TCP buffering config #2078
The `FailFast` middleware limits the period of time an inner `Service` may return `Poll::Pending` from its `poll_ready` method without becoming ready. If the inner service does not become ready by the end of the failfast timeout, the `FailFast` middleware becomes ready and immediately accepts but fails all requests until the inner service once again becomes ready. This allows a `Buffer` around an inner service to be drained when the buffered service has become unavailable.

However, an unfortunate consequence of the current `FailFast` design is that it advertises readiness up the stack _past_ a `Buffer` that wraps it. Once the service has entered the failfast state, it will return `Poll::Ready(Ok(()))` for all `poll_ready` calls until the inner service exits failfast. Since a `Buffer` will accept new requests as long as it has queue capacity, without polling the inner service (it is instead polled by the buffer's background worker task), the buffer will continue accepting new requests and immediately failing them as long as the service inside it is in a failfast state.

This is not desirable, as it means that a higher-level traffic distributor or load balancer will see a `Buffer<FailFast<S>>` as being ready to receive new requests even when the `S` service is not ready and the `FailFast` middleware will simply fail those requests. It would be preferable for the `Buffer<FailFast<S>>` to return `Poll::Pending` from `poll_ready` when in the failfast state, so that *new* requests are not dispatched to it, while still proactively draining requests already dispatched into the buffer.

This branch changes the `FailFast` implementation to consist of a _pair_ of middleware, `FailFast` and `Advertise`. The `FailFast` middleware behaves similarly to how it did previously, but is modified so that the inner service's readiness state is shared with the `Advertise` middleware. The `Advertise` service will advertise the inner service's _actual_ readiness in `poll_ready` (e.g. return `Poll::Pending` while in failfast), while the `FailFast` service will return `Poll::Ready(Ok(()))` and failfast all requests it receives. This way, a `Buffer` (or other middleware) can be placed in between an inner `FailFast` service and an outer `Advertise` service. The `Buffer`'s worker task will see the `Poll::Ready(Ok(()))` returned by the `FailFast` middleware, and drain its queue, while an outer traffic distributor or load balancer will see the `Poll::Pending` returned by `Advertise` (reflecting the actual readiness of the inner service), and dispatch traffic elsewhere.

In order to implement this, it was necessary to change the `FailFast` middleware to drive the innermost service to readiness on a background task when it is in the failfast state. This is because a `Buffer`'s worker only polls the inner service's readiness when it has requests to dispatch, rather than in the `Buffer` service's `poll_ready`. Therefore, it's necessary to ensure we proactively drive the readiness of the inner service while in the failfast state, as the `FailFast` service itself will not be polled again once the buffer's queue has drained.

Currently, `Buffer`s are always paired with a `tower::spawn_ready::SpawnReady` middleware, which does something analogous, but unconditionally on _all_ `Pending` calls to `poll_ready`. The new implementation has the side benefit of obviating the need for a separate `SpawnReady` middleware, and is a bit more efficient about spawning tasks, since a new task is only spawned when *in the failfast state*, rather than any time the inner service returns `Poll::Pending`.

Depends on #2078
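The split between the two halves can be modeled with a shared flag. The following is only a minimal sketch of the idea, not the proxy's implementation: it does not implement `tower::Service`, it omits the `Buffer` that would sit between the halves, and the names `Shared`, `pair`, and `set_failfast` are hypothetical. A real `poll_ready` that returns `Poll::Pending` must also arrange for the task to be woken when the state changes, which this model leaves out.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;
use std::task::Poll;

/// Readiness state shared by both halves (hypothetical name).
#[derive(Clone, Default)]
struct Shared {
    in_failfast: Arc<AtomicBool>,
}

/// Inner half: would sit inside the buffer.
struct FailFast {
    shared: Shared,
}

/// Outer half: would sit outside the buffer.
struct Advertise {
    shared: Shared,
}

impl FailFast {
    /// Stand-in for the timeout firing (true) or the inner service recovering (false).
    fn set_failfast(&self, on: bool) {
        self.shared.in_failfast.store(on, Ordering::Release);
    }

    /// Always ready, so a buffer worker polling this half keeps draining its queue.
    fn poll_ready(&self) -> Poll<Result<(), &'static str>> {
        Poll::Ready(Ok(()))
    }

    /// Fails requests immediately while in failfast.
    fn call(&self, req: &str) -> Result<String, &'static str> {
        if self.shared.in_failfast.load(Ordering::Acquire) {
            Err("service in fail-fast")
        } else {
            Ok(format!("handled {req}"))
        }
    }
}

impl Advertise {
    /// Reports the actual readiness, so callers stop dispatching new requests.
    fn poll_ready(&self) -> Poll<Result<(), &'static str>> {
        if self.shared.in_failfast.load(Ordering::Acquire) {
            Poll::Pending
        } else {
            Poll::Ready(Ok(()))
        }
    }
}

/// Builds the two halves around one shared state.
fn pair() -> (Advertise, FailFast) {
    let shared = Shared::default();
    (Advertise { shared: shared.clone() }, FailFast { shared })
}

fn main() {
    let (advertise, failfast) = pair();
    failfast.set_failfast(true);
    // The outer half applies backpressure while the inner half drains and fails.
    assert!(advertise.poll_ready().is_pending());
    assert!(failfast.poll_ready().is_ready());
    assert!(failfast.call("GET /").is_err());
    failfast.set_failfast(false);
    assert!(advertise.poll_ready().is_ready());
    println!("{:?}", failfast.call("GET /"));
}
```

In this model a load balancer polling `Advertise` would skip the endpoint while it is in failfast, while anything already queued behind `FailFast` is failed promptly instead of waiting.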
Proxies may buffer TCP connections while waiting for policy discovery and may buffer HTTP requests while waiting for a shared resource (like a load balancer). Previously, a single configuration was used to configure both TCP and HTTP buffers.

This change decouples these configurations in preparation for allowing balancer configuration to be provided by the control plane.

Furthermore, this change updates the stack builder to always construct buffers with failfast and spawnready to (1) ensure that all buffers enforce the proper backpressure and load shedding semantics and (2) reduce boilerplate.

This change updates the proxy's environment configuration as follows:

* Removed `LINKERD2_PROXY_INBOUND_DISPATCH_TIMEOUT`
* Removed `LINKERD2_PROXY_OUTBOUND_DISPATCH_TIMEOUT`
* Added `LINKERD2_PROXY_INBOUND_DISCOVERY_IDLE_TIMEOUT`
* Added `LINKERD2_PROXY_OUTBOUND_DISCOVERY_IDLE_TIMEOUT`
* Removed `LINKERD2_PROXY_BUFFER_CAPACITY`
* Removed `LINKERD2_PROXY_INBOUND_ROUTER_MAX_IDLE_AGE`
* Removed `LINKERD2_PROXY_OUTBOUND_ROUTER_MAX_IDLE_AGE`
* Added `LINKERD2_PROXY_INBOUND_HTTP_BUFFER_CAPACITY`
* Added `LINKERD2_PROXY_INBOUND_HTTP_FAILFAST_TIMEOUT`
* Added `LINKERD2_PROXY_OUTBOUND_TCP_BUFFER_CAPACITY`
* Added `LINKERD2_PROXY_OUTBOUND_TCP_FAILFAST_TIMEOUT`
* Added `LINKERD2_PROXY_OUTBOUND_HTTP_BUFFER_CAPACITY`
* Added `LINKERD2_PROXY_OUTBOUND_HTTP_FAILFAST_TIMEOUT`
* Added `LINKERD2_PROXY_OUTBOUND_HTTP1_CONNECTION_POOL_IDLE_TIMEOUT`
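To illustrate what decoupling means for the per-protocol `*_BUFFER_CAPACITY` and `*_FAILFAST_TIMEOUT` variables listed above, here is a rough sketch of reading each pair into its own value. It is not the proxy's actual configuration code: `QueueConfig`, `queue_config`, and `parse_duration` are hypothetical helpers, the accepted duration format (`ms` and `s` suffixes only) is an assumption, and the defaults are placeholders.

```rust
use std::time::Duration;

/// Queue settings for one protocol on one side of the proxy (hypothetical type).
#[derive(Debug, PartialEq)]
struct QueueConfig {
    capacity: usize,
    failfast_timeout: Duration,
}

/// Parses values such as "250ms" or "3s". The accepted format is an assumption.
fn parse_duration(s: &str) -> Option<Duration> {
    // Check "ms" first, since "250ms" also ends with 's'.
    if let Some(ms) = s.strip_suffix("ms") {
        ms.parse::<u64>().ok().map(Duration::from_millis)
    } else if let Some(secs) = s.strip_suffix('s') {
        secs.parse::<u64>().ok().map(Duration::from_secs)
    } else {
        None
    }
}

/// Reads `<prefix>_BUFFER_CAPACITY` and `<prefix>_FAILFAST_TIMEOUT` through `lookup`,
/// falling back to `default` for anything missing or unparsable.
fn queue_config(
    lookup: impl Fn(&str) -> Option<String>,
    prefix: &str,
    default: QueueConfig,
) -> QueueConfig {
    let capacity = lookup(&format!("{prefix}_BUFFER_CAPACITY"))
        .and_then(|v| v.parse::<usize>().ok())
        .unwrap_or(default.capacity);
    let failfast_timeout = lookup(&format!("{prefix}_FAILFAST_TIMEOUT"))
        .and_then(|v| parse_duration(&v))
        .unwrap_or(default.failfast_timeout);
    QueueConfig { capacity, failfast_timeout }
}

fn main() {
    let lookup = |key: &str| std::env::var(key).ok();
    // Placeholder defaults for illustration only.
    let default = || QueueConfig {
        capacity: 10_000,
        failfast_timeout: Duration::from_secs(3),
    };
    // Each queue is configured independently of the others.
    let inbound_http = queue_config(&lookup, "LINKERD2_PROXY_INBOUND_HTTP", default());
    let outbound_tcp = queue_config(&lookup, "LINKERD2_PROXY_OUTBOUND_TCP", default());
    let outbound_http = queue_config(&lookup, "LINKERD2_PROXY_OUTBOUND_HTTP", default());
    println!("{inbound_http:?}\n{outbound_tcp:?}\n{outbound_http:?}");
}
```

Passing the lookup as a closure keeps the parsing logic testable without touching the process environment.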
overall, this looks good to me! i had some minor nits about naming and documentation, but other than that, this looks great.
are we going to need to make a proxy-injector change to set the new env vars, as well?
The proxy injector doesn't (yet) provide configuration related to the changed values:
* proxy: v2.189.0

This release updates the outbound proxy to use a queue for each load balancer, instead of one for each router. This allows us to remove unnecessary caching and buffering behavior in other places.

Routes are now lazily initialized so that service profile routes will not show up in metrics until the route is used. Furthermore, a default route is always available now (i.e. when no service profile exists for a service).

Furthermore, the proxy's traffic splitting behavior has changed so that only available concrete services (i.e. those not in failfast) are used. This lets the proxy manage failover-like use cases without external coordination.

This release also features an update to Tokio v1.24, which promises to reduce CPU usage, especially for the proxy's pod-local communication.

---

* Allow Unicode-dfs-2016 for unicode-ident (linkerd/linkerd2-proxy#1973)
* build(deps): bump unicode-ident from 1.0.1 to 1.0.5 (linkerd/linkerd2-proxy#1964)
* build(deps): bump tj-actions/changed-files from 34.3.4 to 34.4.0 (linkerd/linkerd2-proxy#1986)
* build(deps): bump tower-layer from 0.3.1 to 0.3.2 (linkerd/linkerd2-proxy#1987)
* build(deps): bump thiserror from 1.0.34 to 1.0.37 (linkerd/linkerd2-proxy#1988)
* build(deps): bump itoa from 1.0.2 to 1.0.4 (linkerd/linkerd2-proxy#1989)
* build(deps): bump tokio from 1.21.0 to 1.21.2 (linkerd/linkerd2-proxy#1990)
* build(deps): bump regex from 1.6.0 to 1.7.0 (linkerd/linkerd2-proxy#1991)
* build(deps): bump tj-actions/changed-files from 34.4.0 to 34.4.2 (linkerd/linkerd2-proxy#1993)
* build(deps): bump cmake from 0.1.48 to 0.1.49 (linkerd/linkerd2-proxy#1994)
* build(deps): bump libc from 0.2.132 to 0.2.137 (linkerd/linkerd2-proxy#1995)
* build(deps): bump parking_lot_core from 0.9.3 to 0.9.4 (linkerd/linkerd2-proxy#1996)
* build(deps): bump hdrhistogram from 7.5.1 to 7.5.2 (linkerd/linkerd2-proxy#1999)
* build(deps): bump tracing-subscriber from 0.3.15 to 0.3.16 (linkerd/linkerd2-proxy#1998)
* build(deps): bump serde from 1.0.144 to 1.0.147 (linkerd/linkerd2-proxy#1997)
* build(deps): bump EmbarkStudios/cargo-deny-action from 1.3.2 to 1.4.0 (linkerd/linkerd2-proxy#2000)
* build(deps): bump tonic from 0.8.1 to 0.8.2 (linkerd/linkerd2-proxy#2002)
* build(deps): bump rand_core from 0.6.3 to 0.6.4 (linkerd/linkerd2-proxy#2003)
* build(deps): bump derive_arbitrary from 1.1.6 to 1.2.0 (linkerd/linkerd2-proxy#2004)
* build(deps): bump tj-actions/changed-files from 34.4.2 to 34.4.4 (linkerd/linkerd2-proxy#2005)
* build(deps): bump ppv-lite86 from 0.2.16 to 0.2.17 (linkerd/linkerd2-proxy#2006)
* build(deps): bump prost from 0.11.0 to 0.11.2 (linkerd/linkerd2-proxy#2007)
* build(deps): bump async-trait from 0.1.57 to 0.1.58 (linkerd/linkerd2-proxy#2008)
* build(deps): bump getrandom from 0.2.7 to 0.2.8 (linkerd/linkerd2-proxy#2009)
* build(deps): bump base64 from 0.13.0 to 0.13.1 (linkerd/linkerd2-proxy#2010)
* build(deps): bump anyhow from 1.0.65 to 1.0.66 (linkerd/linkerd2-proxy#2011)
* build(deps): bump tj-actions/changed-files from 34.4.4 to 34.5.0 (linkerd/linkerd2-proxy#2012)
* build(deps): bump clang-sys from 1.3.3 to 1.4.0 (linkerd/linkerd2-proxy#2013)
* build(deps): bump ipnet from 2.5.0 to 2.5.1 (linkerd/linkerd2-proxy#2015)
* build(deps): bump prost-types from 0.11.1 to 0.11.2 (linkerd/linkerd2-proxy#2014)
* meshtls-rustls: fix clippy `.ok().expect()` lints in tests (linkerd/linkerd2-proxy#2017)
* build(deps): bump tokio from 1.21.2 to 1.22.0 (linkerd/linkerd2-proxy#2020)
* build(deps): bump prost-build from 0.11.1 to 0.11.3 (linkerd/linkerd2-proxy#2018)
* build(deps): bump futures from 0.3.24 to 0.3.25 (linkerd/linkerd2-proxy#2019)
* build(deps): bump tokio-boring from 2.1.4 to 2.1.5 (linkerd/linkerd2-proxy#2024)
* build(deps): bump DavidAnson/markdownlint-cli2-action (linkerd/linkerd2-proxy#2022)
* build(deps): bump once_cell from 1.14.0 to 1.16.0 (linkerd/linkerd2-proxy#2023)
* build(deps): bump serde from 1.0.147 to 1.0.148 (linkerd/linkerd2-proxy#2025)
* build(deps): bump tracing from 0.1.36 to 0.1.37 (linkerd/linkerd2-proxy#2026)
* build(deps): bump bytes from 1.2.1 to 1.3.0 (linkerd/linkerd2-proxy#2027)
* build(deps): bump mio from 0.8.4 to 0.8.5 (linkerd/linkerd2-proxy#2028)
* build(deps): bump softprops/action-gh-release from 0.1.14 to 0.1.15 (linkerd/linkerd2-proxy#2030)
* build(deps): bump tonic-build from 0.8.2 to 0.8.4 (linkerd/linkerd2-proxy#2031)
* build(deps): bump parking_lot_core from 0.9.4 to 0.9.5 (linkerd/linkerd2-proxy#2032)
* build(deps): bump libloading from 0.7.3 to 0.7.4 (linkerd/linkerd2-proxy#2033)
* build(deps): bump boring from 2.0.0 to 2.1.0 (linkerd/linkerd2-proxy#2036)
* build(deps): bump async-trait from 0.1.58 to 0.1.59 (linkerd/linkerd2-proxy#2037)
* build(deps): bump libc from 0.2.137 to 0.2.138 (linkerd/linkerd2-proxy#2038)
* build(deps): bump tj-actions/changed-files from 34.5.0 to 34.5.1 (linkerd/linkerd2-proxy#2040)
* build(deps): bump indexmap from 1.9.1 to 1.9.2 (linkerd/linkerd2-proxy#2041)
* build(deps): bump aho-corasick from 0.7.19 to 0.7.20 (linkerd/linkerd2-proxy#2042)
* build(deps): bump jemalloc-sys (linkerd/linkerd2-proxy#2043)
* build(deps): bump boring-sys from 2.0.0 to 2.1.0 (linkerd/linkerd2-proxy#1948)
* just: Fix justfile command silencing (linkerd/linkerd2-proxy#2016)
* build(deps): bump regex-syntax from 0.6.27 to 0.6.28 (linkerd/linkerd2-proxy#2044)
* build(deps): bump data-encoding from 2.3.2 to 2.3.3 (linkerd/linkerd2-proxy#2046)
* build(deps): bump tokio-macros from 1.8.0 to 1.8.2 (linkerd/linkerd2-proxy#2047)
* build(deps): bump serde_json from 1.0.85 to 1.0.89 (linkerd/linkerd2-proxy#2045)
* build(deps): bump flate2 from 1.0.24 to 1.0.25 (linkerd/linkerd2-proxy#2051)
* build(deps): bump tonic from 0.8.2 to 0.8.3 (linkerd/linkerd2-proxy#2052)
* dev: v37 (linkerd/linkerd2-proxy#2048)
* build(deps): bump itertools from 0.10.3 to 0.10.5 (linkerd/linkerd2-proxy#2049)
* build(deps): bump syn from 1.0.103 to 1.0.105 (linkerd/linkerd2-proxy#2056)
* build(deps): bump prost from 0.11.2 to 0.11.3 (linkerd/linkerd2-proxy#2055)
* build(deps): bump serde from 1.0.148 to 1.0.149 (linkerd/linkerd2-proxy#2054)
* build(deps): bump cc from 1.0.73 to 1.0.77 (linkerd/linkerd2-proxy#2053)
* build(deps): bump linkerd/dev from 37 to 38 (linkerd/linkerd2-proxy#2058)
* build(deps): bump tj-actions/changed-files from 34.5.1 to 34.5.3 (linkerd/linkerd2-proxy#2059)
* build(deps): bump tokio from 1.22.0 to 1.23.0 (linkerd/linkerd2-proxy#2060)
* build(deps): bump derive_arbitrary from 1.2.0 to 1.2.1 (linkerd/linkerd2-proxy#2061)
* build(deps): bump serde from 1.0.149 to 1.0.150 (linkerd/linkerd2-proxy#2062)
* build(deps): bump prost-build from 0.11.3 to 0.11.4 (linkerd/linkerd2-proxy#2063)
* release: Produce static binaries (linkerd/linkerd2-proxy#2057)
* build(deps): bump ipnet from 2.5.1 to 2.7.0 (linkerd/linkerd2-proxy#2066)
* build(deps): bump tj-actions/changed-files from 34.5.3 to 34.6.1 (linkerd/linkerd2-proxy#2068)
* build(deps): bump cc from 1.0.77 to 1.0.78 (linkerd/linkerd2-proxy#2069)
* build(deps): bump actions/checkout from 3.1.0 to 3.2.0 (linkerd/linkerd2-proxy#2064)
* build(deps): bump unicode-ident from 1.0.5 to 1.0.6 (linkerd/linkerd2-proxy#2072)
* build(deps): bump ryu from 1.0.10 to 1.0.12 (linkerd/linkerd2-proxy#2073)
* build(deps): bump async-trait from 0.1.59 to 0.1.60 (linkerd/linkerd2-proxy#2074)
* build(deps): bump thiserror from 1.0.37 to 1.0.38 (linkerd/linkerd2-proxy#2075)
* build(deps): bump tj-actions/changed-files from 34.6.1 to 35.1.0 (linkerd/linkerd2-proxy#2077)
* build(deps): bump quote from 1.0.20 to 1.0.23 (linkerd/linkerd2-proxy#2081)
* build(deps): bump proc-macro2 from 1.0.47 to 1.0.49 (linkerd/linkerd2-proxy#2082)
* build(deps): bump num_cpus from 1.14.0 to 1.15.0 (linkerd/linkerd2-proxy#2083)
* build(deps): bump itoa from 1.0.4 to 1.0.5 (linkerd/linkerd2-proxy#2084)
* Introduce a 'distribute' stack module (linkerd/linkerd2-proxy#2085)
* outbound: Split the concrete and logical stack builders (linkerd/linkerd2-proxy#2092)
* config: Decouple HTTP and TCP buffering config (linkerd/linkerd2-proxy#2078)
* build(deps): bump syn from 1.0.105 to 1.0.107 (linkerd/linkerd2-proxy#2088)
* build(deps): bump anyhow from 1.0.66 to 1.0.68 (linkerd/linkerd2-proxy#2089)
* build(deps): bump prost from 0.11.3 to 0.11.5 (linkerd/linkerd2-proxy#2090)
* Propagate backpressure from buffers when in failfast (linkerd/linkerd2-proxy#2091)
* Split `outbound::tcp::logical::tests` into a file (linkerd/linkerd2-proxy#2096)
* build(deps): bump prost-types from 0.11.2 to 0.11.5 (linkerd/linkerd2-proxy#2099)
* build(deps): bump libc from 0.2.138 to 0.2.139 (linkerd/linkerd2-proxy#2098)
* build(deps): bump serde from 1.0.150 to 1.0.152 (linkerd/linkerd2-proxy#2097)
* stack: Add `SpawnWatch` middleware (linkerd/linkerd2-proxy#2101)
* build(deps): bump prost-build from 0.11.4 to 0.11.5 (linkerd/linkerd2-proxy#2087)
* build(deps): bump prettyplease from 0.1.21 to 0.1.22 (linkerd/linkerd2-proxy#2104)
* build(deps): bump once_cell from 1.16.0 to 1.17.0 (linkerd/linkerd2-proxy#2105)
* build(deps): bump serde_json from 1.0.89 to 1.0.91 (linkerd/linkerd2-proxy#2106)
* build(deps): bump DavidAnson/markdownlint-cli2-action (linkerd/linkerd2-proxy#2114)
* stack: add `Lazy` middleware (linkerd/linkerd2-proxy#2102)
* build(deps): bump derive_arbitrary from 1.2.1 to 1.2.2 (linkerd/linkerd2-proxy#2116)
* build(deps): bump arbitrary from 1.2.0 to 1.2.2 (linkerd/linkerd2-proxy#2117)
* Rename `linkerd-cache` to `linkerd-idle-cache` (linkerd/linkerd2-proxy#2118)
* Rename Stack::push_cache to push_idle_cache (linkerd/linkerd2-proxy#2119)
* Make all comment delimeters uniform (linkerd/linkerd2-proxy#2120)
* Make NameAddr cheaper to clone (linkerd/linkerd2-proxy#2121)
* build(deps): bump tokio from 1.23.0 to 1.23.1 (linkerd/linkerd2-proxy#2125)
* build(deps): bump tj-actions/changed-files from 35.1.0 to 35.3.1 (linkerd/linkerd2-proxy#2124)
* distribute: Add a backend cache (linkerd/linkerd2-proxy#2122)
* stack: Eliminate the `UpdateWatch` trait (linkerd/linkerd2-proxy#2123)
* Make `Profile::clone` cheaper (linkerd/linkerd2-proxy#2127)
* build(deps): bump actions/download-artifact from 3.0.1 to 3.0.2 (linkerd/linkerd2-proxy#2131)
* build(deps): bump tokio from 1.23.1 to 1.24.0 (linkerd/linkerd2-proxy#2132)
* build(deps): bump prettyplease from 0.1.22 to 0.1.23 (linkerd/linkerd2-proxy#2133)
* Support router-scoped caches (linkerd/linkerd2-proxy#2128)
* stack: Fix `NewSpawnWatch::layer` type signature (linkerd/linkerd2-proxy#2134)
* Implement `Hash` for configuration types (linkerd/linkerd2-proxy#2135)
* outbound: separate TCP logical and concrete stacks (linkerd/linkerd2-proxy#2136)
* build(deps): bump ipnet from 2.7.0 to 2.7.1 (linkerd/linkerd2-proxy#2141)
* build(deps): bump glob from 0.3.0 to 0.3.1 (linkerd/linkerd2-proxy#2140)
* build(deps): bump async-trait from 0.1.60 to 0.1.61 (linkerd/linkerd2-proxy#2138)
* build(deps): bump actions/upload-artifact from 3.1.1 to 3.1.2 (linkerd/linkerd2-proxy#2137)
* Update routers to support per-request backend distributions (linkerd/linkerd2-proxy#2095)
* Disable musl in release build (linkerd/linkerd2-proxy#2143)

* Fix proxy scripts for new artifact format
In 2.13, the default inbound and outbound HTTP request queue capacity decreased from 10,000 requests to 100 requests (in PR #2078). This change results in proxies shedding load much more aggressively while under high load to a single destination service, resulting in increased error rates in comparison to 2.12 (see linkerd/linkerd2#11055 for details).

This commit changes the default HTTP request queue capacities for the inbound and outbound proxies back to 10,000 requests, the way they were in 2.12 and earlier. In manual load testing I've verified that increasing the queue capacity results in a substantial decrease in 503 Service Unavailable errors emitted by the proxy: with a queue capacity of 100 requests, the load test described [here] observed a failure rate of 51.51% of requests, while with a queue capacity of 10,000 requests, the same load test observed no failures.

Note that I did not modify the TCP connection queue capacities, or the control plane request queue capacity. These were previously configured by the same variable before #2078, but were split out into separate vars in that change. I don't think the queue capacity limits for TCP connection establishment or for control plane requests are currently resulting in instability the way the decreased request queue capacity is, so I decided to make a more focused change to just the HTTP request queues for the proxies.

[here]: linkerd/linkerd2#11055 (comment)
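The effect of queue capacity on load shedding can be seen with a toy model. This sketch uses a bounded `std::sync::mpsc::sync_channel` as a stand-in for the proxy's request queue and offers a burst while nothing drains it. It does not reproduce the load test from the linked issue: `offer_burst` is a hypothetical helper and the burst size of 5,000 is an arbitrary illustration.

```rust
use std::sync::mpsc::{sync_channel, TrySendError};

/// Offers `burst` requests to a bounded queue of `capacity` while no consumer
/// is draining it, and returns `(accepted, shed)`.
fn offer_burst(capacity: usize, burst: usize) -> (usize, usize) {
    // Keep the receiver alive so a full queue reports `Full`, not `Disconnected`.
    let (tx, _rx) = sync_channel::<usize>(capacity);
    let mut accepted = 0;
    let mut shed = 0;
    for id in 0..burst {
        match tx.try_send(id) {
            Ok(()) => accepted += 1,
            // A full queue rejects the request immediately (load shedding).
            Err(TrySendError::Full(_)) => shed += 1,
            Err(TrySendError::Disconnected(_)) => unreachable!("receiver is alive"),
        }
    }
    (accepted, shed)
}

fn main() {
    // Arbitrary burst size, chosen only to sit between the two capacities.
    let burst = 5_000;
    println!("capacity 100:    {:?}", offer_burst(100, burst));
    println!("capacity 10,000: {:?}", offer_burst(10_000, burst));
}
```

With a capacity of 100 the model sheds everything past the first 100 queued requests, while a capacity of 10,000 absorbs the whole burst. A larger queue trades those immediate rejections for more memory and potentially longer queueing delay.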
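To illustrate why a smaller queue sheds more load, here is a minimal sketch of a bounded queue that rejects work once it is full. `BoundedQueue` and `shed_count` are hypothetical names used only for this illustration; this is not the proxy's buffer implementation, it models a burst arriving while nothing is drained, and it does not attempt to reproduce the failure rates measured in the load test.

```rust
use std::collections::VecDeque;

/// Hypothetical bounded request queue (illustration only, not the proxy's buffer).
struct BoundedQueue<T> {
    capacity: usize,
    items: VecDeque<T>,
}

impl<T> BoundedQueue<T> {
    fn new(capacity: usize) -> Self {
        Self {
            capacity,
            items: VecDeque::with_capacity(capacity),
        }
    }

    /// Enqueues a request, or hands it back to the caller when the queue is full.
    fn try_push(&mut self, item: T) -> Result<(), T> {
        if self.items.len() >= self.capacity {
            return Err(item);
        }
        self.items.push_back(item);
        Ok(())
    }
}

/// Offers `burst` requests to a queue that is never drained and counts
/// how many of them are rejected (shed).
fn shed_count(capacity: usize, burst: usize) -> usize {
    let mut queue = BoundedQueue::new(capacity);
    (0..burst)
        .filter(|&id| queue.try_push(id).is_err())
        .count()
}
```

Under this simplified model, a burst larger than the capacity loses everything beyond the capacity, while a burst that fits in the queue loses nothing, which is the qualitative difference between the 100-request and 10,000-request defaults.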
Proxies may buffer TCP connections while waiting for policy discovery
and may buffer HTTP requests while waiting for a shared resource (like a
load balancer).
Previously, a single configuration was used for both TCP and HTTP
buffers. This change decouples these configurations in preparation
for allowing the control plane to supply balancer configuration.
Furthermore, this change updates the stack builder to always construct
buffers with failfast and spawnready, which (1) ensures that all buffers
enforce the proper backpressure and load-shedding semantics and (2)
reduces boilerplate.
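As a rough sketch of the failfast timing described here, the following models only the timeout decision: how long an inner service may stay unready before requests are failed instead of waiting. `FailFastTimer` and `Readiness` are hypothetical names for this illustration; the real middleware is built on `Service::poll_ready` and is not shown.

```rust
use std::time::{Duration, Instant};

/// What a caller observes when checking readiness (hypothetical model).
#[derive(Debug, PartialEq)]
enum Readiness {
    /// The inner service is ready; requests are forwarded.
    Ready,
    /// The inner service is unready but the timeout has not elapsed; callers wait.
    Pending,
    /// The timeout elapsed; requests are failed instead of waiting.
    FailFast,
}

/// Tracks how long the inner service has been continuously unready.
struct FailFastTimer {
    timeout: Duration,
    /// When the inner service was first seen unready, if it currently is.
    unready_since: Option<Instant>,
}

impl FailFastTimer {
    fn new(timeout: Duration) -> Self {
        Self {
            timeout,
            unready_since: None,
        }
    }

    /// Records the inner service's readiness at `now` and reports the state.
    fn check(&mut self, inner_ready: bool, now: Instant) -> Readiness {
        if inner_ready {
            // Recovery resets the timer, so a later outage gets a full timeout.
            self.unready_since = None;
            return Readiness::Ready;
        }
        let since = *self.unready_since.get_or_insert(now);
        if now.duration_since(since) >= self.timeout {
            Readiness::FailFast
        } else {
            Readiness::Pending
        }
    }
}
```

Passing `now` in explicitly keeps the model deterministic; a real implementation would be driven by a timer future rather than by explicit checks.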
This change updates the proxy's environment configuration as follows:
LINKERD2_PROXY_INBOUND_DISPATCH_TIMEOUT
LINKERD2_PROXY_OUTBOUND_DISPATCH_TIMEOUT
LINKERD2_PROXY_INBOUND_DISCOVERY_IDLE_TIMEOUT
LINKERD2_PROXY_OUTBOUND_DISCOVERY_IDLE_TIMEOUT
LINKERD2_PROXY_BUFFER_CAPACITY
LINKERD2_PROXY_INBOUND_ROUTER_MAX_IDLE_AGE
LINKERD2_PROXY_OUTBOUND_ROUTER_MAX_IDLE_AGE
LINKERD2_PROXY_INBOUND_HTTP_QUEUE_CAPACITY
LINKERD2_PROXY_INBOUND_HTTP_FAILFAST_TIMEOUT
LINKERD2_PROXY_OUTBOUND_TCP_QUEUE_CAPACITY
LINKERD2_PROXY_OUTBOUND_TCP_FAILFAST_TIMEOUT
LINKERD2_PROXY_OUTBOUND_HTTP_QUEUE_CAPACITY
LINKERD2_PROXY_OUTBOUND_HTTP_FAILFAST_TIMEOUT
LINKERD2_PROXY_OUTBOUND_HTTP1_CONNECTION_POOL_IDLE_TIMEOUT
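A minimal sketch of how decoupled per-protocol settings could be read from variables named like the ones above. `QueueConfig` and `queue_config` are hypothetical, and two things here are assumptions rather than the proxy's actual behavior: that timeouts are written as whole milliseconds, and that missing or unparseable values fall back to a default. The lookup is passed in as a closure so that the example does not depend on the real process environment.

```rust
use std::time::Duration;

/// Hypothetical per-protocol queue settings; field names are illustrative.
#[derive(Debug, Clone, PartialEq)]
struct QueueConfig {
    capacity: usize,
    failfast_timeout: Duration,
}

/// Reads `<prefix>_QUEUE_CAPACITY` and `<prefix>_FAILFAST_TIMEOUT` through
/// `lookup`, falling back to `default` for any value that is missing or
/// does not parse. Assumption: timeouts are whole milliseconds.
fn queue_config(
    lookup: impl Fn(&str) -> Option<String>,
    prefix: &str,
    default: &QueueConfig,
) -> QueueConfig {
    let capacity = lookup(&format!("{}_QUEUE_CAPACITY", prefix))
        .and_then(|v| v.trim().parse::<usize>().ok())
        .unwrap_or(default.capacity);
    let failfast_timeout = lookup(&format!("{}_FAILFAST_TIMEOUT", prefix))
        .and_then(|v| v.trim().parse::<u64>().ok())
        .map(Duration::from_millis)
        .unwrap_or(default.failfast_timeout);
    QueueConfig {
        capacity,
        failfast_timeout,
    }
}
```

Calling `queue_config` once with the prefix `LINKERD2_PROXY_OUTBOUND_HTTP` and once with `LINKERD2_PROXY_OUTBOUND_TCP` yields two independent values, so changing the HTTP queue capacity leaves the TCP queue untouched. To read the real environment, the closure could be `|key| std::env::var(key).ok()`.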