Add SO_KEEPALIVE option for upstream connections. #614
JonathanO wants to merge 6 commits into envoyproxy:master
Config changes to support envoyproxy/envoy#3028 Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
Force-pushed from d32fefb to 9d38f06.
envoy/api/v2/core/address.proto (outdated)
    google.protobuf.BoolValue freebind = 2;

    // If set then set SO_KEEPALIVE on the socket to enable TCP Keepalives.
    TcpKeepalive tcp_keepalive = 3;
Is this tied to the prebind configuration? Or can it be set at any time (e.g. post bind, after listen(), etc)?
My current implementation is only applicable to upstream connections, and is therefore prebind since ClientConnectionImpl only invokes PreBind.
Making it usable for inbound connections will be a bit more work, since I believe the options would need to be set on the newly accept()ed socket rather than the listen() socket, and there's no existing pattern for doing that.
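The distinction drawn above, setting the option on an upstream socket before it connects versus setting it on a socket returned by accept(), can be sketched with plain sockets. This is only an illustration in Python, not Envoy's implementation; the helper names `enable_keepalive`, `keepalive_enabled`, and `demo` are hypothetical.

```python
import socket


def enable_keepalive(sock: socket.socket) -> None:
    """Turn on SO_KEEPALIVE for an existing TCP socket."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)


def keepalive_enabled(sock: socket.socket) -> bool:
    """Report whether SO_KEEPALIVE is currently set on the socket."""
    return sock.getsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE) != 0


def demo() -> tuple:
    """Set keepalive on an outbound socket and on an accepted socket."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    listener.listen(1)

    # Upstream (client) side: the option is applied before connect().
    upstream = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    enable_keepalive(upstream)
    upstream.connect(listener.getsockname())

    # Inbound side: the option is applied to the socket returned by accept().
    accepted, _ = listener.accept()
    enable_keepalive(accepted)

    result = (keepalive_enabled(upstream), keepalive_enabled(accepted))
    for s in (accepted, upstream, listener):
        s.close()
    return result
```

Calling `demo()` exercises both paths over the loopback interface.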
I still have an uncomfortable feeling about this grouping. So far, BindConfig is all the socket address binding config; this is more of a general connection property. Can we make this a new ConnectionOptions message that belongs to Cluster?
FWIW, I'm going to eventually add keepalive settings on listeners too. So having these in a separate message makes sense to me. But we could put this message into a ConnectionOptions.
mattklein123 left a comment:
LGTM other than some small nits and @htuch's question.
docs/root/intro/version_history.rst (outdated)
    * tracing: the sampling decision is now delegated to the tracers, allowing the tracer to decide when and if
      to use it. For example, if the :ref:`x-b3-sampled <config_http_conn_man_headers_x-b3-sampled>` header
      is supplied with the client request, its value will override any sampling decision made by the Envoy proxy.
    * sockets: added `SO_KEEPALIVE` socket option for upstream connections via :ref:`cluster manager wide
envoy/api/v2/core/address.proto (outdated)
    message TcpKeepalive {
      // Maximum number of keepalive probes to send without response before deciding
      // the connection is dead. Default is to use the OS level configuration (unless
      // overridden, Linux defaults to 9.)
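As a rough illustration of the "unset means use the OS default" behaviour described in the message comment, the sketch below maps optional keepalive fields to socket options and skips any field that is not set. It is written in Python rather than Envoy's C++, and the names `TcpKeepalive`, `keepalive_probes`, `keepalive_options`, and `apply_keepalive` are assumptions made for the example; only `keepalive_interval` appears by name in this thread. `TCP_KEEPCNT` and `TCP_KEEPINTVL` are the Linux option names, and platforms that do not expose them are skipped.

```python
import socket
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class TcpKeepalive:
    # Illustrative stand-in for the proto message; None means "not set",
    # so the OS level configuration stays in effect for that field.
    keepalive_probes: Optional[int] = None
    keepalive_interval: Optional[int] = None


def keepalive_options(cfg: TcpKeepalive) -> List[Tuple[str, int]]:
    """Return (option name, value) pairs for the fields that are set."""
    opts = [("SO_KEEPALIVE", 1)]  # presence of the message enables keepalive
    if cfg.keepalive_probes is not None:
        opts.append(("TCP_KEEPCNT", cfg.keepalive_probes))
    if cfg.keepalive_interval is not None:
        opts.append(("TCP_KEEPINTVL", cfg.keepalive_interval))
    return opts


def apply_keepalive(sock: socket.socket, cfg: TcpKeepalive) -> None:
    """Apply the options to a socket, skipping names this platform lacks."""
    for name, value in keepalive_options(cfg):
        opt = getattr(socket, name, None)
        if opt is None:
            continue  # constant not exposed by this platform's socket module
        level = socket.SOL_SOCKET if name == "SO_KEEPALIVE" else socket.IPPROTO_TCP
        sock.setsockopt(level, opt, value)
```

Keeping the option list separate from the `setsockopt` calls makes the "skip unset fields" rule easy to check without opening a socket.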
Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
Allows a cluster to disable keepalive that was enabled by default on the cluster manager. Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
Force-pushed from e20c195 to b60ba8a.
Adds support for configuring TCP Keepalives on upstream connections. Resolves envoyproxy#3028 Requires envoyproxy/data-plane-api#614 Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
envoy/api/v2/core/address.proto (outdated)
      google.protobuf.UInt32Value keepalive_interval = 3;
      // Flag to explicitly disable keepalive probes, allows overriding of settings
      // inherited by a cluster from the cluster manager. Default is false.
      google.protobuf.BoolValue disable = 4;
This should be just bool, as it defaults to false.
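The override semantics proposed at this point in the review (a cluster manager wide default that a cluster can replace or explicitly disable) can be sketched as a small resolution function. This is a hypothetical Python illustration of the behaviour the field comment describes, not code from the PR; `KeepaliveSettings` and `effective_keepalive` are made-up names.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class KeepaliveSettings:
    # Illustrative stand-in for the TcpKeepalive message at this revision.
    keepalive_interval: Optional[int] = None
    disable: bool = False  # plain bool: absent simply means False


def effective_keepalive(
    manager_default: Optional[KeepaliveSettings],
    cluster_override: Optional[KeepaliveSettings],
) -> Optional[KeepaliveSettings]:
    """Pick the settings a cluster would use, or None for no keepalive."""
    # A cluster-level message, when present, replaces the manager-wide one.
    chosen = cluster_override if cluster_override is not None else manager_default
    if chosen is None or chosen.disable:
        return None  # nothing configured, or explicitly switched off
    return chosen
```

With a plain `bool`, a cluster that omits `disable` behaves the same as one that sets it to false, which is why a wrapper type adds nothing here.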
As per code review comments, this doesn't belong with BindConfig. Rather than introduce ConnectionOptions I decided to split it into a specific UpstreamConnectionOptions as I suspect that, other than keepalive, there will be little commonality between listen and upstream connection options in the future.
I didn't really know where to put the TcpKeepalive message though, so left it in address.pb for now. It will be common to listen and upstream connection options.
Changed disabled flag to bool type.
Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
      core.Address source_address = 1;
    }

    message UpstreamConnectionOptions {
LGTM, but I'd rename the message to just ConnectionOptions, since we might want to reuse this elsewhere, e.g. health check connections etc.
I think the health checks will inherit these options from the cluster already in my implementation.
I didn't want to call it ConnectionOptions, as I suspect we'll want a similar, but not identical, message for accept()ed connections.
Perhaps ClusterConnectionOptions?
    // Optional options for upstream connections.
    // This may be overridden on a per-cluster basis by upstream_connection_options in the cds_config.
    envoy.api.v2.UpstreamConnectionOptions upstream_connection_options = 5;
It is preferable to omit this. Envoy doesn't in general do hierarchical config, in the sense of setting an option at a higher level and overriding later.
This has been handled inconsistently (for instance bind config). But unless there's a compelling reason, please remove this (and "bool disable = 4" from the TcpKeepalive message).
OK. I don't have a particularly good excuse for wanting this, and removing it will simplify the code a little, so I'll remove it.
Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
@JonathanO we're going to have to continue this in envoyproxy/envoy post envoyproxy/envoy#2934, which should land today. The upside is that this can be rolled into your existing PR there.
Can you please merge this with envoyproxy/envoy#3042 as per https://groups.google.com/forum/?utm_medium=email&utm_source=footer#!topic/envoy-dev/KcVHFH-zQwQ? Thanks!
Adds TCP Keepalive support for upstream connections. This can be configured on the cluster manager level, and overridden on the cluster level. Risk Level: Medium Testing: Unit tests have been added. It appears to run and work. Docs Changes: envoyproxy/data-plane-api#614 Fixes #3028 API Changes: envoyproxy/data-plane-api#614 Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com>
Adds TCP Keepalive support for upstream connections. This can be configured on the cluster manager level, and overridden on the cluster level. Risk Level: Medium Testing: Unit tests have been added. It appears to run and work. Docs Changes: #614 Fixes envoyproxy/envoy#3028 API Changes: #614 Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com> Mirrored from https://github.com/envoyproxy/envoy @ dd953f99945bb7c6b3251f71bffe252a5f6e9e62
…#3042) Adds TCP Keepalive support for upstream connections. This can be configured on the cluster manager level, and overridden on the cluster level. Risk Level: Medium Testing: Unit tests have been added. It appears to run and work. Docs Changes: envoyproxy/data-plane-api#614 Fixes envoyproxy#3028 API Changes: envoyproxy/data-plane-api#614 Signed-off-by: Jonathan Oddy <jonathan.oddy@transferwise.com> Signed-off-by: Rama <rama.rao@salesforce.com>