
tcp_proxy: wait for CONNECT response before start streaming data #14317

Merged
mattklein123 merged 18 commits into envoyproxy:master from irozzo-1A:wait-for-connect-response on Jan 7, 2021

Conversation

@irozzo-1A (Contributor) commented Dec 8, 2020


Commit Message: The TCP proxy used in tunneling mode now waits for the CONNECT response before starting to send data. Previously, data was transmitted immediately after the CONNECT request.
Additional Description:
Risk Level: medium
Testing: unit test, integration, manual test
Docs Changes: https://github.com/irozzo-1A/envoy/blob/wait-for-connect-response/docs/root/intro/arch_overview/http/upgrades.rst#tunneling-tcp-over-http
Release Notes: https://github.com/irozzo-1A/envoy/blob/wait-for-connect-response/docs/root/version_history/current.rst#minor-behavior-changes
Platform Specific Features: NONE
Runtime guard: http_upstream_wait_connect_response

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@irozzo-1A (Contributor, Author)

Refers to #13293 (comment)

@dio (Member) commented Dec 8, 2020

cc. @wez470 @alyssawilk

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@irozzo-1A (Contributor, Author)

The solution proposed here should be considered a basis for discussion. I think this is not a good approach, but I'm still looking for a solution that does not require too much refactoring. Any ideas are welcome.

@alyssawilk alyssawilk self-assigned this Dec 8, 2020
@alyssawilk (Contributor)

yeah, stashing the params (maybe in a wrapper struct) seems like the only thing I can think of either.

@irozzo-1A (Contributor, Author)

> yeah, stashing the params (maybe in a wrapper struct) seems like the only thing I can think of either.

ok, I'll proceed this way then ;-)

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@alyssawilk (Contributor) left a comment


This looks great - I love how self contained it ended up being.

// Wait a bit, no data should go through.
ASSERT_FALSE(upstream_request_->waitForData(*dispatcher_, 1, std::chrono::milliseconds(100)));

upstream_request_->encodeHeaders(default_response_headers_, false);
Contributor


I think we could have a test for failure modes too - either a disconnect and/or not 200-ok headers.

Contributor Author


Added a test for the non-200 response.

@@ -119,6 +119,7 @@ void HttpUpstream::resetEncoder(Network::ConnectionEvent event, bool inform_down
if (inform_downstream) {
upstream_callbacks_.onEvent(event);
Contributor


Can we end up getting an event callback "from the upstream connection" before a connection is established? That's a bit weird. I think it would be cleaner if, when inform_downstream is true, we report a pool failure when we have a deferrer, and do the onEvent otherwise. WDYT?

@irozzo-1A (Contributor, Author)

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@@ -129,6 +157,9 @@ class HttpUpstream : public GenericUpstream, protected Http::StreamCallbacks {
void decodeHeaders(Http::ResponseHeaderMapPtr&& headers, bool end_stream) override {
if (!parent_.isValidResponse(*headers) || end_stream) {
parent_.resetEncoder(Network::ConnectionEvent::LocalClose);
Contributor


I think we still have a weird corner case in resetEncoder: if inform_downstream is true, we send an upstream event downstream before the pool knows it has an upstream associated.

I think we could simplify this by dropping onGenericPoolFailure below, and instead doing the following in resetEncoder:

    if (inform_downstream) {
      if (deferrer) {
        deferrer.onGenericPoolFailure();
      } else {
        // former logic
      }
    }

I think we always reset the encoder on failure (be it decodeHeaders failing, or a disconnect), and then resetEncoder would take care of making it look like either an event or a pool failure. WDYT? It's definitely worth an integration test and/or unit tests since the timing is tricky.

Contributor Author


Yep, I saw your previous comment and I was still figuring out how to do it. I pushed what I have so far, not tested yet. If you have time to take a look, let me know if I'm on the right path ;-)

@alyssawilk (Contributor) left a comment


Almost there, and thanks for adding the new test!

I'm on vacation starting today through the new year, so I'm going to pass this off to snow/Matt. I hopefully explained the corner case enough they can help you for the few days until the envoy project goes 'on vacation' but if not I'll pick it up when I get back!

@alyssawilk (Contributor)

oh, and given the change to the data plane, we should probably runtime guard this (see CONTRIBUTING.md). Sorry, should have called that out before.

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@irozzo-1A (Contributor, Author)

> I'm on vacation starting today through the new year, so I'm going to pass this off to snow/Matt. I hopefully explained the corner case enough they can help you for the few days until the envoy project goes 'on vacation' but if not I'll pick it up when I get back!

Thx for the info @alyssawilk, enjoy your vacation ;-)

> oh, and given the change to the data plane, we should probably runtime guard this (see CONTRIBUTING.md). Sorry, should have called that out before.

No worries, I'll take a look at this.

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
…t-for-connect-response

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@irozzo-1A irozzo-1A changed the title [WIP] Wait for CONNECT response before start streaming data tcp_proxy: wait for CONNECT response before start streaming data Jan 5, 2021
Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
…t-for-connect-response

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@irozzo-1A (Contributor, Author)

@alyssawilk I think I addressed your points. At the moment the feature flag is enabled by default, let me know if you prefer the opposite.

@alyssawilk (Contributor) left a comment


Looks great! Just a few more thoughts and I think you're good to go.

CONNECT request, and a second one listening on 10001, stripping the CONNECT headers, and forwarding the
original TCP upstream, in this case to google.com.

When runtime flag ``envoy.reloadable_features.http_upstream_wait_connect_response`` is set to ``true``, Envoy waits for
Contributor


I'd suggest removing mentions of the flag and just saying that Envoy will wait.

Contributor


s/wait/now waits/
the envoy. -> the runtime guard envoy.

ASSERT_TRUE(upstream_request_->waitForHeadersComplete());

// Wait a bit, no data should go through.
ASSERT_FALSE(upstream_request_->waitForData(*dispatcher_, 1, std::chrono::milliseconds(100)));
Contributor


Is it possible to move this down, to ensure no data is read before or after the bad headers?

Also I'm surprised this passes given you're running it with and without your change. Shouldn't it fail without your change?

Contributor Author


> is it possible to move this down, to ensure no data is read before or after the bad headers?

Sure, makes sense.

> Also I'm surprised this passes given you're running it with and without your change. Shouldn't it fail without your change?

Actually, I'm running it with my change only:

https://github.com/irozzo-1A/envoy/blob/ae297ffbf644a4ceeea3705787d9d379134754a1/test/integration/tcp_tunneling_integration_test.cc#L858-L860

Signed-off-by: Iacopo Rozzo <iacopo@kubermatic.com>
@alyssawilk (Contributor) left a comment


Awesome! Thanks for your patience getting this landed!

@snowp you up for second pass or should I toss it over to @mattklein123?

@mattklein123 mattklein123 assigned mattklein123 and unassigned snowp Jan 6, 2021
@mattklein123 (Member)

I can take a look since Snow is sick.

@mattklein123 (Member) left a comment


From a code perspective this LGTM. I assume that this has no effect on buffering/watermarking or timeouts because the code already had to correctly handle buffering/watermarking and timeouts while waiting for the pool to be ready? Also, do we have any observability around streams that are connected but waiting for CONNECT? It's not explicitly required but something we might want in the future. cc @alyssawilk

@alyssawilk (Contributor)

Yeah, watermarking already connects the upstream (TCP or HTTP) watermarks to downstream, and vice versa.

Timeouts are mildly interesting because the connection looks established from the connection pool perspective, but the TCP proxy session doesn't know the upstream connection is assigned. Given that we handle the appropriate (wait-for-upstream-connect) timeout in the connection pool, I think the only arguable error is that if you had, say, a 5s keepalive timeout, and you got a 200-OK in 3s and data in 3s, you'd time out where you shouldn't. I think this is orthogonal to this change, but if we cared we could fix it by sending an empty data frame or some other ping to tickle the keepalive when the 200 came in.

@mattklein123 (Member)

> Timeouts are mildly interesting because the connection looks established from the connection pool perspective, but the TCP proxy session doesn't know the upstream connection is assigned. Given that we handle the appropriate (wait-for-upstream-connect) timeout in the connection pool, I think the only arguable error is that if you had, say, a 5s keepalive timeout, and you got a 200-OK in 3s and data in 3s, you'd time out where you shouldn't. I think this is orthogonal to this change, but if we cared we could fix it by sending an empty data frame or some other ping to tickle the keepalive when the 200 came in.

Yeah I think this is fine. I just wanted to make sure we don't somehow lose the timeout entirely.

@mattklein123 mattklein123 merged commit 395ca99 into envoyproxy:master Jan 7, 2021
mpuncel added a commit to mpuncel/envoy that referenced this pull request Jan 8, 2021
* master: (48 commits)
  Resolve 14506, avoid libidn2 for our curl dependency (envoyproxy#14601)
  fix new/free mismatch in Mainthread utility (envoyproxy#14596)
  opencensus: deprecate Zipkin configuration. (envoyproxy#14576)
  upstream: clean up code location (envoyproxy#14580)
  configuration impl: add cast for ios compilation (envoyproxy#14590)
  buffer impl: add cast for android compilation (envoyproxy#14589)
  ratelimit: add dynamic metadata to ratelimit response (envoyproxy#14508)
  tcp_proxy: wait for CONNECT response before start streaming data (envoyproxy#14317)
  stream info: cleanup address handling (envoyproxy#14432)
  [deps] update upb to latest commit (envoyproxy#14582)
  Add utility to check whether the execution is in main thread. (envoyproxy#14457)
  listener: undeprecate bind_to_port (envoyproxy#14480)
  Fix data race in overload integration test (envoyproxy#14586)
  deps: update PGV (envoyproxy#14571)
  dependencies: update cve_scan.py for some libcurl 7.74.0 false positives. (envoyproxy#14572)
  Network::Connection: Add L4 crash dumping support (envoyproxy#14509)
  ssl: remember stat names for configured ciphers. (envoyproxy#14534)
  formatter: add custom date formatting to downstream cert start and end dates (envoyproxy#14502)
  feat(lua): allow setting response body when the upstream response body is empty (envoyproxy#14486)
  Generalize the gRPC access logger base classes (envoyproxy#14469)
  ...

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
