Skip to content

redis: upstream client draining and temp host connection limit and cl…#7104

Merged
mattklein123 merged 18 commits intoenvoyproxy:masterfrom
HenryYYang:msukalski/redirection-hosts
Jul 13, 2019
Merged

redis: upstream client draining and temp host connection limit and cl…#7104
mattklein123 merged 18 commits intoenvoyproxy:masterfrom
HenryYYang:msukalski/redirection-hosts

Conversation

@msukalski
Copy link
Contributor

…eanup

Signed-off-by: Mitch Sukalski mitch.sukalski@workday.com

For an explanation of how to fill out the fields, please see the relevant section
in PULL_REQUESTS.md

Description:

Upstream connection draining has been implemented so cluster topology changes do not cause unnecessary errors for in-flight redis requests. A new connection pool setting, upstream_drain_poll_interval (default 100ms), has been created to control the polling interval to check if draining connections are still active once they've been put aside to drain. Another connection pool setting, max_upstream_unknown_connections (default 100), controls how many connections a worker thread will open to upstream servers that it learns about from redis redirection errors. When hosts are discovered dynamically (e.g., RedisCluster), connections created via redirection are put aside to drain.

Risk Level: low
Testing: unit tests
Docs Changes: new connection pool settings autogenerated from redis_proxy.proto comments
Release Notes:
[Optional Fixes #Issue]
[Optional Deprecated:]

…eanup

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@msukalski
Copy link
Contributor Author

I'm going to add 2 more tests to get full coverage.

…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@mattklein123
Copy link
Member

@HenryYYang for first pass. cc @envoyproxy/redis-dev

@stale
Copy link

stale bot commented Jun 6, 2019

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!

@stale stale bot added the stale stalebot believes this issue/PR has not been touched recently label Jun 6, 2019
@msukalski
Copy link
Contributor Author

@HenryYYang this PR has gone stale, help!

@stale stale bot removed the stale stalebot believes this issue/PR has not been touched recently label Jun 6, 2019
Copy link
Contributor

@HenryYYang HenryYYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for fixing this, a few questions.

Copy link
Contributor

@HenryYYang HenryYYang left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some small nits and adding stats so that we can track it. Then I think we're good.

…voyproxy#7104)

- review feedback changes

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
…tion-hosts

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
…voyproxy#7104)

- fixing tests

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
…voyproxy#7104)

- stats integration test change

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@msukalski
Copy link
Contributor Author

tsan check failed due to some sort of network problem or crash -- I'll try again later with the latest upstream code merged...

…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@msukalski
Copy link
Contributor Author

@HenryYYang changes up and all CI checks passed...

@HenryYYang
Copy link
Contributor

LGTM

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, some high level questions to get started with before I dig into the code in detail. (Also needs a master merge.)

/wait-any

@msukalski
Copy link
Contributor Author

@mattklein123 I'll work on the changes, especially RedisCluster specific statistics, and get them up when I figure it out. Thanks!

…tion-hosts

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@msukalski
Copy link
Contributor Author

ipv6_tests failed due to download failure of six_archive python package; tsan failed due to 1 failing test -- //test/extensions/transport_sockets/tls:ssl_socket_test. I'll have to look into the tsan test failure, but this is not code that I've modified...

@msukalski
Copy link
Contributor Author

envoy-macos build failed with error: The agent: Hosted macOS 3 lost communication with the server. Verify the machine is running and has a healthy network connection.

//test/extensions/transport_sockets/tls:ssl_socket_test passes locally for me in 22 seconds...

…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@msukalski
Copy link
Contributor Author

coverage failed due to g++ internal compiler error compiling test/integration/protocol_integration_test.cc ... will try again with the same code since this all passed earlier yesterday

…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@msukalski
Copy link
Contributor Author

now clang-tidy is failing with

ERROR: An error occurred during the fetch of repository 'com_github_pallets_markupsafe':
java.io.IOException: Error downloading [https://github.com/pallets/markupsafe/archive/1.1.1.tar.gz] to /build/tmp/_bazel_bazel/b570b5ccd0454dc9af9f65ab1833764d/external/com_github_pallets_markupsafe/1.1.1.tar.gz: Read timed out

This CI environment is incredibly unstable, and now infuriating. I've spent most of 2 days trying to get a clean CI run with no code changes needed. It's just been a parade of build problems...

@msukalski
Copy link
Contributor Author

@mattklein123 it doesn't look like it's going to be easy to get a clean CI check. I'm not going to try again until you've had a chance for further review. It's a waste of my time, if I'm going to have to make more changes. Thanks!

Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM with small comment. Thank you!

/wait

…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks with one small test comment.

/wait

@repokitteh-read-only
Copy link

🙀 Error while processing event:

evaluation error
error: cannot load github.com/repokitteh/modules/lib/circleci.star: load("github.com/repokitteh/modules/lib/circleci.star") error: module load error
🐱

Caused by: a #7104 (review) was submitted by @mattklein123.

see: more, trace.

@msukalski
Copy link
Contributor Author

I'll add a test of the new statistic to the MakeRequestToHostWithZeroMaxUnknownUpstreamConnectionLimit test in conn_pool_impl_test.cc

…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@mattklein123
Copy link
Member

@msukalski can you also merge master? There are some CI flake fixes, etc. Thanks.

/wait

…tion-hosts

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
@msukalski
Copy link
Contributor Author

I've merged the latest upstream master. compile_time_options and tsan have both failed with build "connect timed out" errors pulling down external packages....sigh

…voyproxy#7104)

Signed-off-by: Mitch Sukalski <mitch.sukalski@workday.com>
Copy link
Member

@mattklein123 mattklein123 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

@mattklein123 mattklein123 merged commit 80a4ed7 into envoyproxy:master Jul 13, 2019
@msukalski msukalski deleted the msukalski/redirection-hosts branch July 15, 2019 01:50
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants