
Per connection local rate limiting #15843

Merged
htuch merged 374 commits into envoyproxy:main from gokulnair:perconn_rl_15637
May 25, 2021

Conversation

@gokulnair
Contributor

@gokulnair gokulnair commented Apr 5, 2021

Commit Message:
This is a PR for scoping token buckets in the local rate limiting flow on a per connection basis as opposed to scoping it on the entire envoy instance. More details in #15637

Additional Description:
Currently, the HTTP local rate limiter's token bucket is shared across all workers, so the rate limits are applied per Envoy instance/process. This could allow bad actors to quickly exhaust the limits on a given Envoy instance before legitimate users have had a fair chance. Per-connection scoping is achieved by attaching an instance of LocalRateLimit::LocalRateLimiterImpl to each connection object, if there isn't one already, via FilterState data.

Risk Level: Low

Testing:
Added unit tests to local_ratelimit
Manually tested via curl'ing against a locally patched envoy instance.
One can send multiple requests on the same connection via curl using the following:
curl -vI example.com example.com

Docs Changes:
Added new toggle to local rate limit configuration to enable per connection local rate limiting

// Specifies the scope of the rate limiter's token bucket.
// If set to false, the token bucket is shared across all worker threads,
// thus the rate limits are applied per Envoy process.
// If set to true, a token bucket is allocated for each connection,
// thus the rate limits are applied per connection, allowing requests
// to be rate limited on a per-connection basis.
// If unspecified, the default value is false.
bool local_rate_limit_per_downstream_connection = 11;

Sample configuration

typed_config:
  "@type": type.googleapis.com/envoy.extensions.filters.http.local_ratelimit.v3.LocalRateLimit
  stat_prefix: http_local_rate_limiter
  token_bucket:
    max_tokens: 10000
    tokens_per_fill: 1000
    fill_interval: 1s
  filter_enabled:
    runtime_key: local_rate_limit_enabled
    default_value:
      numerator: 100
      denominator: HUNDRED
  filter_enforced:
    runtime_key: local_rate_limit_enforced
    default_value:
      numerator: 100
      denominator: HUNDRED
  response_headers_to_add:
    - append: false
      header:
        key: x-local-rate-limit
        value: 'true'
  local_rate_limit_per_downstream_connection: true

Fixes #15637

@gokulnair gokulnair requested a review from mattklein123 as a code owner April 5, 2021 20:38
@repokitteh-read-only

Hi @gokulnair, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #15843 was opened by gokulnair.


@repokitteh-read-only

CC @envoyproxy/api-shepherds: Your approval is needed for changes made to api/envoy/.
API shepherd assignee is @lizan
CC @envoyproxy/api-watchers: FYI only for changes made to api/envoy/.

🐱

Caused by: #15843 was opened by gokulnair.


@gokulnair
Contributor Author

cc @mattklein123 @rgs1

@gokulnair gokulnair marked this pull request as draft April 5, 2021 20:45
@mattklein123 mattklein123 assigned rgs1 and mattklein123 and unassigned lizan Apr 6, 2021
@gokulnair gokulnair changed the title WIP: Per connection local rate limiting [WIP] Per connection local rate limiting Apr 6, 2021
@rgs1
Member

rgs1 commented Apr 6, 2021

@gokulnair mind addressing the format error:

trailing whitespace: source/extensions/filters/http/local_ratelimit/local_ratelimit.h

I'll review the rest in the meanwhile, thanks!

Member

@rgs1 rgs1 left a comment


Did a first pass, let's also add docs and tests. Thanks!

Member

@rgs1 rgs1 Apr 6, 2021


Let's put a long description here explaining what this knob does, when it might be a good idea to use it, etc, etc.

Contributor Author


Would it make sense to make this name a little more descriptive, such as rate_limiter_per_connection maybe?

Member


Yes, I think that would help. I think rate_limiter_per_connection is good, given that we have connection_pool_per_downstream_connection in the cluster msg definition. Alternatively apply_per_connection is a suggestion if you are unconvinced.

Member


Alternatively, per_connection_local_rate_limiter is also good. I think I like that one the most.

@gokulnair gokulnair marked this pull request as ready for review April 23, 2021 23:44
@gokulnair gokulnair changed the title [WIP] Per connection local rate limiting Per connection local rate limiting Apr 26, 2021
@gokulnair
Contributor Author

Thanks for the reviews! Let me know if there's anything else that's needed to get this going.

@alyssawilk alyssawilk self-assigned this May 20, 2021
Contributor

@alyssawilk alyssawilk left a comment


Oops, I'm sorry about review lag - Matt would usually take the lead here but his leave got unexpectedly extended. I'll pick it up instead!

@gokulnair
Contributor Author

> Oops, I'm sorry about review lag - Matt would usually take the lead here but his leave got unexpectedly extended. I'll pick it up instead!

Thanks @alyssawilk!

Contributor

@alyssawilk alyssawilk left a comment


Looks totally solid! Just a few minor nits, and thanks for your patience both with review lag and me getting a handle on use case :-)

EXPECT_EQ(1U, findCounter("test.http_local_rate_limit.rate_limited"));
}

TEST_F(FilterTest, RequestRateLimitedPerConnection) {
Contributor


WDYT of adding an integration test? I think you can crib off of existing tests in ratelimit_integration_test.cc plus some of the h2 tests, to make sure that if you set a limit of one stream, a second stream on that connection won't (immediately) go upstream but a stream on a separate connection will pass through.

Contributor Author


Sure, I can take a look at adding an integration test for the above scenario. The remaining comments have been addressed and should be good to go.

Contributor Author


Ah! After taking a closer look, it seems like the tests in ratelimit_integration_test.cc all target the "rate limiter as an external service" use case, and it makes complete sense to have integration tests for that.
For local rate limiting within Envoy, however, it doesn't look like we have any integration tests, and I'm wondering whether one would buy us anything beyond what the unit tests cover currently. Thoughts? cc @rgs1

Contributor


I think integration tests generally have value - I wasn't convinced that setting connection data on the per-request StreamInfo would work as expected (there are currently no filters which do this, but I went ahead and tweaked some existing code to verify it worked for my own satisfaction =P). Given the review delay this PR has already suffered and the lack of local rate limit tests to crib off of, I'm inclined to let it go this once, though if you want to do a follow-up you'd totally earn brownie points :-)

Contributor Author


Brownie points it is then :-) I do have a draft up, but I'm fighting the integration test API a bit at the moment, specifically with getting it to send multiple requests over the same connection while still ensuring that the first one generates an upstream request over its fake_upstreams_ while the subsequent locally rate limited ones don't. It's holding this up more than necessary, so a follow-up task makes more sense at this point.

Thanks for taking a look!!

const Filters::Common::LocalRateLimit::LocalRateLimiterImpl& Filter::getRateLimiter() {
if (!decoder_callbacks_->streamInfo().filterState()->hasData<PerConnectionRateLimiter>(
PerConnectionRateLimiter::key())) {

Contributor


nit: extra whitespace?

}

if (config->requestAllowed(descriptors)) {
const bool is_request_allowed = config->rateLimitPerConnection()
Contributor


I'd either put the config->requestAllowed vs. getRateLimiter().requestAllowed switch in Filter::requestAllowed, or rename requestAllowed to indicate it only applies to per-connection-rate-limited requests.

ditto for getRateLimiter: getConnectionRateLimiter? with an assert on config->rateLimitPerConnection()?

Contributor

@antoniovicente antoniovicente left a comment


Looks good, just some minor comments / nits.

// If set to true, a token bucket is allocated for each connection.
// Thus the rate limits are applied per connection thereby allowing
// one to rate limit requests on a per connection basis.
bool local_rate_limit_per_downstream_connection = 11;
Contributor


Is it possible for a proxy to have both per-process and per downstream connection rate limits, or just one or the other?

Another potential question is if we should consider some additional rate limit criteria in the future like downstream IP or HTTP Cookie. If we expect additional criteria in the future, we may want to make this an enum field.

Contributor Author

@gokulnair gokulnair May 20, 2021


It's definitely one or the other as it stands. I'm not sure the added complexity of dealing with potentially conflicting token bucket quotas between the per-process and per-connection configurations, and the precedence rules we'd have to handle, buys us much in the way of functionality.

We did indeed consider an enum initially, but there didn't seem to be many realistic use cases, mainly because a lot of toggles based on request characteristics such as IP, Cookie, etc. can be handled today by rate limiting on request descriptors ...

if (config->requestAllowed(descriptors)) {
const bool is_request_allowed = config->rateLimitPerConnection()
? requestAllowed(descriptors)
: config->requestAllowed(descriptors);
Contributor


Consider moving this branch to the requestAllowed method and call the getRateLimiter() or config requestAllowed method based on config.

It would be fine to pass config as an argument to Filter::requestAllowed

return getRateLimiter().requestAllowed(request_descriptors);
}

const Filters::Common::LocalRateLimit::LocalRateLimiterImpl& Filter::getRateLimiter() {
Contributor


naming nit: getPerConnectionRateLimiter

header:
key: x-local-ratelimited
value: 'true'
local_rate_limit_per_downstream_connection: {}
Contributor


Seems like the {} is used in substitutions below. It would be good to add an end-of-line comment explaining the {}. I think that the yaml comment character is #

Contributor Author

@gokulnair gokulnair May 21, 2021


Added an explanation below.

filter_->decodeHeaders(request_headers, false));
EXPECT_EQ(Http::FilterHeadersStatus::Continue, filter_2_->decodeHeaders(request_headers, false));
EXPECT_EQ(Http::FilterHeadersStatus::StopIteration,
filter_2_->decodeHeaders(request_headers, false));
Contributor


A comment explaining the differences between this test and FilterTest.RequestRateLimited may be helpful for future readers. I think this shows that the limit is applied by connection by verifying that filter_2_ allows requests after filter_ hits the rate limit.

Contributor Author


Added a brief explanation to the test.

@gokulnair
Contributor Author

> Looks good, just some minor comments / nits.

Thanks!

Signed-off-by: Gokul Nair <gnair@twitter.com>
@gokulnair
Contributor Author

I've addressed all comments/nits. Will take a look at adding the integration test shortly.

Signed-off-by: Gokul Nair <gnair@twitter.com>
Contributor

@antoniovicente antoniovicente left a comment


Looks good. I'm sure that the integration test you're looking into would make it even better.

Thanks!

@antoniovicente
Contributor

/retest

@repokitteh-read-only

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit

🐱

Caused by: #15843 (comment) was created by @antoniovicente.


Contributor

@alyssawilk alyssawilk left a comment


LGTM

Could you please update the PR description with the appropriate fields (testing: unit tests, docs: inline etc) and then I'll let @antoniovicente merge when he's set.


@antoniovicente
Contributor

> LGTM
>
> Could you please update the PR description with the appropriate fields (testing: unit tests, docs: inline etc) and then I'll let @antoniovicente merge when he's set.

I don't have remaining comments. The remaining question is whether or not integration tests are in the works as part of this PR or a follow-up. A follow-up makes the most sense to me at this point; we should go ahead and merge.

@gokulnair gokulnair closed this May 24, 2021
@gokulnair gokulnair reopened this May 24, 2021
@gokulnair
Contributor Author

gokulnair commented May 24, 2021

> I don't have remaining comments. The remaining question is whether or not integration tests are in the works as part of this PR or a follow-up. A follow-up makes the most sense to me at this point; we should go ahead and merge.

I do agree that a follow-up task makes the most sense at this time (I do have a draft up but it's still WIP).
Also, I've updated the PR description with more details. Looks like we should be good to go for the merge but let me know if there's anything else that's needed from my end.
Thanks! 👍

@gokulnair
Contributor Author

gokulnair commented May 24, 2021

/lgtm api

@htuch Mind approving the api change once again, please? (we updated the doc in the proto file)
Thanks!

Member

@htuch htuch left a comment


/lgtm api

@htuch htuch merged commit 151aa0c into envoyproxy:main May 25, 2021
leyao-daily pushed a commit to leyao-daily/envoy that referenced this pull request Sep 30, 2021

Labels

deps (Approval required for changes to Envoy's external dependencies), v2-freeze
