Linear ramping and probabilistic ramping #218

Merged
mum4k merged 44 commits into envoyproxy:master from
oschaaf:ramping-linear-rate-limiter
Jan 3, 2020

Conversation

@oschaaf
Member

@oschaaf oschaaf commented Nov 27, 2019

Contains new RateLimiter features:

  • A linear rate limiter, which ramps up to a target frequency over a configurable period.
    Precise and relatively easy to reason about.
  • A probabilistic rate limiter filter, which gradually opens up over a configurable period.
    Useful for ramping up when wrapping an underlying rate limiter that is non-deterministic,
    and possibly in distributed scenarios as well, to add some entropy / jitter.

This needs a bit of polish, but ideally we discuss this early so we can cherry pick what we like
from this and possibly discard what we don't.

TODO:

In a follow up we can wire up options for these new rate limiters.
(May also be done in a follow-up PR instead).

In conjunction with the upcoming Phases this would allow implementing a better warmup phase, which ramps up to the targeted frequency.

Groundwork for #31

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Prerequisite to envoyproxy#217

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf oschaaf changed the title [DRAFT] Ramping linear rate limiter [DRAFT] Rate limiter enhancements & features Dec 6, 2019
…te-limiter

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
oschaaf added a commit to oschaaf/nighthawk that referenced this pull request Dec 6, 2019
Modulo an improved test for the batching rate limiter,
this pulls off some refactoring for upcoming new rate
limiters.

Split off from draft PR envoyproxy#218

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
…te-limiter

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
htuch pushed a commit that referenced this pull request Dec 12, 2019
Modulo an improved test for the batching rate limiter,
this pulls off some refactoring for upcoming new rate
limiters.

This has no functional changes, and is an intermediate step for draft PR #218,
which contains new rate limiters (linear & probabilistic ramping, first stab at a zipf distribution support).

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
…te-limiter

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
…-linear-rate-limiter

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Collaborator

@mum4k mum4k left a comment

Partial review to start the discussion.

class ZipfRateLimiterImpl : public FilteringRateLimiterImpl {
public:
/**
* From the absl header associated to the zipf distribution:
Collaborator

Can we also try to say what q and v are? Either something like "The parameters v and q determine the skew of the distribution." (also copied from the header).

Or copy the relevant portion from the zipf_distribution class's comment:

zipf_distribution produces random integer-values in the range [0, k], distributed according to the discrete probability function:

P(x) = (v + x) ^ -q
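To make the role of q and v concrete, here is a minimal sketch that tabulates the probabilities implied by the quoted formula for x in [0, k]. The function name `zipfProbabilities` is hypothetical, and dividing by the sum of the weights (so the values add up to 1) as well as the asserted parameter ranges are assumptions of this sketch, not something taken from the header.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, for illustration only.
// Returns P(x) for x in [0, k], using the weight (v + x)^-q from the quoted
// comment. Normalizing by the sum of the weights is an assumption.
std::vector<double> zipfProbabilities(int k, double q, double v) {
  // Assumed valid ranges for this sketch.
  assert(k >= 0 && q > 1.0 && v > 0.0);
  std::vector<double> probabilities(k + 1);
  double sum = 0.0;
  for (int x = 0; x <= k; ++x) {
    probabilities[x] = std::pow(v + x, -q);
    sum += probabilities[x];
  }
  for (double& probability : probabilities) {
    probability /= sum;
  }
  return probabilities;
}
```

A larger q makes the distribution fall off faster, while a larger v flattens the difference between small values of x.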

* If NDEBUG is defined and either or both of these parameters take invalid
* values, the behavior of the class is undefined.
*/
ZipfRateLimiterImpl(RateLimiterPtr&& rate_limiter, bool deterministic, double q = 2.0,
Collaborator

(optional) Not sure what we prefer in this codebase. I am generally wary of boolean "flag" arguments since they produce client code that is hard to read.

E.g:
ZipfRateLimiterImpl(rate_limiter, true, v, q) // what does true mean?

Of course the client can name it by creating a local variable. We could consider shaping the API so that unreadable usage isn't possible, say by defining a well-named enum instead.

E.g.:

enum class ZipfBehavior { ZIPF_DETERMINISTIC, ZIPF_NON_DETERMINISTIC };

// Or maybe a bit more transparent:
enum class ZipfBehavior { ZIPF_PSEUDO_RANDOM, ZIPF_RANDOM };

WDYT?
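A small sketch of how the enum reads at a call site, compared with a bare `true`. The helper `seedFor` and its seeding policy are hypothetical and only serve to show the enum being consumed; they are not part of this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

enum class ZipfBehavior { ZIPF_PSEUDO_RANDOM, ZIPF_RANDOM };

// Hypothetical helper: picks a generator seed based on the requested behavior.
inline std::uint32_t seedFor(ZipfBehavior behavior) {
  switch (behavior) {
  case ZipfBehavior::ZIPF_PSEUDO_RANDOM:
    return 0; // Fixed seed: every run produces the same sequence.
  case ZipfBehavior::ZIPF_RANDOM:
    return std::random_device{}(); // Different sequence on each run.
  }
  assert(false && "unhandled ZipfBehavior");
  return 0;
}

// At the call site the intent is visible without looking up the signature:
//   seedFor(ZipfBehavior::ZIPF_PSEUDO_RANDOM);
// versus a flag argument, where the reader has to guess what `true` means.
```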

Member Author

I like that; +1

Member Author

Maybe we should have a place to put code-level guidance like this?

/**
* Thin wrapper around absl::zipf_distribution that will pull zeroes and ones from the distribution
* with the intent to probabilistically suppress the wrapped rate limiter.
* This may need further consideration, because it will shoot holes in the pacing, lowering the
Collaborator

Did you have any pre-existing discussions on this with @htuch? Is this going to be an issue considering the use cases of this rate limiter? What are the use cases of this rate limiter?

@mum4k mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Dec 23, 2019
…te-limiter

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf oschaaf changed the title Linear ramping, probabilistic ramping, Zipf distributing Linear ramping and probabilistic ramping Dec 23, 2019
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Dec 23, 2019
- Back out 1 ns adjustment
- Update test to use microsecond step resolution now that we can
- Add comment with explanation

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Collaborator

@mum4k mum4k left a comment

Complete review now.

};

/**
* A rate limiter which linearly ramps up to the desired frequency over the specified period.
Collaborator

(nit) Would it make sense to sync the terminology? The comment refers to "the specified period" while the argument is called ramp_time.

100.0 - (static_cast<double>(ramp_time_.count() - elapsed().count()) /
(ramp_time_.count() * 1.0)) *
100.0;
return std::round(provider_->getValue() / 10000.0) <= chance_percentage;
Collaborator

I am trying to understand the fixed division by 10k. It seems to imply some expectations about the discrete numeric distribution provider that don't seem to be specified on the API.

This may be related to my comment that we should add more documentation next to the constructor for the GraduallyOpeningRateLimiterFilter that will explain the arguments and their roles.

}

GraduallyOpeningRateLimiterFilter::GraduallyOpeningRateLimiterFilter(
const std::chrono::nanoseconds ramp_time, DiscreteNumericDistributionSamplerPtr&& provider,
Collaborator

Should we validate some of the arguments?

// and the rate limiter computes at nanosecond precision internally. As we
// want to have microsecond level precision, this should be more than sufficient.
EXPECT_EQ(getAcquisitionTimings(5_Hz, 5s),
std::vector<int64_t>({1000050, 1732100, 2236100, 2645800, 3000000, 3316650, 3605600,
Collaborator

Apart from testing, tests are often used by readers to understand what the code does (i.e., as additional documentation). It feels like the values we chose for these tests don't really help the reader. Additionally, the unit of these values isn't obvious since the last change to microseconds.

I am wondering if we could choose different parameters than 5Hz over 5sec or maybe add test cases with other parameters that will be easier / more obvious for the reader to comprehend. We don't want to decrease test coverage, but we should aim towards self-documenting and readable test cases. Can we try to find a balance?

Member Author

@mum4k I couldn't really obtain easy-to-grok numbers by picking different parameter values here; but how about this: 9d248f2 ?

Instead of matching expectations against hard-coded numbers, it uses the second law of motion to compute control acquisition timings for comparison, and adds a little slack when verifying expectations, with a comment explaining the need for that. I'm hoping that will help future readers of this code. wdyt, will this be better?

(As a side-effect, this approach also allowed adding tests with higher frequencies / duration arguments)
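For readers who want to see how such control timings can be derived, here is a minimal sketch. It assumes the rate ramps linearly from 0 to the target frequency, so the expected number of acquisitions at time t is a·t²/2 with a = frequency / ramp time. The half-unit offset (the n-th acquisition lands where a·t²/2 reaches n − 0.5) is inferred from the hard-coded expectations quoted in this thread and is an assumption; the function name is hypothetical and this is not the code from the referenced commit.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper, for illustration only.
// Returns the control acquisition times, in seconds, during a linear ramp
// from 0 Hz up to frequency_hz over ramp_seconds.
std::vector<double> controlAcquisitionTimings(double frequency_hz, double ramp_seconds) {
  assert(frequency_hz > 0.0 && ramp_seconds > 0.0);
  // Rate of change of the frequency, in Hz per second.
  const double acceleration = frequency_hz / ramp_seconds;
  std::vector<double> timings;
  for (int n = 1;; ++n) {
    // Solve acceleration * t^2 / 2 = n - 0.5 for t (assumed rounding convention).
    const double t = std::sqrt((2.0 * n - 1.0) / acceleration);
    if (t > ramp_seconds) {
      break;
    }
    timings.push_back(t);
  }
  return timings;
}
```

For 4 Hz over 2 s this gives roughly 0.7071, 1.2247, 1.5811 and 1.8708 seconds, which is within the clock step of the quoted expectations 707150, 1224750, 1581150 and 1870850 microseconds. That small difference is why some slack is needed when comparing.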

return rate_limiter_->tryAcquireOne() ? filter_() : false;
}

GraduallyOpeningRateLimiterFilter::GraduallyOpeningRateLimiterFilter(
Collaborator

(fyi) Now that I read the code and understand the implementation here, I have to say that this is truly beautiful, including your use of class inheritance for the ForwardingRateLimiter.

I have learned something today and I appreciate it, thanks.
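For context, the forwarding-plus-filter pattern discussed here can be sketched as follows. The class names and the reduced single-method interface are hypothetical simplifications; only the `tryAcquireOne()` expression mirrors the line quoted from the diff.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <utility>

// Simplified stand-in for the rate limiter interface (assumption).
class RateLimiter {
public:
  virtual ~RateLimiter() = default;
  virtual bool tryAcquireOne() = 0;
};

// Trivial limiter that always grants, used here only to drive the wrapper.
class AlwaysOpenRateLimiter : public RateLimiter {
public:
  bool tryAcquireOne() override { return true; }
};

// Wraps another limiter and applies an extra predicate on top of it.
class FilteringRateLimiter : public RateLimiter {
public:
  FilteringRateLimiter(std::unique_ptr<RateLimiter> rate_limiter, std::function<bool()> filter)
      : rate_limiter_(std::move(rate_limiter)), filter_(std::move(filter)) {
    assert(rate_limiter_ != nullptr && filter_ != nullptr);
  }

  // The filter is consulted only when the wrapped limiter grants an acquisition.
  bool tryAcquireOne() override { return rate_limiter_->tryAcquireOne() ? filter_() : false; }

private:
  std::unique_ptr<RateLimiter> rate_limiter_;
  std::function<bool()> filter_;
};
```

Because the wrapper is itself a rate limiter, filters can be stacked on any underlying limiter without that limiter knowing about them.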

Member Author

That is nice to hear, thanks!

@mum4k mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Dec 24, 2019
- Check expectations on the distribution argument to
GraduallyOpeningRateLimiterFilterTest
- Clean up the computation in GraduallyOpeningRateLimiterFilter
- Add a bunch of explanatory comments

What's left is getting easy to grok numbers in test expectations.

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Instead of matching hard-coded expectations for release timings, use the second
law of motion as a control method for verifying the acquisition timings
of the rate limiter.

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf
Member Author

oschaaf commented Dec 29, 2019

@mum4k I think I addressed all comments, so marking this as ready-for-review. Thanks for the review!

@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Dec 29, 2019
public:
virtual ~DiscreteNumericDistributionSampler() = default;
virtual uint64_t getValue() PURE;
virtual uint64_t min() const PURE;
Collaborator

Can we document the new methods?

(optional / unrelated) Since we are here, can we document the entire interface?
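As an illustration of what a documented version of this interface could look like, here is a sketch with a simple uniform implementation. The doc comments describe assumed contracts, `= 0` stands in for the `PURE` macro, and `UniformSampler` is a hypothetical class, not the one in this PR.

```cpp
#include <cassert>
#include <cstdint>
#include <random>

// Sketch of a documented sampler interface (the contracts are assumptions).
class DiscreteNumericDistributionSampler {
public:
  virtual ~DiscreteNumericDistributionSampler() = default;
  /** @return the next sample; always within [min(), max()]. */
  virtual std::uint64_t getValue() = 0;
  /** @return the smallest value getValue() can return. */
  virtual std::uint64_t min() const = 0;
  /** @return the largest value getValue() can return. */
  virtual std::uint64_t max() const = 0;
};

// Hypothetical implementation drawing uniformly from [min, max].
class UniformSampler : public DiscreteNumericDistributionSampler {
public:
  UniformSampler(std::uint64_t min, std::uint64_t max, std::uint64_t seed)
      : min_(min), max_(max), generator_(seed) {
    assert(min <= max);
  }
  std::uint64_t getValue() override {
    return std::uniform_int_distribution<std::uint64_t>(min_, max_)(generator_);
  }
  std::uint64_t min() const override { return min_; }
  std::uint64_t max() const override { return max_; }

private:
  const std::uint64_t min_;
  const std::uint64_t max_;
  std::mt19937_64 generator_;
};
```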

1.0 - static_cast<double>(ramp_time_.count() - elapsed_time.count()) /
(ramp_time_.count() * 1.0);
// Get a random number r, where 0 < r ≤ 1.
const double random_between_0_and_1 = 1.0 * provider_->getValue() / provider_->max();
Collaborator

Thank you, this is easier to follow with the new comments and the min() and max() methods.
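The gate computation quoted above can be sketched as a free function, to show how the opening chance and the normalized sample interact. The function name is hypothetical; the final `<=` comparison and the "fully open once the ramp has elapsed" shortcut are assumptions, since the quoted snippet does not show those lines.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Hypothetical helper, for illustration only.
// sample is expected in [1, sample_max], so the normalized value r satisfies 0 < r <= 1.
bool gradualGateIsOpen(std::chrono::nanoseconds elapsed_time, std::chrono::nanoseconds ramp_time,
                       std::uint64_t sample, std::uint64_t sample_max) {
  assert(ramp_time.count() > 0 && sample >= 1 && sample <= sample_max);
  if (elapsed_time >= ramp_time) {
    return true; // Assumed: the filter is fully open after the ramp.
  }
  // Chance of letting an acquisition through; grows linearly from 0 to 1 over the ramp.
  const double chance =
      1.0 - static_cast<double>(ramp_time.count() - elapsed_time.count()) /
                static_cast<double>(ramp_time.count());
  const double random_between_0_and_1 =
      static_cast<double>(sample) / static_cast<double>(sample_max);
  return random_between_0_and_1 <= chance; // Assumed comparison.
}
```

Halfway through a 10 s ramp the chance is 0.5, so a sample of 500000 out of 1000000 passes while 500001 does not.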

public:
std::vector<int64_t> getAcquisitionTimings(const Frequency frequency,
const std::chrono::seconds duration) {
void checkAcquisitionTimings(const Frequency frequency, const std::chrono::seconds duration) {
Collaborator

Can we document this helper? Specifically its arguments.

3873000, 4123150, 4358900, 4582600, 4795850, 5000000}));
EXPECT_EQ(getAcquisitionTimings(4_Hz, 2s),
std::vector<int64_t>({707150, 1224750, 1581150, 1870850}));
checkAcquisitionTimings(1_Hz, 3s);
Collaborator

This is indeed much better for readability, but I am debating with myself whether it decreases our coverage. Could there later be a bug that would go undetected? Should we add at least one test case that verifies the exact values as before? WDYT?

Member Author

yeah this makes sense, done (f480248)

@mum4k mum4k added waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. and removed waiting-for-review A PR waiting for a review. labels Jan 3, 2020
…te-limiter

Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
Signed-off-by: Otto van der Schaaf <oschaaf@we-amp.com>
@oschaaf
Member Author

oschaaf commented Jan 3, 2020

@mum4k added f480248 based on your feedback, thanks!

@oschaaf oschaaf added waiting-for-review A PR waiting for a review. and removed waiting-for-changes A PR waiting for comments to be resolved and changes to be applied. labels Jan 3, 2020
@mum4k mum4k merged commit 9b45e0c into envoyproxy:master Jan 3, 2020

Labels

P3 waiting-for-review A PR waiting for a review.


3 participants