Conversation
Signed-off-by: Tony Allen <tony@allen.gg>
Signed-off-by: Tony Allen <tallen@lyft.com>
Signed-off-by: Lizan Zhou <lizan@tetrate.io>
This reverts commit 1976891. Signed-off-by: Tony Allen <tony@allen.gg>
/wait-any

Windows CI passed. Removing wait.

@snowp it came up in conversation that you might be interested in taking a look at this. No expectation, just FYI.
mattklein123 left a comment:
Sorry for the delay. Fleshing out some comments. I think this can be split into multiple PRs. Can you do the controller and unit tests, the evaluators, etc.? It would be good to flesh out the TLS/global controller question first, though. Thank you, this is awesome!
/wait
// indicate a failed request must be explicitly specified if not relying on the default
// values.
message DefaultEvaluationCriteria {
  // If HTTP statuses are unspecified, defaults to 5xx.
Can you comment on valid ranges here and make sure we have code checks for them?
// If HTTP statuses are unspecified, defaults to 5xx.
repeated type.v3.Int32Range http_status = 1;
// GRPC status codes to consider as request successes. If unspecified, defaults to "OK".
Is there any link that we can have for the code to uint mapping?
// The time window over which the success rate is calculated. The window is rounded to the nearest
// second. Defaults to 120s.
Can you clarify that this is a sliding window?
buildHeaders(final_headers, trailers);
int rc =
    nghttp2_submit_trailer(parent_.session_, stream_id_, &final_headers[0], final_headers.size());
int rc = nghttp2_submit_trailer(parent_.session_, stream_id_, final_headers.data(),
This was failing in CI because of some scenario where final_headers had a size of 0, so it was causing undefined behavior. Calling vector::data() gives us the same information, but circumvents the undefined behavior in the empty vector case.
I have no idea how my change could have caused a regression here, since I don't really touch anything unrelated to the new filter, so I just assumed it would be fine to do this since it's a better way of getting that pointer.
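The distinction is easy to demonstrate in isolation. A minimal sketch (the `header_ptr` helper is a hypothetical stand-in, not code from the PR):

```cpp
#include <vector>

// For an empty vector, v[0] (and therefore &v[0]) is undefined behavior,
// while v.data() is well defined: it may return a null or otherwise
// non-dereferenceable pointer, which is exactly what a C-style (ptr, len)
// API such as nghttp2_submit_trailer needs when the length is 0.
const int* header_ptr(const std::vector<int>& v) {
  return v.data(); // safe even when v.empty()
}
```

For non-empty vectors, `v.data()` and `&v[0]` yield the same pointer, so the change is behavior-preserving in the common case.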
virtual ~AdmissionControlFilterConfig() = default;

virtual ThreadLocalController& getController() const {
  return tls_->getTyped<ThreadLocalControllerImpl>();
I don't remember if/when we discussed this, but what's the tradeoff for a global controller vs. per worker? Would it be better to go with global for better accuracy similar to what we do for adaptive concurrency?
Adaptive concurrency required global control because of the way it took measurements. We needed high levels of accuracy because of the percentile calculations; we would be in danger of missing tail latencies. We also needed to globally limit the concurrency across all worker threads, since limiting concurrency on one thread would no longer mean decreased load on the local host. It would almost be the same as using adaptive concurrency on the egress. There are also complications around synchronization of the measurement windows, etc. Global just made more sense.
For this filter, the measurements aren't as finicky, and making it thread-local would simply decrease the number of samples. This is fine, since there's not really a control loop that other threads can interfere with (not sure if I'm using that term correctly, but I mean we're not making a change and then measuring its effect). The decision-making here can be decentralized and we still get roughly the same results.
The benefit of thread-local controllers is obviously that we don't need to take any locks on the deque/structures that store the sampling history. The drawbacks are that it's a bit more complicated to implement and we need a copy of the sampling history deque/structures for each thread. I decided that the memory overhead is minimal, so it's worth making it thread-local. In cases where memory utilization might raise an eyebrow, we'd be at high risk of lock contention anyway.
That's fine. Add some comments.
I think one potential drawback with TLS is that incoming traffic isn't necessarily evenly distributed between the workers, so it could result in cases where overall we're rejecting some % of traffic, but individual clients that are connected to workers that haven't seen much traffic might be able to pass through without any rejections.
You could imagine 3 clients: clients 1 and 2 are connected to worker A, and client 3 to worker B. Client 1 starts sending traffic that's failing, triggering throttling. Clients 2 and 3 are both sending identical traffic, but only client 2 sees throttling, because it happened to be handled by the worker that is also handling client 1.
That said I think TLS is fine assuming we document the behavior.
// To give a new value to the Cleanup() variable, we must cancel it first to avoid triggering
// the function it wraps.
deferred_record_failure_->cancel();
It's been quite a while since I wrote that piece, but I recall that it was happening in the unit tests. It couldn't happen in practice.
// Status code must be sent in trailers.
ASSERT(grpc_status.has_value());
No guarantee the upstream does this. Don't use ASSERT here and handle errors (and test).
My reason for this was that I read this part in the gRPC over HTTP2 protocol (emphasis mine):
Response-Headers & Trailers-Only are each delivered in a single HTTP2 HEADERS frame block. Most responses are expected to have both headers and trailers but Trailers-Only is permitted for calls that produce an immediate error. Status must be sent in Trailers even if the status code is OK.
I interpreted that to mean that we can always expect a status. This should definitely be handled differently; I'm just unsure of what to do in this case. Maybe return an INVALID_ARGUMENT status?
I would synthesize a failure value and just increment it (or something like that). It doesn't really matter.
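A hedged sketch of that suggestion: treat a missing grpc-status trailer as a synthesized failure instead of asserting. The `Controller` and `recordGrpcOutcome` names here are hypothetical stand-ins, not the PR's actual API:

```cpp
#include <cstdint>
#include <optional>

// Hypothetical stand-in for the filter's per-thread controller.
struct Controller {
  uint64_t successes{0};
  uint64_t failures{0};
  void recordSuccess() { ++successes; }
  void recordFailure() { ++failures; }
};

constexpr uint64_t kGrpcOk = 0; // "OK" is 0 in the gRPC status code mapping

void recordGrpcOutcome(Controller& c, std::optional<uint64_t> grpc_status) {
  if (!grpc_status.has_value()) {
    // The upstream violated the spec (no status in trailers): count it as a
    // failure rather than crashing on an ASSERT.
    c.recordFailure();
    return;
  }
  *grpc_status == kGrpcOk ? c.recordSuccess() : c.recordFailure();
}
```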
deferred_record_failure_ =
    std::make_unique<Cleanup>([this]() { config_->getController().recordFailure(); });
Perf nit: this really shouldn't need a heap allocation. Since this is per filter, I'm pretty sure a Cleanup can't already exist, so can you just allocate a Cleanup inline as a member in the constructor?
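To illustrate the suggestion, here is a minimal stand-in for Envoy's `Cleanup` helper (a RAII callback that runs on destruction unless cancelled) held by value as a member, avoiding the per-request `std::make_unique` allocation. This is a sketch, not the real Envoy class:

```cpp
#include <functional>
#include <utility>

// Minimal stand-in for Envoy's Cleanup helper: runs a callback on
// destruction unless cancelled.
class Cleanup {
public:
  explicit Cleanup(std::function<void()> f) : f_(std::move(f)) {}
  ~Cleanup() {
    if (!cancelled_) {
      f_();
    }
  }
  void cancel() { cancelled_ = true; }

private:
  std::function<void()> f_;
  bool cancelled_{false};
};

// Holding the Cleanup by value (initialized in the filter's constructor)
// means no heap allocation is needed per request.
struct Filter {
  int failures{0};
  Cleanup deferred_record_failure_{[this]() { ++failures; }};
};
```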
const envoy::extensions::filters::http::admission_control::v3alpha::AdmissionControl& config,
const std::string& stats_prefix, Server::Configuration::FactoryContext& context) {

std::string prefix = stats_prefix + "admission_control.";
response_evaluator = std::make_unique<DefaultResponseEvaluator>(config.default_eval_criteria());
break;
case AdmissionControlProto::EvaluationCriteriaCase::EVALUATION_CRITERIA_NOT_SET:
default:
I don't think you need a default?
I don't, but this catches a scenario where a new evaluator is implemented and configured, but not handled here. Otherwise, we'd catch it by dereferencing a null pointer somewhere.
You will only allow the compiler to catch it if you remove default, which is what you want.
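The exhaustiveness point can be sketched outside of Envoy. With `-Werror=switch` (or an equivalent warning flag), omitting `default` makes the compiler flag any newly added enumerator that isn't handled, whereas a `default` branch silently swallows it. The enum and function below are hypothetical stand-ins:

```cpp
// Hypothetical stand-in for the generated proto oneof case enum.
enum class EvaluationCriteriaCase { kDefaultCriteria, kNotSet };

int pick(EvaluationCriteriaCase c) {
  switch (c) {
  case EvaluationCriteriaCase::kDefaultCriteria:
    return 1;
  case EvaluationCriteriaCase::kNotSet:
    return 0;
    // No default: adding a new enumerator now produces a -Wswitch
    // warning/error here until this switch is updated.
  }
  return -1; // unreachable with the current enumerators
}
```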
/wait-any

/wait-any

This pull request has been automatically marked as stale because it has not had activity in the last 7 days. It will be closed in 7 days if no further activity occurs. Please feel free to give a status update now, ping for review, or re-open when it's ready. Thank you for your contributions!
snowp left a comment:
Thanks, this looks pretty good to me. A few comments from looking over the implementation.
// We default to all 5xx codes as request failures.
http_success_fns_.emplace_back([](uint64_t status) { return status < 500; });
These two lines together got me confused: the comment talks about what we consider request failures, but the next line defines the success condition instead. Maybe reword the comment?
  grpc_success_codes_.emplace_back(status);
  }
} else {
  grpc_success_codes_ = decltype(grpc_success_codes_)({
I would think you can elide the type here? AFAIK it would resolve to the initializer_list ctor.
The docs say you default to OK; this has much more than that.
private:
  std::vector<std::function<bool(uint64_t)>> http_success_fns_;
  std::vector<uint64_t> grpc_success_codes_;
You probably want this to be uint32_t; both enumToInt and the extracted status are 32-bit.
public ThreadLocal::ThreadLocalObject {
public:
  ThreadLocalControllerImpl(TimeSource& time_source, std::chrono::seconds sampling_window);
  ~ThreadLocalControllerImpl() override = default;

void ThreadLocalControllerImpl::maybeUpdateHistoricalData() {
  // Purge stale samples.
  while (!historical_data_.empty() && ageOfOldestSample() >= sampling_window_) {
If you want to save some cycles here, I think you can compute the cutoff timestamp once and compare each sample with that, instead of computing the diff between sampling time and current time for each sample. Might not matter; it depends on how many samples are ejected at a time.
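A rough sketch of that suggestion (the `Sample`/`purgeStale` names are hypothetical, not the PR's code): subtract the window from `now` once, then compare each sample's timestamp against the precomputed cutoff.

```cpp
#include <chrono>
#include <deque>

using Clock = std::chrono::steady_clock;

// Hypothetical stand-in for an entry in the sampling history.
struct Sample {
  Clock::time_point time;
  int value;
};

void purgeStale(std::deque<Sample>& history, Clock::time_point now,
                std::chrono::seconds sampling_window) {
  // Cutoff computed once, outside the loop; each iteration is then a single
  // time_point comparison rather than a per-sample age calculation.
  const Clock::time_point cutoff = now - sampling_window;
  while (!history.empty() && history.front().time < cutoff) {
    history.pop_front();
  }
}
```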
// It's possible we purged stale samples from the history and are left with nothing, so it's
// necessary to add an empty entry. We will also need to roll over into a new entry in the
// historical data if we've exceeded the time specified by the granularity.
It's not obvious to me how this granularity comes into play; maybe add some detail?
@mattklein123 requested this PR be split, so I'm closing this one out and submitting another that incorporates the current comments and omits the thread-local controller.
This is an implementation of success-rate based admission control via an HTTP filter. Based on the success rate, the filter will probabilistically reject outbound requests. Motivation and some background on this filter's calculations can be found in #9658.
The filter tracks the request success rate on each worker thread and uses that information to reject requests with a probability dictated by the success rate (over some rolling window) before forwarding to the upstream. This removes the need for any locks such as those found in the local ratelimit filter and the adaptive concurrency filter.
To avoid taking any locks, the success rate is calculated per thread. We track the total request count and the total number of successes for use in the rejection probability calculation described in #9658.
These values are accumulated each second and inserted into a deque that contains the per-second accumulated values for the entire rolling window. This allows us to efficiently phase out stale success-rate data as it falls out of the sliding window.
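As a rough illustration of this scheme (not the PR's actual code): each second's request and success counts go into a bucket on a deque, and buckets older than the rolling window are popped from the front. The rejection-probability formula below is the well-known client-side throttling formula, `max(0, (total - K * successes) / (total + 1))`, used here as a stand-in; the PR's exact formula is described in #9658, and the class and names are hypothetical.

```cpp
#include <algorithm>
#include <cstdint>
#include <deque>

// One second's worth of accumulated counts.
struct Bucket {
  uint64_t second; // timestamp of the bucket, in whole seconds
  uint64_t requests{0};
  uint64_t successes{0};
};

class ThreadLocalSketchController {
public:
  explicit ThreadLocalSketchController(uint64_t window_s) : window_s_(window_s) {}

  void record(uint64_t now_s, bool success) {
    // Start a new bucket when the clock rolls into a new second.
    if (history_.empty() || history_.back().second != now_s) {
      history_.push_back(Bucket{now_s});
    }
    ++history_.back().requests;
    if (success) {
      ++history_.back().successes;
    }
    // Phase out buckets that fell outside the sliding window.
    while (!history_.empty() && now_s - history_.front().second >= window_s_) {
      history_.pop_front();
    }
  }

  double rejectionProbability() const {
    uint64_t total = 0, ok = 0;
    for (const Bucket& b : history_) {
      total += b.requests;
      ok += b.successes;
    }
    constexpr double kAggression = 1.0; // hypothetical tuning knob
    const double p = (static_cast<double>(total) - kAggression * ok) / (total + 1.0);
    return std::max(0.0, p);
  }

private:
  const uint64_t window_s_;
  std::deque<Bucket> history_;
};
```

Because each worker thread owns its own instance, no locks are needed, at the cost of each thread seeing only its own slice of the traffic (the distribution caveat snowp raised above).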
Progress on #9658 and a resurfacing of the work in #10230.