thrift: supported setting max requests for per downstream connection #15125
zuercher merged 9 commits into envoyproxy:main from
Conversation
Signed-off-by: wbpcode <comems@msn.com>
I feel that this requirement is not universal. Why not do this by removing the listener from the control plane? @wbpcode

That's also an option. However, if there are multiple dubbo services in the cluster, some of them may have their own dedicated listeners (such as 1.2.3.4:10000), but there may still be traffic for other dubbo services that needs to be proxied through 0.0.0.0:10000. In addition, we have a similar configuration for upstream connections, and I personally don't see any harm in applying the same restriction to downstream connections. 😄

@rgs1 Would you mind helping to give this PR a review or some suggestions? 😃
std::list<ThriftFilters::FilterFactoryCb> filter_factories_;
const bool payload_passthrough_;

uint64_t max_requests_per_connection_ = UINT64_MAX;
nit:
uint64_t max_requests_per_connection_{};
initializing this to zero and skipping the limit check when zero is probably easier (and more common across the code base).
TimeSource& time_source_;

// The number of requests accumulated on the current connection. A connection is processed by only
// one thread, so there is no need to consider the thread safety of the count.
I think we can drop the 2nd sentence since it's a common assumption that conn managers are per thread.
if (parent_.accumulated_requests_ >= parent_.config_.maxRequestsPerConnection()) {
  parent_.read_callbacks_->connection().readDisable(true);
  parent_.requests_overflow_ = true;
can we add a stat for this and increment it here?
EXPECT_EQ(0U, store_.counter("test.response_invalid_type").value());
EXPECT_EQ(1U, store_.counter("test.response_success").value());
EXPECT_EQ(0U, store_.counter("test.response_error").value());
}
Can we also test:
- limit set but not reached
- limit set, reached, cleared and next request goes through
And also test the suggested stat from above is properly incremented in each case.
rgs1 left a comment:
Thanks! We are going to use this too, so excited to see this feature.
Could you also add a comment about this new feature in docs/root/configuration/listeners/network_filters/thrift_proxy_filter.rst?
cc: @fishcakez to get additional feedback
/lgtm api
Signed-off-by: wbpcode <comems@msn.com>
Signed-off-by: wbpcode <comems@msn.com>
Signed-off-by: wbpcode <comems@msn.com>
docs/root/configuration/listeners/network_filters/thrift_proxy_filter.rst (outdated, resolved)
@@ -1,5 +1,6 @@
#pragma once

#include <cstdint>
👍 yes, it is not needed. I just forgot to remove it.
TimeSource& time_source_;

// The maximum number of requests remaining to be processed on the current connection.
uint64_t remaining_streams_{};
Can we do this the other way around? Having a counter of seen requests and the limit in a separate var? E.g.:
uint64_t requests_{};
uint64_t max_requests_{};
If max_requests_ is 0, there are no limits. I'd rather avoid depending on std::numeric_limits<uint64_t>::max().
I have no objection. The reason for choosing the current method was to save one comparison: with the suggested method, we first need to check whether max_requests_ is 0, and then check whether the accumulated request count has reached max_requests_.
for (size_t i = 0; i < 4; i++) {
  mock_new_connection();
  EXPECT_EQ(5, sendSomeThriftRequest(6));
}
This simulates multiple disconnections so that we can observe whether the stat value grows correctly.
rgs1 left a comment:
Thanks for the updates, left a few more comments.
Signed-off-by: wbpcode <comems@msn.com>
}

// Return the number of requests actually sent.
uint32_t sendSomeThriftRequest(uint32_t request_number) {
nit: uint32_t sendRequests(uint32_t count) {
EXPECT_EQ(0U, store_.counter("test.response_exception").value());
EXPECT_EQ(0U, store_.counter("test.response_invalid_type").value());
EXPECT_EQ(50U, store_.counter("test.response_success").value());
EXPECT_EQ(0U, store_.counter("test.response_error").value());
should we check here if cx_destroy_local_with_active_rq was incremented? I think it probably is incremented when a close happens after max requests is reached.
rgs1 left a comment:
This looks great, thanks! I have one final nit and one question about whether we should check if cx_destroy_local_with_active_rq is incremented after a local connection close. Other than that, LGTM.
(I also added some comments previously and deleted them, since I realized I was wrong)
The linux multiarch failure is transient; it happened on other PRs too.
@rgs1 The advantage of this approach is that when the connection is disconnected, we don't have to worry about requests over the limit being reset and introducing idempotency problems, because requests that exceed the limit will not be forwarded by the thrift proxy at all.
Signed-off-by: wbpcode <comems@msn.com>
Ok, makes sense.

You'll want to merge main once #15238 lands.

@rgs1 main merged.
Signed-off-by: wbpcode <comems@msn.com>
Commit Message: thrift: supported setting max requests for per downstream connection
Additional Description:
Supports setting the maximum number of requests per downstream connection. By setting max_requests_per_connection, Envoy can actively disconnect the thrift client after it reaches the configured number of requests.
Due to limitations of the thrift protocol itself, we cannot achieve a clean disconnection.
See #14560 for more information.
Risk Level: Normal
Testing: Added
Docs Changes: N/A
Release Notes: Added