Skip to content

Optimize AsyncClientManager by moving hashing to control plane#29199

Merged
ggreenway merged 9 commits intoenvoyproxy:mainfrom
DiazAlan:AsyncClientManagerOptimization
Sep 6, 2023
Merged

Optimize AsyncClientManager by moving hashing to control plane#29199
ggreenway merged 9 commits intoenvoyproxy:mainfrom
DiazAlan:AsyncClientManagerOptimization

Conversation

@DiazAlan
Copy link
Copy Markdown
Contributor

@DiazAlan DiazAlan commented Aug 23, 2023

Commit Message: Optimize AsyncClientManager by moving hashing to control plane
Additional Description: This change removes the hashing of the config away from the data plane on every request to the control plane when the filter callback is created. This is accomplished by creating a wrapper object for the config which precomputes the hash and also serializes it to string ahead of time so that calls to the AsyncClientManager with that wrapper can avoid unnecessary overhead.
Risk Level: Low
Testing: Changed some ext_authz tests to account for changes
Added a benchmark test that observed the following improvement
Benchmark Time CPU Iterations
testGetOrCreateAsyncClientWithConfig 27199 us 27182 us 1
testGetOrCreateAsyncClientWithHashConfig 15943 us 15943 us 1

Signed-off-by: AlanDiaz <diazalan@google.com>
@repokitteh-read-only
Copy link
Copy Markdown

As a reminder, PRs marked as draft will not be automatically assigned reviewers,
or be handled by maintainer-oncall triage.

Please mark your PR as ready when you want it to be reviewed!

🐱

Caused by: #29199 was opened by DiazAlan.

see: more, trace.

Signed-off-by: AlanDiaz <diazalan@google.com>
@DiazAlan
Copy link
Copy Markdown
Contributor Author

@ggreenway This is a follow up from what we talked about on 28272. This PR wraps the precomputed hash and config and sends that to the async client manager. I haven't changed the other filters yet as I want to get feedback on the initial implementation of this. If this looks good and passes, I'll make a follow up PR for the other filters.

Signed-off-by: AlanDiaz <diazalan@google.com>
Copy link
Copy Markdown
Contributor

@pradeepcrao pradeepcrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for working on this Alan!

const envoy::config::core::v3::GrpcService& config, Stats::Scope& scope,
bool skip_cluster_check) {
RawAsyncClientSharedPtr client = raw_async_client_cache_->getCache(config);
const GrpcServiceHashKeyWrapper& key_wrapper = GrpcServiceHashKeyWrapper(config);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a reference to a temporary object here.

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, what do you think about replacing key_wrapper with config_with_hash_key?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Also changed GrpcServiceHashKeyWrapper to GrpcServiceConfigWithHashKey.

private:
void evictEntriesAndResetEvictionTimer();
struct CacheEntry {
CacheEntry(const envoy::config::core::v3::GrpcService& config,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not just have CacheEntry accept a GrpcServiceHashKeyWrapper instead of a GrpcService?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

MessageUtil>
lru_map_;
LruList lru_list_;
absl::flat_hash_map<GrpcServiceHashKeyWrapper, LruList::iterator> lru_map_wrapper_;
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: maybe keep the old name (lru_map_)? The new name made more sense when you had 2 maps in your old implementation.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Stats::Scope& scope, bool skip_cluster_check) PURE;

virtual RawAsyncClientSharedPtr
getOrCreateRawAsyncClientWithWrapper(const GrpcServiceHashKeyWrapper& grpc_service,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Once you've made the change to all filters, it'll be good to deprecate getOrCreateRawAsyncClient without the wrapper, and change the name of this method to getOrCreateRawAsyncClient. Maybe add a comment to that effect?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also just added a comment to the new getOrCreateRawAsyncClient that better describes it's use.


friend bool operator==(const GrpcServiceHashKeyWrapper& lhs,
const GrpcServiceHashKeyWrapper& rhs) {
return Protobuf::util::MessageDifferencer::Equivalent(lhs.config_, rhs.config_);
Copy link
Copy Markdown
Contributor

@pradeepcrao pradeepcrao Aug 24, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is scope for optimisation here. Before the call Protobuf::util::MessageDifferencer::Equivalent, short circuit and return false if pre_computed_hash_ values are different.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

AlanDiaz added 2 commits August 24, 2023 15:39
Signed-off-by: AlanDiaz <diazalan@google.com>
Signed-off-by: AlanDiaz <diazalan@google.com>
@DiazAlan DiazAlan marked this pull request as ready for review August 24, 2023 19:46
@DiazAlan DiazAlan requested a review from ggreenway as a code owner August 24, 2023 19:46
@RyanTheOptimist
Copy link
Copy Markdown
Contributor

/assign @pradeepcrao

@DiazAlan
Copy link
Copy Markdown
Contributor Author

/retest

Copy link
Copy Markdown
Contributor

@pradeepcrao pradeepcrao left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great Alan!

Could you please add a unit test for GrpcServiceConfigWithHashKey? Also, I feel it might be easier (and better) to write a performance benchmark that compares getOrCreateRawAsyncClientWithHashKey with getOrCreateRawAsyncClient than try to load test this.


class GrpcServiceConfigWithHashKey {
public:
GrpcServiceConfigWithHashKey(const envoy::config::core::v3::GrpcService& config)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: Please mark constructor as explicit.

@phlax
Copy link
Copy Markdown
Member

phlax commented Aug 29, 2023

@DiazAlan i think this may be waiting for further iteration

/wait

Signed-off-by: AlanDiaz <diazalan@google.com>
Signed-off-by: AlanDiaz <diazalan@google.com>
@phlax
Copy link
Copy Markdown
Member

phlax commented Sep 1, 2023

/retest

ggreenway
ggreenway previously approved these changes Sep 5, 2023
Copy link
Copy Markdown
Member

@ggreenway ggreenway left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! Thanks!

Signed-off-by: AlanDiaz <diazalan@google.com>
@DiazAlan
Copy link
Copy Markdown
Contributor Author

DiazAlan commented Sep 5, 2023

Made a benchmark test to see the difference. Each test using getOrCreateRawAsyncClient 1000 times for each implementation. Here were the results.
Benchmark Time CPU Iterations
testGetOrCreateAsyncClientWithConfig 27199 us 27182 us 1
testGetOrCreateAsyncClientWithHashConfig 15943 us 15943 us 1

Signed-off-by: AlanDiaz <diazalan@google.com>
@ggreenway ggreenway enabled auto-merge (squash) September 5, 2023 17:18
@DiazAlan
Copy link
Copy Markdown
Contributor Author

DiazAlan commented Sep 5, 2023

/retest

1 similar comment
@DiazAlan
Copy link
Copy Markdown
Contributor Author

DiazAlan commented Sep 6, 2023

/retest

@ggreenway ggreenway merged commit 20676d7 into envoyproxy:main Sep 6, 2023
@DiazAlan DiazAlan deleted the AsyncClientManagerOptimization branch September 7, 2023 19:08
@DiazAlan DiazAlan mentioned this pull request Sep 8, 2023
ggreenway pushed a commit that referenced this pull request Sep 11, 2023
Change API call on ext_proc to more efficient new implementation. Following the PR #29199, we are replacing the API call in ext_proc so it's not hashing the whole config proto on each request.

Signed-off-by: AlanDiaz <diazalan@google.com>
ashishb-90 pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023
Change API call on ext_proc to more efficient new implementation. Following the PR envoyproxy#29199, we are replacing the API call in ext_proc so it's not hashing the whole config proto on each request.

Signed-off-by: AlanDiaz <diazalan@google.com>
ashishb-90 pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023
…proxy#29199)

This change removes the hashing of the config away from the data plane on every request to the control plane when the filter callback is created. This is accomplished by creating a wrapper object for the config which precomputes the hash and also serializes it to string ahead of time so that calls to the AsyncClientManager with that wrapper can avoid unnecessary overhead.

Signed-off-by: AlanDiaz <diazalan@google.com>
ashishb-90 pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023
…proxy#29199)

This change removes the hashing of the config away from the data plane on every request to the control plane when the filter callback is created. This is accomplished by creating a wrapper object for the config which precomputes the hash and also serializes it to string ahead of time so that calls to the AsyncClientManager with that wrapper can avoid unnecessary overhead.

Signed-off-by: AlanDiaz <diazalan@google.com>
ashishb-90 pushed a commit to solo-io/envoy-fork that referenced this pull request Nov 30, 2023
Change API call on ext_proc to more efficient new implementation. Following the PR envoyproxy#29199, we are replacing the API call in ext_proc so it's not hashing the whole config proto on each request.

Signed-off-by: AlanDiaz <diazalan@google.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants