add all host map to priority set for fast host searching#17290

Merged
snowp merged 15 commits into envoyproxy:main from wbpcode:host_map
Aug 17, 2021
Conversation

@wbpcode
Member

@wbpcode wbpcode commented Jul 12, 2021

Signed-off-by: wbpcode wbphub@live.com

Commit Message: add all host map to priority set for fast host searching
Additional Description:
Add an all-hosts map to the main thread's priority set and then sync it to the worker threads. This host map will be used for fast host searching.

This is one of the most critical tasks related to #16698. See #16698 and #17242 for more background info.

**This PR is still under active development, and related tests have not been added yet. It is only intended for early review.**

Risk Level: Low.
Testing: Wait.
Docs Changes: N/A.
Release Notes: N/A.

Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode
Member Author

wbpcode commented Jul 12, 2021

@mattklein123 Before proceeding with this work, I hope you can review the feasibility of this solution. 😃

An all-hosts map is maintained in the priority set. On the main thread, I added an additional mutable all-hosts map to assist in updating it. When the host set changes, this approach incurs some extra memory copies on the main thread, but it ensures that no locks are needed at all when the all-hosts map is shared by multiple threads (on the worker threads, there is not even a need for a redundant branch check).

Signed-off-by: wbpcode <wbphub@live.com>
Member

@mattklein123 mattklein123 left a comment

Thanks for working on this. One question to get started.

/wait-any

}
});
// Update read only all host map in worker threads.
cluster_manager->thread_local_clusters_[info->name()]->priority_set_.setReadOnlyAllHostMap(map);
Member

Why can't this map just be calculated on the main thread and then passed by shared_ptr to all of the workers similar to how everything else is passed in the updateClusterMembership() call? It's unclear to me why this needs to be special cased like this?

Member Author

In an aggregated update, updateClusterMembership() is executed multiple times (once per priority), but the host map only needs to be updated once. It would not be a problem to add it to updateClusterMembership(); the pointer to the host map would just be set several times redundantly.

Member

it will only be repeated several times to set the pointer of the host map.

As long as the copy doesn't happen multiple times I don't think this matters? My preference would be to keep the code as simple as possible unless we have a known perf issue. This code is already super complicated.

Member Author

@wbpcode wbpcode Jul 14, 2021

This code is already super complicated.

Yes, it is. The code is complex enough; it took me a long time to understand the source code of ClusterManager and EDS.
I will modify and improve this PR based on your suggestions.

Comment on lines +620 to +623
// Copy old read only host map to mutable host map.
mutable_all_host_map_ = read_only_all_host_map_ != nullptr
? std::make_shared<HostMap>(*read_only_all_host_map_)
: std::make_shared<HostMap>();
Contributor

Would this be prohibitive for large clusters? Even if just one host changes, we end up doing an O(n) copy of this map.

Speaking of large clusters, since this would add both space and computational overhead to large clusters, should we make this configurable or only enabled when actually used in some way?

Member

I think we are already paying this copy cost for all the other data structures, right? (Not that it's good to add yet another one.). One option for this particular structure would be to use R/W locks and then we wouldn't need to actually do any copies on it.

Contributor

I guess we do pay this when processing the EDS update as a whole, so we wouldn't make it that much worse.

I was originally thinking in terms of the update callbacks, most (all?) of which operate on the per priority level, so a change in P0 won't trigger updates to P1, P2, ...

Member

Yeah, good point re: per-priority, though I'm not sure how much that matters in practice for most users. I think though that as long as we can replace the EDS private one with this one it's probably a wash?

Contributor

Yep if this lets us replace the all_hosts_ map then I think we're good

Member Author

@wbpcode wbpcode Jul 13, 2021

Yep if this lets us replace the all_hosts_ map then I think we're good

@snowp Yep. This will replace the all_hosts_ map in EDS. This may still bring additional overhead, because each priority may be updated, causing the callback to be executed multiple times. In the original EDS, because batchUpdate was used, all_hosts_ would only be copied once. But this problem can be alleviated by setting merge_timeout.

Another way to avoid the extra overhead is to use an additional memberUpdateCallback to synchronize read_only_all_host_map_ to the worker threads instead of directly reusing postThreadLocalClusterUpdate. This ensures that when batchUpdate is used, there is at most one map copy. I think this is also a point worth discussing.

Member Author

@wbpcode wbpcode Jul 13, 2021

I think we are already paying this copy cost for all the other data structures, right? (Not that it's good to add yet another one.). One option for this particular structure would be to use R/W locks and then we wouldn't need to actually do any copies on it.

@mattklein123 I have considered using R/W locks, but I don't want to increase the cost of acquiring an R lock on every request for the sake of occasional membership updates.

Signed-off-by: wbpcode <wbphub@live.com>
Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode wbpcode changed the title from "[WIP] add all host map to priority set for fast host searching" to "add all host map to priority set for fast host searching" Jul 29, 2021
@wbpcode
Member Author

wbpcode commented Jul 29, 2021

I think this PR is ready for a review. @mattklein123 😄

@wbpcode
Member Author

wbpcode commented Jul 29, 2021

The GCC CI failed on the cluster memory consumption test. But I just added two smart pointers and a callback to the priority set. Do we need to increase the memory upper limit in the test to ensure that CI passes? 🤔

GCC CI passed. 😂 It seems that the memory test is not very stable.

wbpcode added 2 commits July 30, 2021 00:58
Signed-off-by: wbpcode <wbphub@live.com>
Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode
Member Author

wbpcode commented Jul 30, 2021

/retest

@repokitteh-read-only

Retrying Azure Pipelines:
Retried failed jobs in: envoy-presubmit


Member

@mattklein123 mattklein123 left a comment

Thanks in general this looks like it's on the right track. I have a few comments and I would definitely like @snowp to take a look. Thank you!

/wait

overprovisioning_factor);

// Update read only all host map in worker threads.
cluster_entry->priority_set_.setReadOnlyAllHostMap(read_only_host_map);
Member

Can this just be folded into the updateHosts() call?

Comment on lines +59 to +62
// When the configuration contains duplicate hosts, only the first one will be retained.
if (all_new_hosts.count(address->asString()) > 0) {
continue;
}
Member

Is this an unrelated bug fix? Or is this somehow required with this new code?

Comment on lines +130 to +132
if (all_new_hosts.count(address->asString()) > 0) {
continue;
}
Member

Same question?

Comment on lines -1491 to -1477
if (updated_hosts.count(host->address()->asString())) {
continue;
}
Member

I see. Is this basically just moved to a new location?

Member Author

@wbpcode wbpcode Jul 31, 2021

Yep. I removed updated_hosts, so to ensure that the existing behavior is not changed, I need to de-duplicate the hosts in the other locations.

Contributor

There is a slight difference in doing the bookkeeping here vs the new location in that doing it here includes endpoints in PENDING_DYNAMIC_REMOVAL

That said I don't think I can come up with a case where that matters after https://github.com/envoyproxy/envoy/pull/14483/files, so this is probably fine

Comment on lines +566 to +581
// PrioritySet
const HostMapConstSharedPtr& readOnlyAllHostMap() const override {
// Check if the host set in the main thread PrioritySet has been updated.
if (mutable_all_host_map_ != nullptr) {
read_only_all_host_map_ = std::move(mutable_all_host_map_);
ASSERT(mutable_all_host_map_ == nullptr);
}
return read_only_all_host_map_;
}

// PrioritySetImpl
void setReadOnlyAllHostMap(HostMapConstSharedPtr) override {
// Using an external host map to update the read_only_all_host_map_ is not allowed in
// MainPrioritySetImpl.
NOT_IMPLEMENTED_GCOVR_EXCL_LINE;
}
Member

Per my other comment I wonder if this could be removed from the interface and just be part of the implementation?

Member Author

@wbpcode wbpcode Jul 31, 2021

OK, we can make the interface simpler. 🤔

Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode wbpcode requested a review from snowp August 2, 2021 09:24
Contributor

@snowp snowp left a comment

Thanks this change makes sense to me, just a few smaller comments

HostMapConstSharedPtr host_map = cm_cluster.cluster().prioritySet().crossPriorityHostMap();

tls_.runOnAllThreads([info = cm_cluster.cluster().info(), params = std::move(params),
add_or_update_cluster, load_balancer_factory, map = std::move(host_map)](
Contributor

nit: Maybe call it host_map within the lambda as well? map seems a bit too generic

Member Author

Got it.

LocalityWeightsConstSharedPtr locality_weights, const HostVector& hosts_added,
const HostVector& hosts_removed, uint64_t overprovisioning_factor) {
const HostVector& hosts_removed, uint64_t overprovisioning_factor,
const HostMapConstSharedPtr& cross_priority_host_map) {
Contributor

I think usually we'd pass a shared ptr by value and std::move it to avoid copies where appropriate

Member Author

Got it.

Comment on lines 45 to 48
// All host map for current resolve target. When we have multiple resolve targets, multiple
// targets may contain two different host objects with the same address. This host map cannot be
// replaced by the read only all host map in the priority set.
HostMap all_hosts_;
Contributor

Is there anything that would prevent someone from using the shared map for strict DNS clusters? Seems like it would not be valid?

Member Author

The shared map still works for strict DNS clusters. However, we cannot use the shared map to replace this all_hosts_.

The strict DNS cluster de-duplicates hosts within a resolve target, but if multiple resolve targets contain hosts with the same address, those hosts coexist. As a global host map, the shared map can neither keep two different hosts with the same address for two different resolve targets nor preserve the strict DNS cluster's existing logic unchanged. So the shared map cannot replace all_hosts_ (note that all_hosts_ here is a member of the resolve target).

Contributor

I guess my concern would be around people trying to implement sticky LB for strict DNS but running into weird edge cases: suppose they use regular LB to select an endpoint from the STRICT_DNS cluster and want to pin this LB selection.

This works fine if there is only one host per ip within the cluster, but if this is configured with multiple resolve targets that resolve to the same ip, some of which have different metadata, you might run into a situation where the first LB selects a different endpoint than what the shared map gives you when you ask for the IP

I don't think we need to fix this, but perhaps some documentation around this limitation with the strict dns cluster would be good?

Member Author

@wbpcode wbpcode Aug 3, 2021

This works fine if there is only one host per ip within the cluster, but if this is configured with multiple resolve targets that resolve to the same ip, some of which have different metadata, you might run into a situation where the first LB selects a different endpoint than what the shared map gives you when you ask for the IP

Yes, this is a known limitation. I will add some new comments to illustrate this problem.


void batchHostUpdate(BatchUpdateCb& callback) override;

const HostMapConstSharedPtr& crossPriorityHostMap() const override {
Contributor

Should this return by value? Seems a bit safer so nothing stores this by ref and tries to access it?

Member Author

I think if someone needs to store it, a copy of the shared pointer should be created from this ref instead of storing the reference directly. 🤔

As I see it, this method has two main usage scenarios:

  • First, in the main thread, we use this method to get the latest host map. At this point, as the code shows, I create a copy of the shared pointer.
  • Second, in the worker threads, we obtain a reference to the host map through this method and then perform a host search, without saving the reference. In this case, returning a reference avoids unnecessary atomic operations.

Contributor

So is this in preparation for another usage where we access it without taking a copy from the worker threads? All the usages in this PR seem to make a copy. In general, forcing a copy via the API makes the API a bit safer since it means the caller can't mess up by trying to access the ref later on, so it makes it less likely for bugs to happen later

If we are trying to allow for ref access via the ptr I think we need to add some comments around the lifetime of the returned reference since it is different from the lifetime of the PSet

Another strategy would be to have a std::shared_ptr<T> and a const T& accessor, which lets consumers of the API choose between a call that has very clear ownership semantics and one that requires some special attention

Member Author

So is this in preparation for another usage where we access it without taking a copy from the worker threads?

Yes, it is preparation for Sticky LB.

🤔 Maybe we can return the shared pointer directly. As long as we don't call this method frequently, there won't be too many atomic operations.


/**
* Specialized PrioritySetImpl designed for the main thread. It will update and maintain the read
* only all host map when the hosts set changes.
Contributor

host set

Member Author

Got it.

// Check if the host set in the main thread PrioritySet has been updated.
if (mutable_cross_priority_host_map_ != nullptr) {
const_cross_priority_host_map_ = std::move(mutable_cross_priority_host_map_);
ASSERT(mutable_cross_priority_host_map_ == nullptr);
Contributor

This ASSERT seems a bit unnecessary since we null it out on the previous line, but I don't feel that strongly about it

Member Author

@wbpcode wbpcode Aug 2, 2021

Because I need to modify two data members (mutable_cross_priority_host_map_ & const_cross_priority_host_map_) in a const method (crossPriorityHostMap() const), I added ASSERT to ensure that the move is executed correctly.

Comment on lines +2296 to +2297
// Only mutable host map can be updated directly. Read only host map will not be updated before
// 'crossPriorityHostMap' is called.
Contributor

not be updated until

I think we also prefer backticks (`) instead of quotes (') when referencing function names

Member Author

Got it.

mattklein123
mattklein123 previously approved these changes Aug 2, 2021
Member

@mattklein123 mattklein123 left a comment

Thanks this LGTM at a high level. I will defer remaining review to @snowp who I think is better able to review this code than I am.

@mattklein123 mattklein123 removed their assignment Aug 2, 2021
Signed-off-by: wbpcode <wbphub@live.com>
@snowp
Contributor

snowp commented Aug 4, 2021

I would bump the limit. This PR is going to slightly increase the memory usage for static clusters, but not substantially so for EDS clusters (since we're replacing the existing map), which I imagine is the main cluster type used for large deployments.

Thanks for raising!

/wait

@wbpcode
Member Author

wbpcode commented Aug 11, 2021

@snowp So, what should I do next with this PR? 🤔

Signed-off-by: wbpcode <wbphub@live.com>
@snowp
Contributor

snowp commented Aug 11, 2021

I'd suggest increasing the limits just enough to make this PR pass consistently

@wbpcode
Member Author

wbpcode commented Aug 12, 2021

I'd suggest increasing the limits just enough to make this PR pass consistently

Oh, I see. It seems that I misunderstood the meaning of your last comment. 🤣

@wbpcode
Member Author

wbpcode commented Aug 12, 2021

I executed the memory consumption case 500 times and got 1000 results. Here are the statistical results.

Memory   Count(s)
<40000    509
40051      234
40133      135
40215      69
40297      33
40379      15
40543      3
40624      2

@snowp @mattklein123 From the results, the fluctuation range is very large. Finally, I adjusted the limit to 40650 to ensure that CI will not be affected in most cases.

Signed-off-by: wbpcode <wbphub@live.com>
@snowp
Contributor

snowp commented Aug 12, 2021

Do we have any likely explanation for the large fluctuations? Are they coming from this change or is the fluctuation already there and we are just increasing it by a constant amount?

cc @jmarantz as well (for when he's back)

@wbpcode
Member Author

wbpcode commented Aug 13, 2021

Do we have any likely explanation for the large fluctuations? Are they coming from this change or is the fluctuation already there and we are just increasing it by a constant amount?

cc @jmarantz as well (for when he's back)

This PR should not bring such fluctuations. But I can try to perform the same test on main to see the distribution of memory consumption.

@wbpcode
Member Author

wbpcode commented Aug 13, 2021

@snowp It seems that the fluctuation has always been large. Here are some detailed distributions.

Before this PR:

      4 39291
     18 39373
      3 39454
    184 39455
      1 39536
    291 39537
      3 39618
    235 39619
      1 39700
    128 39701
     78 39783
      1 39864
     31 39865
     13 39947
      3 40029
      4 40111
      2 40192

After this PR:

     3 39723
     32 39805
      1 39885
      2 39886
    170 39887
      1 39968
    302 39969
      2 40050
    232 40051
    135 40133
      1 40214
     67 40215
     33 40297
     15 40379
      1 40461
      3 40543
      2 40624

@jmarantz
Contributor

For the record, the fluctuations started with some new version of tcmalloc we pulled in at some point. Prior to that the measurements were always exact.

Thanks for providing the distribution. Looks like bumping up the per-cluster cost by 500 bytes is needed and (I assume) warranted by the benefits of this change.

@wbpcode
Member Author

wbpcode commented Aug 15, 2021

I found a point where we may be able to reduce some of the memory consumption.

Signed-off-by: wbpcode <wbphub@live.com>
@wbpcode
Member Author

wbpcode commented Aug 15, 2021

The latest memory consumption distribution (the latest commit reduces memory consumption by nearly 200-300 bytes per cluster):

     3 39611
     23 39693
    174 39775
      1 39776
      1 39856
    290 39857
      3 39938
    235 39939
      1 40020
    134 40021
     82 40103
     29 40185
     19 40267
      5 40349

@snowp
Contributor

snowp commented Aug 17, 2021

Thanks for reducing the memory footprint! If I understand correctly you removed one of the callbacks and instead reused an existing one? Just trying to make sure I understand :)

@wbpcode
Member Author

wbpcode commented Aug 17, 2021

Thanks for reducing the memory footprint! If I understand correctly you removed one of the callbacks and instead reused an existing one? Just trying to make sure I understand :)

About the same. I removed an unnecessary callback and did what the callback needed to do directly in updateHosts (building the cross-priority host map).

@snowp snowp merged commit 76df95e into envoyproxy:main Aug 17, 2021
mpuncel added a commit to mpuncel/envoy that referenced this pull request Aug 19, 2021
* main:
  Fix for fuzz tests failing due to invalid corpus paths (envoyproxy#17767)
  kafka: fix integration test (envoyproxy#17764)
  Fix typo in cluster.proto (envoyproxy#17755)
  cluster manager: add drainConnections() API (envoyproxy#17747)
  kafka broker filter: move to contrib (envoyproxy#17750)
  quiche: switch external dependency to github (envoyproxy#17732)
  quiche: implement stream idle timeout in codec (envoyproxy#17674)
  Update c-ares to 1.17.2 (envoyproxy#17704)
  Fix dns resolve fuzz bug (envoyproxy#17107)
  Remove members that shadow members of the base class (envoyproxy#17713)
  thrift proxy: missing parts from the previous PR (envoyproxy#17668)
  thrift-proxy: cleanup ConnectionManager::ActiveRpc (envoyproxy#17734)
  listener: extra warning for deprecated use_proxy_proto field (envoyproxy#17736)
  kafka: add support for metadata request in mesh-filter (envoyproxy#17597)
  upstream: add all host map to priority set for fast host searching (envoyproxy#17290)
  Remove the support for `hidden_envoy_deprecated_per_filter_config` (envoyproxy#17725)
  tls: SDS support for private key providers (envoyproxy#16737)
  bazel: update rules_foreign_cc (envoyproxy#17445)

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
dio pushed a commit that referenced this pull request Jan 12, 2022
Introduce a new stateful session extension. This is a change after #17848 #17290. 

In #17290, we added a new cross-priority host map for fast host searching. In #17848, we extend `LoadBalancerContext` interface to provide an override host and to select the upstream host by the override host.

Finally, in this change, we expand a new API to allow users to extract the state of the session from the request and change the result of load balancing by setting the override host value.

Related doc: https://docs.google.com/document/d/1IU4b76AgOXijNa4sew1gfBfSiOMbZNiEt5Dhis8QpYg/edit?usp=sharing.

Risk Level: Medium
Testing: Added
Docs Changes: Added
Release Notes: Added
Platform-Specific Features: N/A.

Signed-off-by: wbpcode <wbphub@live.com>
joshperry pushed a commit to joshperry/envoy that referenced this pull request Feb 13, 2022
Introduce a new stateful session extension. This is a change after envoyproxy#17848 envoyproxy#17290. 

In envoyproxy#17290, we added a new cross-priority host map for fast host searching. In envoyproxy#17848, we extend `LoadBalancerContext` interface to provide an override host and to select the upstream host by the override host.

Finally, in this change, we expand a new API to allow users to extract the state of the session from the request and change the result of load balancing by setting the override host value.

Related doc: https://docs.google.com/document/d/1IU4b76AgOXijNa4sew1gfBfSiOMbZNiEt5Dhis8QpYg/edit?usp=sharing.

Risk Level: Medium
Testing: Added
Docs Changes: Added
Release Notes: Added
Platform-Specific Features: N/A.

Signed-off-by: wbpcode <wbphub@live.com>
Signed-off-by: Josh Perry <josh.perry@mx.com>