Locality LB failover api #760
rshriram left a comment
This looks fine to me. I will clean up the language in a later PR. @louiscryan thoughts? By default, we fail over from one subzone to all subzones in the same zone, from one zone to all zones in the same region, and then to all regions. This is the expected behavior for most setups. Explicit failover to a specific region lets people keep traffic within the same country if desired.
Load distribution is a different use case: the user wants to spread load across different zones or regions. This is not the same as weighted routing, which implies a traffic shift for code changes. Locality load distribution is for spreading load to other zones under manual control.
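To make the load distribution case concrete, here is a minimal sketch of splitting a batch of requests across localities by manually assigned weights. The function name `split_requests` and the locality names are hypothetical and are not part of the proposed API; this only illustrates the idea of spreading load by weight.

```python
from typing import Dict


def split_requests(total: int, weights: Dict[str, int]) -> Dict[str, int]:
    """Split `total` requests across localities in proportion to `weights`."""
    weight_sum = sum(weights.values())
    if weight_sum <= 0:
        raise ValueError("weights must sum to a positive number")

    # Integer share for each locality, rounded down.
    shares = {loc: total * w // weight_sum for loc, w in weights.items()}

    # Hand out the requests lost to rounding, largest remainder first.
    leftover = total - sum(shares.values())
    by_remainder = sorted(
        weights,
        key=lambda loc: (total * weights[loc]) % weight_sum,
        reverse=True,
    )
    for loc in by_remainder[:leftover]:
        shares[loc] += 1
    return shares


# Hypothetical localities: keep 80% of the load local, send 20% elsewhere.
print(split_requests(100, {"region1/zone1": 80, "region1/zone2": 20}))
```

Unlike weighted routing between service versions, nothing here depends on a code change; the weights only say where the same workload's traffic should land.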
// If duplicated settings are present, then the first one will take effect.
repeated LocalityWeightSetting locality_weight_settings = 3;
// Locality load balancing settings.
message LocalitySetting{
Can we move this entire thing into mesh/v1alpha1/config.proto?
You mean a mesh-wide config instead of per service?
Yep, mesh-wide. This API is very new and we don't know what customers want, so let's keep it in mesh/v1alpha1/config.proto so that we can change it if need be.
+1, let's move this to mesh config.
// localitySettings:
//   failover:
//   - from: region1
//     to: region2
Is it possible to configure failover after X number of retries?
No, this is load-balancing behavior and is not related to retries. Conversely, a retry may rely on the LB policy to choose which host to try next.
The plan is to implement this by assigning endpoints to different priorities (https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/load_balancing/priority).
6ed6e63 to f0992c2
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: hzxuzhonghu, rshriram
The failover can be implemented in Envoy by placing the local locality's endpoints in priority 0 and the failover locality's endpoints in priority 1.
In fact, whenever we specify locality routing with failover, we also have to specify outlier detection, because outlier detection is what detects that the local endpoints have failed and shifts traffic to priority 1.
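A rough sketch of why outlier detection is needed for this to work: traffic only moves to priority 1 once the priority 0 hosts are marked unhealthy. The names below (`Endpoint`, `pick_priority`) are hypothetical, and the `healthy` flag simply stands in for the outcome of outlier detection. This is a deliberately simplified all-or-nothing model; Envoy itself shifts traffic between priority levels gradually as the healthy fraction of a level drops.

```python
from typing import Dict, List, Optional, Tuple

# Each endpoint is (address, healthy). "healthy" stands in for the result of
# outlier detection; False means the host has been ejected.
Endpoint = Tuple[str, bool]


def pick_priority(levels: Dict[int, List[Endpoint]]) -> Optional[int]:
    """Return the lowest priority level that still has a healthy endpoint."""
    for priority in sorted(levels):
        if any(healthy for _, healthy in levels[priority]):
            return priority
    return None  # no healthy endpoints at any level


levels = {
    0: [("10.0.0.1", True), ("10.0.0.2", True)],  # local locality
    1: [("10.1.0.1", True)],                      # failover locality
}
print(pick_priority(levels))  # local endpoints are healthy

# Outlier detection ejects every local host, so traffic moves to priority 1.
levels[0] = [("10.0.0.1", False), ("10.0.0.2", False)]
print(pick_priority(levels))
```

Without anything flipping the health state of the priority 0 hosts, the second call would never return 1, which is the point made above.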
By default, if neither failover nor distribute is specified, the failover policy is:
If endpoints in the same subzone fail, we fail over to all endpoints in the same zone. If the zone fails, we fail over to all endpoints in the same region. If the region fails, we look at the destination rule to see whether a specific failover region is set. If no failover region is specified, we fail over to all endpoints in all regions.
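The default policy above could be expressed as a priority assignment per endpoint, roughly as in the sketch below. `Locality` and `failover_priority` are hypothetical names, not the actual implementation. One assumption beyond what is stated above: when an explicit failover region is set, endpoints in all other regions are placed at a final, lowest priority rather than dropped.

```python
from typing import NamedTuple, Optional


class Locality(NamedTuple):
    region: str
    zone: str
    subzone: str


def failover_priority(
    local: Locality,
    endpoint: Locality,
    failover_region: Optional[str] = None,
) -> int:
    """Return the priority level for an endpoint (0 is tried first)."""
    if endpoint.region == local.region:
        if endpoint.zone == local.zone:
            if endpoint.subzone == local.subzone:
                return 0  # same subzone
            return 1  # same zone, different subzone
        return 2  # same region, different zone
    if failover_region is None or endpoint.region == failover_region:
        return 3  # all regions, or the explicit failover region
    # Assumption: remaining regions come last when a failover region is set.
    return 4


local = Locality("region1", "zone1", "subzone1")
print(failover_priority(local, Locality("region1", "zone2", "subzone1")))
print(failover_priority(local, Locality("region2", "zone1", "subzone1"), "region2"))
```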