Skip to content

Locality LB failover api#760

Merged
istio-testing merged 3 commits intoistio:release-1.1from
hzxuzhonghu:locality-lb-api
Jan 21, 2019
Merged

Locality LB failover api#760
istio-testing merged 3 commits intoistio:release-1.1from
hzxuzhonghu:locality-lb-api

Conversation

@hzxuzhonghu
Copy link
Member

@hzxuzhonghu hzxuzhonghu commented Jan 15, 2019

the failover can be implemented in Envoy by adding local locality endpoints in priotity 0, and failover locality endpoints in priority 1.

actually, whenever we specify locality routing with failover, have to specify outlier detection. because outlier detection will detect when local endpoints have failed and proxy traffic to priority 1.

By default, if no failover or distribute specified, The default failover policy is:
if endpoints in same subZone fails, we failover to all endpoints in same zone. if zone fails, we failover to all endpoints in same region. if region fails, we look at the destination rule to see if there is a specific failover region. If no failover region is specified, then we failover to all endpoints in all regions.

@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Jan 15, 2019
@costinm costinm requested a review from louiscryan January 15, 2019 21:11
Copy link
Member

@rshriram rshriram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks fine by me. I will cleanup the language in a later PR. @louiscryan thoughts? The defaults are such that we failover from one subzone to all subzones in same zone, one zone to all zones in same region, and then to all regions.. This is the typical expected behavior for most setups. The explicit failover to a specific region allows people to maintain traffic within the same country if desired.

The load distribution is a different use case that says the user would like to spread the load across different zones/regions. This is not the same as weighted routing as it implies a traffic shift for code changes. locality load distribution is more for spreading load (manual control) to other zones.

// If duplicated settings are present, then the first one will take effect.
repeated LocalityWeightSetting locality_weight_settings = 3;
// Locality load balancing settings.
message LocalitySetting{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can we move this entire thing into mesh/v1alpha1/config.proto

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You mean a mesh-wide config instead of per service?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. mesh wide.. Its because this API is very new and we dont know what customers want. So lets keep it in mesh/v1alpha1/config.proto -- so that we can change it if need be

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 lets move this to mesh config

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

// localitySettings:
// failover:
// - from: region1
// to: region2
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is it possible to configure failover after X number of retries?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, this is lb stuff, not related to retry. On converse, retry may base on lb policy to choose which new host to try.
(https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/load_balancing/priority), plan to implement this by setting different priorities.

@linsun linsun requested a review from frankbu January 18, 2019 19:55
Copy link
Member

@rshriram rshriram left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

/lgtm

@istio-testing
Copy link
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu, rshriram

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@istio-testing istio-testing merged commit 6286ff0 into istio:release-1.1 Jan 21, 2019
@louiscryan
Copy link
Contributor

Updated

@hzxuzhonghu hzxuzhonghu deleted the locality-lb-api branch January 24, 2019 01:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. review/done

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants