
Add per service locality weight setting #726

Merged
rshriram merged 6 commits into istio:release-1.1 from hzxuzhonghu:locality-weight
Dec 29, 2018
Conversation

@hzxuzhonghu
Member

@hzxuzhonghu hzxuzhonghu commented Dec 6, 2018

This is to support envoy locality-weighted-load-balancing

@googlebot googlebot added the cla: yes Set by the Google CLA bot to indicate the author of a PR has signed the Google CLA. label Dec 6, 2018
@hzxuzhonghu
Member Author

/assign @rshriram

Member

@rshriram rshriram left a comment


/lgtm

@istio-testing
Collaborator

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: hzxuzhonghu, rshriram

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details: Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@rshriram
Member

@hzxuzhonghu sorry I missed this. This looks good.

@rshriram rshriram merged commit 08a19da into istio:release-1.1 Dec 29, 2018
@costinm
Contributor

costinm commented Dec 29, 2018

Some comments on how this will be used?
My understanding is that the weights come from the endpoints, based on load or other workload-specific info, and are just enabled in the cluster config.

Can you explain a bit what kind of Envoy config will be generated, and how this can even be implemented? It's clear we can't generate the 80% (or whatever) split ourselves; it's based on load info (which we may be able to get), endpoint health, etc.

Is there a doc about this (on how it impacts Istio)?

And please, if an API change has not been approved by the WG/TOC, please add some doc
making it clear this is a proposed/experimental API and not covered by the 'beta' guarantee. It'll be
pretty hard to separate otherwise.

@hzxuzhonghu hzxuzhonghu deleted the locality-weight branch December 29, 2018 07:13
@costinm
Contributor

costinm commented Dec 29, 2018 via email


@rshriram
Member

This requirement came from folks at Intuit and Atlassian: the ability to specify the amount of load sent to each region, or zero load to a region.

Weights are not auto-adjusted based on the number of endpoints per region or endpoint load. It's purely human-driven, with company-level policies that determine how much traffic needs to be spilled over to the remote region (0 or 100%). In other words, people wanted control over how traffic gets spilled to the remote regions.
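As a rough illustration of that manual control, a per-service split in a DestinationRule might look like the sketch below. The field names and locality strings are hypothetical, drawn from the general shape discussed in this thread rather than a final API confirmed here:

```yaml
# Hypothetical sketch of a per-service locality weight setting;
# field names, host, and weights are illustrative only.
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: reviews-locality
spec:
  host: reviews.default.svc.cluster.local
  trafficPolicy:
    loadBalancer:
      localityLbSetting:
        distribute:
        - from: us-east/zone1/*
          to:
            "us-east/zone1/*": 100   # keep all traffic local for this source
        - from: us-west/zone2/*
          to:
            "us-west/zone2/*": 80
            "us-east/zone1/*": 20    # spill 20% to the remote region
```

The `from` clause is what makes this source-aware: different client localities can get different splits, which a plain subset-based traffic split cannot express.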

@costinm
Contributor

costinm commented Dec 31, 2018 via email

@hzxuzhonghu
Member Author

Locality weighted load balancing is configured by setting locality_weighted_lb_config in the cluster configuration and providing weights in LocalityLbEndpoints via load_balancing_weight.

The implementation is simple.

How is it different from the existing split, with labels for zone/region?

Actually, I think the existing split does not take effect: we don't set the config the way Envoy expects.

@costinm
Contributor

costinm commented Jan 2, 2019 via email

@hzxuzhonghu
Member Author

If the workloads are labeled to reflect zone/etc., why wouldn't
the destination rule split satisfy the requirement? I assume it's because
you want different splits by source; is this something we
want for traffic split/DestinationRule in general?

That's right. As you said, DestinationRule cannot achieve source-based routing. We may have workloads residing in multiple regions/zones that access workloads in other regions/zones. We need to control traffic based on both the region/zone and the number of workloads within it.

Some details?

The Envoy docs require setting Cluster.CommonLbConfig.LocalityWeightedLbConfig combined with endpoint.LocalityLbEndpoints.load_balancing_weight. Currently LocalityWeightedLbConfig is more like a flag (https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/cds.proto#cluster-commonlbconfig-localityweightedlbconfig), and load_balancing_weight has already been set in previous PRs.
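A minimal sketch of the two Envoy (v2 API) pieces named above; the cluster name, localities, weights, and addresses are illustrative, not taken from this PR:

```yaml
# Piece 1 -- Cluster config: the flag-like message that enables
# locality-weighted load balancing for the cluster.
name: outbound|9080||reviews.default.svc.cluster.local
type: EDS
common_lb_config:
  locality_weighted_lb_config: {}
---
# Piece 2 -- ClusterLoadAssignment: a weight on each LocalityLbEndpoints.
cluster_name: outbound|9080||reviews.default.svc.cluster.local
endpoints:
- locality: {region: us-east, zone: zone1}
  load_balancing_weight: 80        # 80% of traffic stays in this locality
  lb_endpoints:
  - endpoint:
      address:
        socket_address: {address: 10.0.0.1, port_value: 9080}
- locality: {region: us-west, zone: zone2}
  load_balancing_weight: 20        # 20% spilled to the remote locality
  lb_endpoints:
  - endpoint:
      address:
        socket_address: {address: 10.0.1.1, port_value: 9080}
```

With the flag set, Envoy divides traffic between localities by these weights instead of purely by endpoint count.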

@costinm
Contributor

costinm commented Jan 2, 2019 via email

@rshriram
Member

rshriram commented Jan 2, 2019

You are overthinking the problem. Envoy has all the knobs required to do differential load balancing between endpoints in the same cluster. We have had locality-aware routing (though non-functional) for a while. All this does is specify what portion of traffic stays local vs. what gets shed, based on user-specified parameters. This has no dependency on knowing what endpoints exist in the remote cluster (i.e. Pilot talking to remote API servers), nor any dependency on load computation based on the endpoints.

Forcing users to create different versions of the same binary just because they are in different regions is not a good strategy (it gets complicated when you want to do real version routing).

Besides, the main goal of doing this (the way it's done) is so that we can define a top-level DestinationRule (*.svc.cluster.local) that specifies how traffic should be distributed across regions. Everything else will inherit it -- this is how destination rules work today.

The reason I didn't insist on a doc is that the impact is scoped to just the specific use case we are tackling (for Intuit/Atlassian etc., who wanted a manual override). If not specified, traffic distribution will be the same as what exists today. There is nothing here that prevents you from using the Google-internal EDS load assignment servers that compute endpoint load across clusters in different regions and assign weights. This is literally a basic manual override, and it actually implements the AZ-aware load balancing we have been claiming for a while.

@linsun
Member

linsun commented Jan 2, 2019

How does the Envoy sidecar know the region and zone info within an Istio environment? I don't recall setup instructions for this.

Also, does this new traffic policy let users configure the most common use case @costinm outlined earlier? I can send 100% to the local region/zone, but how do I configure the fallback?

All the configs I've seen want as much traffic as possible to stay local, then fall back to zones with extra capacity in order of latency.

@rshriram
Member

rshriram commented Jan 2, 2019

ServiceEntry has a locality field, and you can add region/AZ annotations to Kubernetes nodes. We have been parsing these values for two years, though not using them. Basically no user config: just make sure cluster nodes are annotated per the standard Kubernetes docs.

In terms of the fallback, the functionality exists in a convoluted way in Envoy, and it works only if active health checking or outlier detection is enabled, or retries with priority levels are enabled. This needs more work in Envoy. Alternatively, you can write your own automation to change the values in the destination rule during such incidents.
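For reference, a sketch of the node metadata involved, assuming the standard failure-domain topology labels Kubernetes used at the time; the node name and values are illustrative:

```yaml
apiVersion: v1
kind: Node
metadata:
  name: worker-1                       # illustrative node name
  labels:
    # Well-known topology labels (beta names, circa k8s 1.13);
    # Pilot can derive an endpoint's locality from its node's labels.
    failure-domain.beta.kubernetes.io/region: us-east
    failure-domain.beta.kubernetes.io/zone: us-east-1a
```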

@linsun
Member

linsun commented Jan 3, 2019

Thank you @rshriram for the clarification!
