Add per service locality weight setting #726
|
/assign @rshriram |
|
[APPROVALNOTIFIER] This PR is APPROVED. This pull request has been approved by: hzxuzhonghu, rshriram |
|
@hzxuzhonghu sorry I missed this. This looks good. |
|
Some comments on how this will be used? Can you explain a bit what kind of Envoy config will be generated, and how this can even be implemented? It's clear we can't generate the 80% (or whatever) split: it's based on load info (which we may be able to get), endpoint health, etc. Is there a doc about this (on how it impacts Istio)? And please, if an API change has not been approved by the WG/TOC, add some doc. |
|
That seems to be an empty proto that tells Envoy to use the load info from EDS.
What I'm trying to figure out is how this DestinationRule translates into an Envoy config that can provide what the API is claiming. Is it going to affect the EDS results? But how?
And secondly: what is the use case for a user to specify "80% to local zone, 20% to remote zone"? I have never seen this use case. All the configs I've seen want as much traffic as possible to stay local, then fall back to zones with extra capacity, in order of latency.
On Fri, Dec 28, 2018 at 10:52 PM Zhonghu Xu ***@***.***> wrote:
> @costinm <https://github.com/costinm>
> https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/load_balancing/locality_weight
> It will generate Cluster.CommonLbConfig
> <https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/cds.proto#cluster-commonlbconfig>
> for Envoy.
|
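For readers trying to picture the generated config: a minimal sketch of the two pieces the linked Envoy docs describe, written as plain Python dicts that mirror the v2 JSON field names (`common_lb_config.locality_weighted_lb_config` on the cluster, `load_balancing_weight` on each locality in the EDS response). The helper names, cluster name, addresses, and weights are made up for illustration; this is not the code Pilot uses.

```python
def build_cluster_fragment(name):
    """Cluster fragment that switches on locality-weighted LB.

    The empty locality_weighted_lb_config message acts as a flag; the
    actual weights travel in the EDS response, not in the cluster.
    """
    return {
        "name": name,
        "type": "EDS",
        "common_lb_config": {"locality_weighted_lb_config": {}},
    }


def build_load_assignment(cluster_name, weights, endpoints):
    """Build a ClusterLoadAssignment-shaped dict.

    weights:   {(region, zone): weight}              (hypothetical input shape)
    endpoints: {(region, zone): [(ip, port), ...]}   (hypothetical input shape)
    """
    localities = []
    for (region, zone), weight in sorted(weights.items()):
        localities.append({
            "locality": {"region": region, "zone": zone},
            "load_balancing_weight": weight,
            "lb_endpoints": [
                {"endpoint": {"address": {"socket_address": {
                    "address": ip, "port_value": port}}}}
                for ip, port in endpoints.get((region, zone), [])
            ],
        })
    return {"cluster_name": cluster_name, "endpoints": localities}
```

Note that the per-locality weights here are whatever the control plane decides to send; how they are derived from the DestinationRule is exactly the open question in this thread.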
|
To be clear: we really want to implement locality-based LB, and many thanks for starting this. I just want to understand what the plan is, and whether this is the right API.
|
|
This requirement came from folks at Intuit and Atlassian: the ability to specify the amount of load sent to each region, or zero load to a region. Weights are not auto-adjusted based on the number of endpoints per region or on endpoint load. It's purely human-driven, with company-level policies that determine how much traffic needs to be spilled over to the remote region (0 or 100%). In other words, people wanted control over how traffic gets spilled to the remote regions. |
|
How is it different from the existing split, with labels for zone/region?
The Envoy feature seems to be intended for real load balancing, not an exact split.
And how will this API interact with the real LB?
A design doc or discussion before an API change would help, in particular for things with broad implications...
|
The implementation is simple,
Actually, I think the existing split does not take effect. We don't set the config the way Envoy expects. |
|
On Tue, Jan 1, 2019 at 5:40 PM Zhonghu Xu ***@***.***> wrote:
> Locality weighted load balancing is configured by setting locality_weighted_lb_config in the cluster configuration and providing weights in LocalityLbEndpoints via load_balancing_weight.
> The implementation is simple,
Some details? I don't see any simple option, based on the Envoy docs on locality_weighted_lb_config, that could achieve what the API seems to promise.
>> How is it different than existing split, with labels for zone/region ?
> Actually, the existing split does not take effect I think. We dont set config as envoy expected.
I meant the current traffic split we do for subsets (based on labels, etc.). If the workloads are labeled to reflect zone etc., why wouldn't the DestinationRule split satisfy the requirement? I assume it's because you want different splits by source. Is this something we want for traffic split/DestinationRule in general?
But my primary concern is to not confuse users: we still want real locality-based load balancing, taking into account load, endpoints, etc. And the Envoy option doesn't seem to match the percent-based split in the API you are proposing.
|
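On the point that the Envoy option is not an exact percent split: as far as I can tell from the locality-weighted load balancing page linked earlier in this thread, Envoy scales each locality's configured weight by an availability term (healthy fraction times an overprovisioning factor, 1.4 by default, capped at 100%) before normalizing. The sketch below reproduces that arithmetic under that reading of the docs; the function name and input shape are hypothetical.

```python
def locality_load_share(localities, overprovisioning_factor=1.4):
    """Estimate the share of traffic each locality receives.

    localities: {name: (configured_weight, healthy_hosts, total_hosts)}
    Returns {name: fraction of traffic}, following the formula as I read it
    in the Envoy docs:
        effective_weight = weight * min(1, factor * healthy / total)
    """
    effective = {}
    for name, (weight, healthy, total) in localities.items():
        availability = 0.0
        if total > 0:
            availability = min(1.0, overprovisioning_factor * healthy / total)
        effective[name] = weight * availability
    denom = sum(effective.values())
    if denom == 0:
        # No locality has any available capacity.
        return {name: 0.0 for name in effective}
    return {name: w / denom for name, w in effective.items()}
```

With all hosts healthy, an 80/20 weight pair yields an 80/20 split. If half of the hosts in the 80-weight locality become unhealthy, its effective weight drops to 80 * 0.7 = 56, so the split shifts to roughly 74/26. In other words, the configured weights are honored exactly only while every locality is sufficiently healthy.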
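That's right. As you said, DestinationRule cannot do source-based routing. We may have workloads that reside in multiple regions/zones and access workloads in other regions/zones. We need to control traffic based on both the region/zone and the number of workloads within it.
Envoy docs require setting Cluster.CommonLbConfig.LocalityWeightedLbConfig combined with endpoint.LocalityLbEndpoints.load_balancing_weight. Currently LocalityWeightedLbConfig is more like a flag (https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/cds.proto#cluster-commonlbconfig-localityweightedlbconfig), and load_balancing_weight has already been set in previous PRs. |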
|
On Tue, Jan 1, 2019 at 10:20 PM Zhonghu Xu ***@***.***> wrote:
>> If the workloads are labeled to reflect zone/etc - why wouldn't the destination rule split satisfy the requirement ? I assume it's because you want different splits by source - is this something we want for traffic split/DestinationRule in general ?
> That's right. As you said DestinationRule can not achieve sourced routing. We may have workloads reside in multi region/zones, and access workloads in other multi region/zones. We need to control traffic based on both region/zone and the workloads number within it.
Is there a doc describing the use cases and requirements, and how the implementation will work? I'm as confused as before. I suspect other devs are also not familiar with this, and I don't remember any discussion.
Maybe a cleaner solution is to extend DestinationRule to allow source-based config. What you have described so far doesn't sound like load balancing, but like traffic split. In general, "load balancing" implies that load will be a factor.
>> Some details ?
> Envoy docs requires setting Cluster.CommonLbConfig.LocalityWeightedLbConfig and combined endpoint.LocalityLbEndpoints.load_balancing_weight . currently LocalityWeightedLbConfig is more like a flag( https://www.envoyproxy.io/docs/envoy/latest/api-v2/api/v2/cds.proto#cluster-commonlbconfig-localityweightedlbconfig), and load_balancing_weight has already been set in previous prs.
Without a document or a discussion on the list or in the WG, I don't think many people know about them. We need to improve the review process to make sure every PR clearly indicates whether it is a bug fix or adds a new feature, and in the latter case links to some doc. We can't require all developers to review all PRs.
Even if I had seen the PRs, without comments in the PR it's hard to understand what the final picture will be and that they were related to this feature. I've seen a bunch of PRs related to zero vpn, using load weight to adjust the gateway, but they can't be used in the general case, and we can't have a feature like this require zvpn and gateways. As I mentioned, "load balancing" is a very complicated subject, and the API in this PR doesn't fit any pattern or use I've seen. So docs, please...
|
|
You are overthinking the problem. Envoy has all the knobs required to do differential load balancing between endpoints in the same cluster. We have had locality-aware routing (though non-functional) for a while. All this does is specify what portion needs to stay local and what should be shed, based on user-specified parameters. This has no dependency on knowing what endpoints exist in the remote cluster (i.e. Pilot talking to remote API servers), nor any dependency on load computation based on the endpoints.
Forcing users to create different versions of the same binary just because they are in different regions is not a good strategy (it gets complicated when you want to do real version routing). Besides, the main goal of doing this the way it's done is so that we can define a top-level DestinationRule (*.svc.cluster.local) that specifies how traffic should be distributed across regions. Everything else will inherit it; this is how destination rules work today.
The reason I didn't insist on a doc is that the impact is scoped to just the specific use case we are tackling (for Intuit, Atlassian, etc., who wanted some manual override). If not specified, traffic distribution will be the same as what exists today. Nothing here prevents you from using the Google-internal EDS load assignment servers that compute endpoint load across clusters in different regions and assign weights. This is literally a basic manual override, and it actually implements the AZ-aware load balancing that we have been claiming for a while. |
|
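To make the "manual override" concrete: a minimal sketch of how a user-specified, per-source split could be turned into the per-locality weights sent to a proxy in a given locality. The `distribute` shape, the locality strings, and the function name are hypothetical and are not taken from the API in this PR. Dropping zero-percent localities (rather than sending weight 0) is an assumption based on my understanding that Envoy expects `load_balancing_weight` to be at least 1.

```python
def weights_for_source(source_locality, distribute, present_localities):
    """Return {destination_locality: weight} for a proxy in source_locality.

    distribute: {source_locality: {destination_locality: percent}}
                (hypothetical shape of the user-specified override)
    present_localities: localities that actually have endpoints.

    Returns None when no override exists for the source, meaning traffic
    distribution stays as it is today.
    """
    split = distribute.get(source_locality)
    if split is None:
        return None
    if sum(split.values()) != 100:
        raise ValueError("percentages for %s must sum to 100" % source_locality)
    # Purely user-driven: no endpoint counts or load figures are consulted.
    return {
        loc: split[loc]
        for loc in present_localities
        if split.get(loc, 0) > 0
    }
```

For example, with `{"us-west/a": {"us-west/a": 80, "us-east/b": 20}}`, a proxy in `us-west/a` would be sent weights 80 and 20, while a proxy in any other locality gets no override at all.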
How does the Envoy sidecar know the region and zone info within an Istio environment? I don't recall any setup instructions for this. Also, does this new traffic policy allow users to configure the most common use case @costinm outlined earlier? I can set 100% to the local region/zone, but how do I configure the fallback?
|
|
ServiceEntry has a locality field, and you can add region/AZ annotations to k8s nodes. We have been parsing these values for two years, but not using them. Basically there is no user config; just make sure cluster nodes are annotated per the standard Kubernetes docs. As for the fallback, the functionality exists in a convoluted way in Envoy and works only if active health checking, outlier detection, or retries with priority levels are enabled. This needs more work in Envoy. Alternatively, you can write your own automation to change the values in the destination rule during such incidents. |
|
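As a rough illustration of the node metadata mentioned above: the sketch below derives a locality string from a node's label map. It assumes the well-known Kubernetes failure-domain label keys in use at the time of this thread (the comment above calls them annotations), and the `region/zone` output format is an assumption, not a statement of what Pilot actually produces.

```python
# Well-known Kubernetes node label keys at the time of this thread (assumed
# here to be the ones being parsed; newer clusters use topology.kubernetes.io/*).
REGION_LABEL = "failure-domain.beta.kubernetes.io/region"
ZONE_LABEL = "failure-domain.beta.kubernetes.io/zone"


def locality_from_node_labels(labels):
    """Return 'region/zone', 'region', or '' from a node's label map."""
    region = labels.get(REGION_LABEL, "")
    zone = labels.get(ZONE_LABEL, "")
    if not region:
        # Without a region there is nothing meaningful to report.
        return ""
    return "%s/%s" % (region, zone) if zone else region
```

A node with no such labels yields an empty locality, in which case there is nothing for a locality-based policy to act on.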
thank you @rshriram for the clarification!! |
This is to support envoy locality-weighted-load-balancing