contrib: implement Peak EWMA load balancing policy #40653

rroblak · 2025-08-08T23:56:16Z

Commit Message

Adds a Peak EWMA (Exponentially Weighted Moving Average) load balancing policy that uses the Power of Two Choices algorithm with real-time RTT measurements for latency-aware request routing.

Key components:

- Load balancer: envoy.load_balancing_policies.peak_ewma
- HTTP filter: envoy.filters.http.peak_ewma for RTT measurement
- Configuration: decay_time, aggregation_interval, max_samples_per_host, default_rtt, penalty_value

The implementation uses lock-free atomic ring buffers for RTT sample collection and a host-attached storage pattern. It draws from Finagle's Peak EWMA algorithm while avoiding locks, and follows Envoy's existing client-side WRR load balancing implementation for main/worker thread coordination.
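A minimal sketch of the kind of lock-free sample buffer described above, assuming a fixed capacity and a single aggregating reader. `RttRingBuffer`, `record`, `writeCount`, and `at` are hypothetical names for illustration, not the types added by this PR.

```cpp
#include <array>
#include <atomic>
#include <cstddef>
#include <cstdint>

// Hypothetical fixed-capacity ring buffer for RTT samples (milliseconds).
// Writers claim a slot with fetch_add, so no mutex is taken on the request path.
template <std::size_t Capacity> class RttRingBuffer {
  static_assert(Capacity > 0, "capacity must be positive");

public:
  // Called when a response arrives; the oldest slot is overwritten once full.
  void record(double rtt_ms) {
    const std::uint64_t seq = write_count_.fetch_add(1, std::memory_order_relaxed);
    slots_[seq % Capacity].store(rtt_ms, std::memory_order_release);
  }

  // Total number of samples ever written (monotonic sequence counter).
  std::uint64_t writeCount() const { return write_count_.load(std::memory_order_acquire); }

  // Reads the slot that sequence number `seq` maps to.
  double at(std::uint64_t seq) const {
    return slots_[seq % Capacity].load(std::memory_order_acquire);
  }

private:
  std::array<std::atomic<double>, Capacity> slots_{};
  std::atomic<std::uint64_t> write_count_{0};
};
```

This sketch ignores one race: a reader can observe a claimed sequence number before the corresponding store lands. A real implementation needs a strategy for that, for example tolerating an occasional stale sample.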

Fixes #20907

Additional Description

This PR implements a new contrib load balancing policy based on the Peak EWMA (Exponentially Weighted Moving Average) algorithm, which provides latency-aware request routing using real-time RTT measurements.

This addresses the feature request in #20907 for a Peak EWMA load balancer implementation.
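For readers unfamiliar with the algorithm, here is a rough sketch of a Peak EWMA update in the commonly described formulation (as in Finagle). Whether this PR uses exactly this update rule is not stated here; `PeakEwmaState` and `observe` are hypothetical names, and treating `decay_time` as the exponential time constant is an assumption.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical per-host state: smoothed RTT and the time of the last update.
struct PeakEwmaState {
  double ewma_ms = 0.0;
  double last_update_ms = 0.0;
};

// Folds one RTT sample into the state. `decay_time_ms` must be positive.
inline void observe(PeakEwmaState& s, double rtt_ms, double now_ms, double decay_time_ms) {
  const double elapsed_ms = std::max(0.0, now_ms - s.last_update_ms);
  s.last_update_ms = now_ms;
  if (rtt_ms > s.ewma_ms) {
    // "Peak": a sample above the current average replaces it immediately.
    s.ewma_ms = rtt_ms;
    return;
  }
  // Otherwise decay toward the sample; the longer the gap, the less history counts.
  const double w = std::exp(-elapsed_ms / decay_time_ms);
  s.ewma_ms = s.ewma_ms * w + rtt_ms * (1.0 - w);
}
```

The asymmetry is the point of the design: latency spikes are reflected at once, while recovery is gradual, so a host that has just been slow is avoided for roughly one decay period.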

Performance Validation

Benchmark results demonstrate Peak EWMA's effectiveness at avoiding slow servers.

Test setup: 10 clients, 10 upstream servers (1 server 10x slower than others):

| Algorithm | Average | P50 | P75 | P90 | P95 | P99 | Min | Max | Std Dev |
|---|---|---|---|---|---|---|---|---|---|
| round_robin | 19.08 | 8.46 | 10.01 | 101.79 | 103.39 | 105.58 | 5.79 | 108.02 | 29.8 |
| least_request | 11.00 | 8.92 | 10.71 | 13.10 | 15.38 | 103.00 | 5.98 | 109.35 | 11.72 |
| random | 18.66 | 8.47 | 10.64 | 65.51 | 103.28 | 106.18 | 5.95 | 112.42 | 28.67 |
| peak_ewma | 10.16 | 9.66 | 11.13 | 13.29 | 14.52 | 18.56 | 6.31 | 26.16 | 2.45 |

Peak EWMA shows dramatically lower tail latency than Envoy's existing load balancing algorithms: its P99 is 18.56, versus 103.00 to 106.18 for the other three.

Risk Level

Medium

Testing

- Unit Tests: Comprehensive coverage for all components (Cost, Observability, HostData, Config)
- Integration Tests: End-to-end load balancing behavior with latency simulation
- HTTP Filter Tests: RTT measurement and sample recording functionality
- Manual Testing: Verified that 100% of traffic is routed to fast servers rather than slow servers

Docs Changes

- Added comprehensive API documentation in docs/root/api-v3/config/contrib/load_balancing_policies/peak_ewma/peak_ewma.rst
- Documented all 5 config parameters with detailed explanations and examples
- Added statistics section following standard Envoy format
- Proto files include extensive inline documentation

Release Notes

Added to contrib extensions metadata. This is a new contrib extension so requires no changes to main release notes.

Platform Specific Features

N/A

Runtime Guard

N/A - This is a new contrib extension that must be explicitly configured

Issues

Fixes #20907

Deprecated

N/A

API Changes

This adds new contrib API surfaces:

- envoy.extensions.load_balancing_policies.peak_ewma.v3alpha.PeakEwma - Load balancer configuration
- envoy.extensions.filters.http.peak_ewma.v3alpha.PeakEwmaConfig - HTTP filter configuration

Both are marked as work_in_progress following contrib extension patterns. No changes to existing APIs.

repokitteh-read-only · 2025-08-08T23:56:22Z

Hi @rroblak, welcome and thank you for your contribution.

We will try to review your Pull Request as quickly as possible.

In the meantime, please take a look at the contribution guidelines if you have not done so already.

🐱

Caused by: #40653 was opened by rroblak.

jukie · 2025-08-09T05:24:05Z

This looks really good, thanks for working on this! Will it be possible to support things like localityLbConfig/zoneAwareLbConfig or slow start mode? I'm not suggesting that needs to be included here (also not a maintainer) but I'm curious if any of the core logic here would restrict that.

frittentheke · 2025-08-18T14:04:26Z

> This looks really good, thanks for working on this! Will it be possible to support things like localityLbConfig/zoneAwareLbConfig or slow start mode? I'm not suggesting that needs to be included here (also not a maintainer) but I'm curious if any of the core logic here would restrict that.

I was just about to ask about this aspect as well. Also see my comment in istio/istio#35102 (comment).

Since Peak EWMA is about keeping latency low, the question of whether or not to cross a zone boundary comes up.
Cross-zone traffic is potentially more expensive, so there could be a trade-off to make (or not).

rroblak · 2025-08-19T22:20:40Z

Thanks for the great questions, @jukie and @frittentheke!

localityLbConfig/zoneAwareLbConfig

This implementation should be compatible with localityLbConfig/zoneAwareLbConfig. They could act as a pre-filter to select a host set, similar to how the current implementation uses host health filtering to only consider healthy hosts. The Peak EWMA policy would then run P2C on that subset to choose the faster of two sampled hosts.
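The pre-filter-then-P2C idea can be sketched as follows. This is an illustration only: `Candidate` and `pickP2C` are hypothetical names, and the candidate list stands for whatever subset locality or health filtering has already produced.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Hypothetical view of a candidate host: only its current cost matters here.
struct Candidate {
  double cost;
};

// Power of Two Choices over an already filtered, non-empty candidate list.
// Returns the index of the cheaper of two distinct random candidates.
template <typename Rng>
std::size_t pickP2C(const std::vector<Candidate>& hosts, Rng& rng) {
  if (hosts.size() == 1) {
    return 0;
  }
  std::uniform_int_distribution<std::size_t> first(0, hosts.size() - 1);
  std::uniform_int_distribution<std::size_t> offset(1, hosts.size() - 1);
  const std::size_t a = first(rng);
  // Adding a non-zero offset guarantees the second candidate differs from the first.
  const std::size_t b = (a + offset(rng)) % hosts.size();
  return hosts[a].cost <= hosts[b].cost ? a : b;
}
```

Because the filter runs first, the selection step does not need to know about zones at all; it only compares costs within the subset it is handed.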

That said, I haven't looked at the object/data model for localityLbConfig/zoneAwareLbConfig so I'd need to dig through to be 100% sure.

Needless to say, however, locality/zone-awareness highlights one of the strengths, and the elegance, of Peak EWMA in comparison to load balancing algorithms that don't consider request latency as an input: given an undifferentiated pool of upstream hosts, Peak EWMA will dynamically weight traffic toward the upstream hosts with the lowest RTT, which in practice means the local zone first, followed by increasingly distant zones.

In my experience running this for the past few years across data centers, Peak EWMA simplifies configuration and also mitigates partial-failure issues where an upstream host's RTT increases substantially but not enough to cause the host to be marked unhealthy. In this scenario, Peak EWMA will significantly reduce traffic to the affected upstream host (even if it's in the local zone), whereas fixed configurations will continue to send requests to such a host.

Slow Start

Regarding slow start, in a way it's already implemented via the default_rtt and penalty_value parameters. Those ensure that new hosts only receive a trickle of requests until more data is gathered on their respective request latencies. This again highlights Peak EWMA's elegance in contrast to a fixed slow start config.

If we wanted to incorporate the common LB config slow start params, we'd need to think a bit about how to do that, since they would overlap with the default_rtt and penalty_value params I defined. It's likely possible, but it's not immediately obvious to me how to do it.

Hope that helps! Let me know your thoughts.

tonya11en · 2025-08-25T17:26:48Z

@rroblak CI won't run because of the DCO check failing. You can follow the instructions here to fix it and let the baseline tests run.

tonya11en · 2025-08-25T17:26:53Z

/wait

nezdolik · 2025-09-26T10:09:17Z

friendly ping @tonya11en

tonya11en · 2025-09-30T05:24:37Z

I'm waiting for an end-user to chime in on the original issue before reviewing. We need an end-user sponsor that is willing to use this.

KBaichoo · 2025-09-30T16:51:16Z

/wait

Pending an end-user of this extension

tonya11en · 2025-10-01T23:01:01Z

Alright, we have an end-user (#20907 (comment)). I'll start combing through this, just give me until EOW if you don't mind.

tonya11en

This is an enormous PR, so it'll take a couple passes for me to parse all of it. I made a few comments.

tonya11en · 2025-10-06T02:31:09Z

contrib/peak_ewma/filters/http/source/peak_ewma_filter.cc

+        // RTT sample recorded successfully
+      }
+    } else {
+      // Host missing Peak EWMA data - should not happen after initialization


You should probably detect whether it's happening after initialization and at least emit a warning log. Also, explain the circumstances in which this would happen.

tonya11en · 2025-10-06T02:31:24Z

contrib/peak_ewma/filters/http/source/peak_ewma_filter.cc

+            *upstream_timing.first_upstream_rx_byte_received_ -
+            *upstream_timing.first_upstream_tx_byte_sent_);
+
+        // Record RTT sample in host-attached atomic storage


Terminate your comments with a period, please.
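The hunk above dereferences two optional timestamps to compute the RTT. A self-contained sketch of that measurement with explicit presence and ordering checks is shown below; `TimingSketch` and `rttFrom` are hypothetical stand-ins, not Envoy's actual timing types.

```cpp
#include <chrono>
#include <optional>

using MonotonicTime = std::chrono::steady_clock::time_point;

// Hypothetical stand-in for the two timing fields dereferenced in the diff.
struct TimingSketch {
  std::optional<MonotonicTime> first_tx_byte_sent;
  std::optional<MonotonicTime> first_rx_byte_received;
};

// Returns the RTT when both timestamps are present and ordered; otherwise
// returns nullopt so the caller can skip recording a sample.
inline std::optional<std::chrono::milliseconds> rttFrom(const TimingSketch& t) {
  if (!t.first_tx_byte_sent || !t.first_rx_byte_received) {
    return std::nullopt;
  }
  if (*t.first_rx_byte_received < *t.first_tx_byte_sent) {
    return std::nullopt;
  }
  return std::chrono::duration_cast<std::chrono::milliseconds>(*t.first_rx_byte_received -
                                                               *t.first_tx_byte_sent);
}
```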

tonya11en · 2025-10-06T02:46:28Z

contrib/peak_ewma/load_balancing_policies/source/cost.cc

+namespace LoadBalancingPolicies {
+namespace PeakEwma {
+
+double Cost::compute(double rtt_ewma_ms, double active_requests, double default_rtt_ms) const {


Assert the params are non-negative.
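A sketch of what the requested precondition checks could look like. The cost formula here is an assumption in the spirit of Peak EWMA (RTT scaled by in-flight requests), not the contents of cost.cc, and `computeCostSketch` is a hypothetical free function. Envoy code would normally use its own assertion macros; plain `assert` keeps the example self-contained.

```cpp
#include <cassert>

// Hypothetical cost function; the formula is an assumption, not the PR's cost.cc.
inline double computeCostSketch(double rtt_ewma_ms, double active_requests,
                                double default_rtt_ms) {
  // Preconditions requested in review: no parameter may be negative.
  assert(rtt_ewma_ms >= 0.0);
  assert(active_requests >= 0.0);
  assert(default_rtt_ms >= 0.0);
  // Assumed: a host with no measured RTT yet falls back to default_rtt_ms.
  const double rtt_ms = rtt_ewma_ms > 0.0 ? rtt_ewma_ms : default_rtt_ms;
  // Assumed: cost grows with in-flight requests, counting the one being routed.
  return rtt_ms * (active_requests + 1.0);
}
```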

tonya11en · 2025-10-06T03:01:48Z

contrib/peak_ewma/load_balancing_policies/source/observability.cc

+    : cost_stat_(scope.gaugeFromString("peak_ewma." + host->address()->asString() + ".cost",
+                                       Stats::Gauge::ImportMode::NeverImport)),
+      ewma_rtt_stat_(
+          scope.gaugeFromString("peak_ewma." + host->address()->asString() + ".ewma_rtt_ms",
+                                Stats::Gauge::ImportMode::NeverImport)),
+      active_requests_stat_(
+          scope.gaugeFromString("peak_ewma." + host->address()->asString() + ".active_requests",
+                                Stats::Gauge::ImportMode::NeverImport)),


Just FYI, this will cause a cardinality explosion by including the host addresses in the metric name. If folks use this LB policy without knowing this, it'll potentially bring down their TSDB or cause them to run out of quota with their metrics vendor.

However, this is going into contrib so I'm not going to hold the PR up over it. If you're ok with this, then just be sure to mention it in the docs or make it opt-in.
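One way to make the per-host gauges opt-in, sketched with a hypothetical boolean setting. `perHostGaugeNames` and `per_host_stats_enabled` are illustrative names only; the gauge suffixes are the three from the diff above.

```cpp
#include <string>
#include <vector>

// Hypothetical helper: returns the per-host gauge names only when the operator
// has opted in, so the default configuration creates no address-keyed metrics.
inline std::vector<std::string> perHostGaugeNames(const std::string& host_address,
                                                  bool per_host_stats_enabled) {
  if (!per_host_stats_enabled) {
    return {};
  }
  const std::string prefix = "peak_ewma." + host_address + ".";
  return {prefix + "cost", prefix + "ewma_rtt_ms", prefix + "active_requests"};
}
```

With this shape, a cluster of N hosts creates 3N gauges only when the setting is enabled, and none otherwise.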

tonya11en · 2025-10-06T03:02:54Z

contrib/peak_ewma/load_balancing_policies/source/observability.cc

+void Observability::report(
+    const absl::flat_hash_map<Upstream::HostConstSharedPtr,
+                              std::unique_ptr<GlobalHostStats>>& /* all_host_stats */) {
+  // Stats are published during aggregation - this is a placeholder for consistency
+}


Can you explain why you need this?

rroblak · 2025-10-18 (edited)

Good catch. That was vestigial from previous development iterations. I removed it and refactored the code that is still in use into peak_ewma_lb.cc.

tonya11en · 2025-10-06T04:10:06Z

contrib/peak_ewma/load_balancing_policies/source/peak_ewma_lb.cc

+    // Write index wrapped around
+    return (max_samples - last_processed) + current_write;


Is this case for when the current_write variable overflows?
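One reading of the quoted branch, assuming both indices are kept modulo `max_samples` (the hunk does not show the preceding condition, so this is a guess at the surrounding logic). `newSampleCount` is a hypothetical name.

```cpp
#include <cstddef>

// Number of unread samples between two ring indices, both in [0, max_samples).
inline std::size_t newSampleCount(std::size_t last_processed, std::size_t current_write,
                                  std::size_t max_samples) {
  if (current_write >= last_processed) {
    // No wrap: the writer is ahead of the reader within the same lap.
    return current_write - last_processed;
  }
  // Write index wrapped around: finish the old lap, then add the new one.
  return (max_samples - last_processed) + current_write;
}
```

Under this reading the branch handles the ring index wrapping past the end of the buffer, not integer overflow. Note that equal indices are ambiguous: they can mean no new samples or exactly one full lap.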

tonya11en · 2025-10-06T04:11:19Z

contrib/peak_ewma/load_balancing_policies/source/peak_ewma_lb.cc

+
+  // Process all new samples in chronological order
+  size_t processed_index = last_processed;
+  for (size_t i = 0; i < num_new_samples; ++i) {


How do you handle num_new_samples being larger than the max samples? I'm either missing something or this loop will process some samples more than once under high load.
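A sketch of one way to rule out double-processing, assuming the writer exposes a monotonic sequence counter and slots are addressed as `seq % max_samples`. That layout is an assumption of this example rather than something the PR is known to use, and `ReadRange` and `readableRange` are hypothetical names.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>

// Half-open range [begin, end) of sequence numbers that are still safe to read.
struct ReadRange {
  std::uint64_t begin;
  std::uint64_t end;
};

// Computes which sequence numbers the aggregator should visit this round.
inline ReadRange readableRange(std::uint64_t last_processed, std::uint64_t write_count,
                               std::size_t max_samples) {
  const std::uint64_t pending = write_count - last_processed;
  // If the writer lapped the reader, the oldest samples were overwritten:
  // skip them so that each surviving slot is visited exactly once.
  const std::uint64_t readable = std::min<std::uint64_t>(pending, max_samples);
  return ReadRange{write_count - readable, write_count};
}
```

Clamping to the buffer capacity bounds the loop at `max_samples` iterations per aggregation, at the cost of silently dropping samples that were overwritten under high load.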

tonya11en · 2025-10-06T04:12:03Z

CODEOWNERS

+/contrib/peak_ewma/filters/http/ @rroblak @UNOWNED
+/contrib/peak_ewma/load_balancing_policies/ @rroblak @UNOWNED


You may as well just own /contrib/peak_ewma.

tonya11en · 2025-10-06T04:14:50Z

/wait

tonya11en · 2025-10-20T17:36:54Z

CI can't run until the DCO check succeeds. I'll approve once it's all green.

tonya11en · 2025-10-20T17:37:00Z

/wait

rroblak mentioned this pull request Aug 8, 2025

Peak EWMA load balancing #20907

Closed

rroblak force-pushed the rroblak/peak-ewma-squashed branch from 07e1f68 to 3e5e9d4 Compare August 9, 2025 05:32

mathetake requested a review from tonya11en August 11, 2025 19:59

mathetake assigned tonya11en Aug 11, 2025

repokitteh-read-only bot added the waiting label Aug 25, 2025

rroblak force-pushed the rroblak/peak-ewma-squashed branch 9 times, most recently from 2a81114 to f53dd24 Compare September 9, 2025 19:07

repokitteh-read-only bot removed the waiting label Sep 9, 2025

rroblak force-pushed the rroblak/peak-ewma-squashed branch from f53dd24 to 95b56a5 Compare September 9, 2025 19:32

kahirokunn mentioned this pull request Sep 19, 2025

Support Peak EWMA load balancing policy in Envoy Gateway envoyproxy/gateway#7011

Open

repokitteh-read-only bot added the waiting label Sep 30, 2025

tonya11en added the contrib label Oct 6, 2025

tonya11en reviewed Oct 6, 2025

repokitteh-read-only bot removed the waiting label Oct 17, 2025

repokitteh-read-only bot added the waiting label Oct 20, 2025

rroblak added 7 commits October 20, 2025 18:15

style: terminate comments with periods

8a85ebc

Signed-off-by: Ryan Oblak <[email protected]>

refactor: remove unnecessary Observability abstraction layer

e2ca3ed

Signed-off-by: Ryan Oblak <[email protected]>

test: add Peak EWMA HTTP filter to integration test for < 10% slow server avoidance

fa56a32

Signed-off-by: Ryan Oblak <[email protected]>

style: use ProtobufWkt::Empty instead of Protobuf::Empty in config test

b0cefd7

Signed-off-by: Ryan Oblak <[email protected]>

fix: use fully-qualified namespaces to avoid ambiguity

40c6294

Signed-off-by: Ryan Oblak <[email protected]>

fix: namespace typo and formatting

c39f03c

Signed-off-by: Ryan Oblak <[email protected]>

fix: resolve namespace ambiguity and remove invalid config test

0378af9

Signed-off-by: Ryan Oblak <[email protected]>

rroblak force-pushed the rroblak/peak-ewma-squashed branch from 3814720 to 0378af9 Compare October 20, 2025 18:16

repokitteh-read-only bot removed the waiting label Oct 20, 2025

tonya11en enabled auto-merge (squash) October 24, 2025 19:29

tonya11en approved these changes Oct 24, 2025

tonya11en merged commit 4364cb5 into envoyproxy:main Oct 24, 2025
25 checks passed

This was referenced Nov 18, 2025

Introduce prequal algorithm into envoy #42091

Open

Add Peak EWMA load balancer (contrib) istio/proxy#6690

Open

6 participants