upstream: allow excluding hosts from lb calculations until initial health check #6794
snowp merged 30 commits into envoyproxy:master
Conversation
Signed-off-by: Snow Pettersen <snowp@squareup.com>
This isn't quite done (missing quite a few tests), but I'm putting it up now for some early feedback. The general approach is very similar to how degraded was added: a new set of information, both for all hosts and per locality, is plumbed through in updateHosts so that the priority load and locality load calculations can use it to adjust the weights.
source/common/upstream/subset_lb.cc
    degraded_hosts, degraded_hosts_per_locality),
    determineLocalityWeights(*hosts_per_locality), filtered_added,
    filtered_removed);
// TODO(snowp): Right now we just pass hosts->size(), really this need to be filtering down the
Turns out just passing the count around won't work, since the subset lb will need a list of warmed hosts to be able to tell how many of them belong to the subset. I'll update this to use another list of hosts.
Alright, this is now using a new host vector plus a new hosts-per-locality structure so that the weight adjustment can be applied to both locality weighting and panic/spillover. There's an unfortunate complexity here which requires O(hosts) space for each of these weight adjustments (warmed hosts, degraded hosts): not only do we need to know how many hosts are warmed, we also need to keep track of which hosts are in this state for each update so that the subset lb can subset this list on the worker threads. Open to other ideas on how to accomplish this, but I'm not sure how else to do it without large refactors (e.g. if the subsetting happened on the main thread we would only need to retain counts, not lists of warmed hosts). If these increases in memory usage for the cluster are concerning, we could have both the warmed and degraded lists use static empty singletons instead, which should help a bit. This is still missing quite a few tests; I'll add those if we're happy with this approach.
@snowp we want to land this at Lyft sooner rather than later, so I will take a look sometime this weekend and get back to you with some high-level feedback.
Works for me, I'll be around this weekend to respond to feedback.
Pushed a change that will hopefully address the memory usage: instead of tracking which hosts should be included, we track the excluded ones. This should reduce the increase in memory usage per host set to just a few bytes (an empty vector and empty hosts-per-locality) when this feature is not enabled and once all hosts have been health checked.
mattklein123
left a comment
Thanks this is great. Some initial comments to get started. Also, WDYT about adding an integration test a la the one I just added in #6813? Hopefully using that as a guide will make it go much faster for you than it did for me. :)
/wait
api/envoy/api/v2/cds.proto
//
// Ignoring a host means that for any load balancing calculations that adjust weights based
// on the ratio of eligible hosts and total hosts (priority spillover, locality weighting, etc.)
// will exclude these hosts in the denominator.
api/envoy/api/v2/cds.proto
// active health checking is also configured.
//
// Ignoring a host means that for any load balancing calculations that adjust weights based
// on the ratio of eligible hosts and total hosts (priority spillover, locality weighting, etc.)
maybe put panic mode in the parenthesis here also?
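For reference, enabling the resulting option looks roughly like the cluster config below. This is a hedged sketch: the surrounding cluster fields are illustrative, and the field name `ignore_new_hosts_until_first_hc` under `common_lb_config` reflects my understanding of the API this PR adds, so treat it as an assumption rather than authoritative documentation.

```yaml
clusters:
- name: some_service
  connect_timeout: 1s
  type: STRICT_DNS
  lb_policy: ROUND_ROBIN
  common_lb_config:
    # Exclude newly added hosts from lb weight calculations (priority
    # spillover, locality weighting, panic mode) until they pass their
    # first active health check.
    ignore_new_hosts_until_first_hc: true
  health_checks:
  - timeout: 1s
    interval: 5s
    unhealthy_threshold: 2
    healthy_threshold: 1
    http_health_check:
      path: /healthz
```

Note that, as the proto comment says, the option only has an effect when active health checking is also configured.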
include/envoy/upstream/upstream.h
 */
virtual void used(bool new_used) PURE;

virtual bool warmed() const PURE;
include/envoy/upstream/upstream.h
/*
 * @return all excluded hosts contained in the set at the current time. Excluded hosts should be
 * ignored when computing load balancing weights, but may overlap with hosts in hosts().
 */
    host_set->healthyHostsPerLocality().clone();
HostsPerLocalityConstSharedPtr degraded_hosts_per_locality_copy =
    host_set->degradedHostsPerLocality().clone();
ExcludedHostVectorConstSharedPtr excluded_hosts_copy(
Worth doing the @htuch TODO up above here soon in a follow-up? This continues to grow scarier and scarier. :)
Avoiding the copies here would, I think, make this function less painful to read and avoid asking for some type of param struct like we have done elsewhere.
Wouldn't you still need to pull out a shared_ptr for each of the values and pass them along to the lambda to ensure that we're using consistent values? If we just pass along the host set, I think it's possible for the TLS updates to happen concurrently with a host update on the main thread and result in potentially inconsistent values.
I think this whole thing could be simplified pretty easily by reusing the UpdateHostsParams param struct that's already being used elsewhere to call updateHosts, which would be helped by exposing shared ptrs to the underlying typed arrays. I'll give that a go in this PR and we can decide whether it makes sense to split it out as a different PR
Sorry, yeah, this is roughly what I meant. I think we can avoid the copies and also simplify the copying?
    std::move(healthy_hosts), std::move(healthy_hosts_per_locality),
    std::make_shared<const DegradedHostVector>(),
    HostsPerLocalityImpl::empty());
// TODO(snowp): Move this function into test/
  healthy_hosts += host_set->healthyHosts().size();
  degraded_hosts += host_set->degradedHosts().size();
}
// TODO(snowp): Stats for excluded hosts?
bool used() const override { return used_; }
void used(bool new_used) override { used_ = new_used; }
bool warmed() const override {
  if (cluster_ != nullptr && cluster_->warmHosts()) {
Why bother checking this vs. just check the flag directly out of curiosity?
As written, the flag is always set, even if the config flag is not set. I guess the alternative solution is to have the health checker check the config flag to determine whether the host flag should be set, and then just read the flag here. I think that sounds a bit cleaner (it moves the complexity out of Host), so I'll give that a go.
Current failures are related to memory usage:
mattklein123
left a comment
Nice, this looks awesome, thanks for the integration test. I would go ahead and bump the stats size test. I think this is an important stat to add. cc @jmarantz
/wait
// it to healthy. This makes startup faster with a small reduction in overall reliability
// depending on the HC settings.
if (first_check_ || ++num_healthy_ == parent_.healthy_threshold_) {
  host_->healthFlagClear(Host::HealthFlag::PENDING_ACTIVE_HC);
Bumping the memory expectations in the stats integration test LGTM.
mattklein123
left a comment
Nice, looks great. One nit.
/wait
  }
}

// Clear the pending flag if it is set. By removing this we're marking the host as having been
nit: maybe put this logic into a helper function that can be shared here and below? This would keep the comments and reasoning in one place? WDYT?
FYI @htuch: this PR is similar to the work I did for degraded, just keeping you in the loop. I'll merge this today unless we feel like it needs more reviews?
@snowp LGTM, thanks for the heads up. My only comment is that I continue to be somewhat amazed by the growth of complexity in our endpoint management; I think a lot of this is inherent to the cool features we're adding, but I do wonder whether we can make some architectural simplifications. CC @antoniovicente, who might be interested in making some contributions in this area of the code.
Thank you so much, @snowp, for this change! We are excited to be using this feature.
* master: (88 commits)
  upstream: Null-deref on TCP health checker if setsockopt fails (envoyproxy#6793)
  ci: switch macOS CI to azure pipelines (envoyproxy#6889)
  os syscalls lib: break apart syscalls used for hot restart (envoyproxy#6880)
  Kafka codec: precompute request size before serialization, so we do n… (envoyproxy#6862)
  upstream: move static and strict_dns clusters to dedicated files (envoyproxy#6886)
  Rollforward of api: Add total_issued_requests to Upstream Locality and Endpoint Stats. (envoyproxy#6692) (envoyproxy#6784)
  fix explicit constructor in copy-initialization (envoyproxy#6884)
  stats: use tag iterator rather than constructing the tag-array and searching that. (envoyproxy#6853)
  common: use unscoped build target in generate_version_linkstamp (envoyproxy#6877)
  Addendum to envoyproxy#6778 (envoyproxy#6882)
  ci: add minimum Linux build for Azure Pipelines (envoyproxy#6881)
  grpc: utilities for inter-converting grpc::ByteBuffer and Buffer::Instance. (envoyproxy#6732)
  upstream: allow excluding hosts from lb calculations until initial health check (envoyproxy#6794)
  stats: prevent unused counters from leaking across hot restart (envoyproxy#6850)
  network filters: add `injectDataToFilterChain(data, end_stream)` method to network filter callbacks (envoyproxy#6750)
  delete things that snuck back in (envoyproxy#6873)
  config: scoped rds (2b): support delta APIs in ConfigProvider framework (envoyproxy#6781)
  string == string! (envoyproxy#6868)
  config: add mssing imports to delta_subscription_state (envoyproxy#6869)
  protobuf: add missing default case to enum (envoyproxy#6870)
  ...

Signed-off-by: Michael Puncel <mpuncel@squareup.com>
This adds an option to allow hosts to be excluded from lb calculations until they have been health checked
for the first time. This makes it possible to scale up the number of hosts quickly (i.e. a large increase
relative to the current host set size) without triggering panic mode/spillover (as long as the initial health check
succeeds).
While these hosts are excluded from the lb calculations, they are still eligible for routing when panic
mode is triggered.
Risk Level: Medium, optional feature but touches quite a bit of code
Testing: UTs
Docs Changes: Proto docs
Release Notes: n/a for now
Fixes #6653