Update FPC invoker health reporting logic #5464
Merged
Description
With FPC, the list of managed invoker ids is determined by what's in etcd. However, the metric reporting and the invokers API still assume that invokers 0...n exist, up to the max available id in etcd, and will auto-fill any missing ids as offline. Filling in gaps above the current min id is fine, but assuming the range starts at 0 provides no value. A cluster could have a monotonically increasing range of ids, and if the original nodes are lost, their ids shouldn't be assumed to be filled back in from 0.
Further, the current implementation already excludes invokers that are offline at the high end of the pool, since the API and metrics assume that the max id stored in etcd is the max id of the cluster. This change applies the same behavior to the low end, to account for clusters that don't always repopulate 0-x. Example for a ten-node cluster with ids 0-9:
Nodes 0 and 1 are down.
The API will now return 2-9 as existing, since 2 is the min remaining id in etcd.
If node 5 is down, in the middle between the min and max, it will still be auto-populated as down.
It's also important to note that this change applies only to FPC, since the 0-n expectation of invoker ids matters more for the original load balancer algorithm with co-prime hashing.
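The range-filling behavior described above can be illustrated with a minimal sketch. This is not the OpenWhisk implementation (which is written in Scala); the function name `report_invokers`, the `fill_from_zero` flag, and the `"up"`/`"offline"` status strings are hypothetical names chosen for illustration, and the `registered` dict stands in for the invoker ids found in etcd.

```python
from typing import Dict, List, Tuple


def report_invokers(
    registered: Dict[int, str], fill_from_zero: bool = False
) -> List[Tuple[int, str]]:
    """Return (id, status) pairs for the reported invoker range.

    `registered` maps each invoker id found in the registry to its status.
    Ids missing inside the reported range are filled in as "offline".
    `fill_from_zero=True` mimics the old assumption that ids start at 0;
    the default starts at the lowest registered id instead.
    """
    if not registered:
        return []
    low = 0 if fill_from_zero else min(registered)
    high = max(registered)  # ids above the max registered id are never reported
    return [(i, registered.get(i, "offline")) for i in range(low, high + 1)]


# Ten-node cluster 0-9 where nodes 0, 1 and 5 are no longer registered.
registered = {i: "up" for i in range(2, 10) if i != 5}

new_view = report_invokers(registered)
old_view = report_invokers(registered, fill_from_zero=True)
```

Under these assumptions, `new_view` covers ids 2-9 with only id 5 marked offline, while `old_view` covers ids 0-9 and also reports ids 0 and 1 as offline.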