-
Notifications
You must be signed in to change notification settings - Fork 886
Cherry-pick graceful LB changes #2157
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This PR contains a fix for moby/moby#30321. There was a moby/moby#31142 PR intending to fix the issue by adding a delay between disabling the service in the cluster and the shutdown of the tasks. However disabling the service was not deleting the service info in the cluster. Added a fix to delete service info from cluster and verified using siege to ensure there is zero downtime on rolling update of a service. Signed-off-by: abhi <[email protected]> (cherry picked from commit 4aeb1fc)
This patch attempts to allow endpoints to complete servicing connections while being removed from a service. The change adds a flag to the endpoint.deleteServiceInfoFromCluster() method to indicate whether this removal should fully remove connectivity through the load balancer to the endpoint or should just disable directing further connections to the endpoint. If the flag is 'false', then the load balancer assigns a weight of 0 to the endpoint but does not remove it as a linux load balancing destination. It does remove the endpoint as a docker load balancing endpoint but tracks it in a special map of "disabled-but-not- destroyed" load balancing endpoints. This allows traffic to continue flowing, at least under Linux. If the flag is 'true', then the code removes the endpoint entirely as a load balancing destination. The sandbox.DisableService() method invokes deleteServiceInfoFromCluster() with the flag sent to 'false', while the endpoint.sbLeave() method invokes it with the flag set to 'true' to complete the removal on endpoint finalization. Renaming the endpoint invokes deleteServiceInfoFromCluster() with the flag set to 'true' because renaming attempts to completely remove and then re-add each endpoint service entry. The controller.rmServiceBinding() method, which carries out the operation, similarly gets a new flag for whether to fully remove the endpoint. If the flag is false, it does the job of moving the endpoint from the load balancing set to the 'disabled' set. It then removes or de-weights the entry in the OS load balancing table via network.rmLBBackend(). It removes the service entirely via said method ONLY IF there are no more live or disabled load balancing endpoints. Similarly network.addLBBackend() requires slight tweaking to properly manage the disabled set. Finally, this change requires propagating the status of disabled service endpoints via the networkDB. Accordingly, the patch includes both code to generate and handle service update messages. It also augments the service structure with a ServiceDisabled boolean to convey whether an endpoint should ultimately be removed or just disabled. This, naturally, required a rebuild of the protocol buffer code as well. Signed-off-by: Chris Telfer <[email protected]> (cherry picked from commit 2af06a0)
fcrisciani
approved these changes
May 23, 2018
fcrisciani
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
Contributor
Author
|
The TestRollingUpdate e2e test validates this change. Flavio updated it to
activate under the appropriate versions of EE and tested that it passed.
The PR to merge his change into e2e.
Cheers,
-- Chris Telfer
…On Thu, May 24, 2018 at 1:50 AM, Mehdi Ghazizdeh ***@***.***> wrote:
how was this tested?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#2157 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AK8iXpeUX-5Go1JD0U7Zfq4YjEPhDOHpks5t1kolgaJpZM4UJeZA>
.
|
Member
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Cherry-pick commits 4aeb1fc and 2af06a0 to the bump_17.06 branch. These commits combined allow libnetwork to gracefully remove load balancing endpoints while not disconnecting the backend endpoints until they finish servicing existing connections.