Skip to content

[WIP] add experimental support for Service Type=LoadBalancer w/ NSX-T#272

Closed
andrewsykim wants to merge 3 commits intokubernetes:masterfrom
andrewsykim:nsx-t
Closed

[WIP] add experimental support for Service Type=LoadBalancer w/ NSX-T#272
andrewsykim wants to merge 3 commits intokubernetes:masterfrom
andrewsykim:nsx-t

Conversation

@andrewsykim
Copy link
Member

@andrewsykim andrewsykim commented Nov 3, 2019

Signed-off-by: Andrew Sy Kim kiman@vmware.com

What this PR does / why we need it:
Adds experimental support for NSX-T load balancers when a Service Type=LoadBalancer is created in a cluster using the vSphere cloud provider.

This PR is still a work in progress, some pending tasks include but are not limited to:

  • a lot more testing/validation (manual, unit tests, etc)
  • agreement on the config file changes required to enable NSX-T, for now I just put whatever I needed to get this working in the config file but I think we should discuss what makes the most sense there, there are some values that are also hardcoded (like LB Service size), we should discuss how we want them to be configurable
  • how to mark this feature as "experimental", maybe an env var gate? Ideally something not in the config file cause changing config file fields in the future is painful.

Which issue this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close that issue when PR gets merged): fixes #

Special notes for your reviewer:
For some reason listing loadbalancer resources was not available on the go-vmware-nsxt client so I had to write functions to do this using http stdlib. Opened vmware/go-vmware-nsxt#18 in hopes to get it in the client.

Release note:

add experimental support for Service Type=LoadBalancer w/ NSX-T

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Nov 3, 2019
@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign andrewsykim
You can assign the PR to them by writing /assign @andrewsykim in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. label Nov 3, 2019
@andrewsykim andrewsykim force-pushed the nsx-t branch 2 times, most recently from 4bdbe49 to b963a73 Compare November 3, 2019 03:11
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
Signed-off-by: Andrew Sy Kim <kiman@vmware.com>
@k8s-ci-robot
Copy link
Contributor

@andrewsykim: The following tests failed, say /retest to rerun them all:

Test name Commit Details Rerun command
pull-cloud-provider-vsphere-unit-test d88fa99 link /test pull-cloud-provider-vsphere-unit-test
pull-cloud-provider-vsphere-verify-staticcheck d88fa99 link /test pull-cloud-provider-vsphere-verify-staticcheck
pull-cloud-provider-vsphere-integration-test d88fa99 link /test pull-cloud-provider-vsphere-integration-test

Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@mandelsoft
Copy link
Contributor

mandelsoft commented Dec 9, 2019

Hi, we from the Gardener team had a workshop on friday with some vmware folks.
Padmaja Vrudhula asks us to review your pull request. We are currently developing the
vmware extension for the gardener (https://github.com/gardener/gardener). Therefore
we required load balancer support. We also have developed a load balancer controller
(https://github.com/MartinWeindel/nsxt-lb-provider). It is a regular cloud controller
manager only implementing the load balancer part and can run beside the
vsphere-cloud-controller-manager. It is not yet part of the gardener project,
because we wanted to finalize it first.

When looking at your pull request, we found several issues:

  • it does not handle a change of the port list of a service
    it only adds virtual servers, but does not delete them for vanished ports
  • it does not create health checks, this might be seen as superfluous, but for
    other providers this is done.
  • there is no garbage collection. It does not cleanup orphaned elements
    (for example, if a service is forcefully removed, or if there are problems
    with the controller or if it unexpectedly crashes or is aborted)
  • everything is identified by generated names. I'm not sure that 5 characters from the
    service UUID are really unique. To find the elements in the UI it might also be useful to
    additionally use the cluster name.
  • it is only possible to use one IPPool for the load balancer IPs. But it might be
    useful to offer a cluster user the possibility to choose among multiple possibilities,
    (for example a public and a private load balancer) according to preconfigured
    sets with different visibility or routing settings. We have already added this
    feature to the openstack cloud provider, which is also used for on-prem scenarios.
  • the load balancer service is not managed
  • our admin told us, that it might be helpful to add tags to the generated
    elements to be able to configure firewall rules in NSX-T.

We had implemented all this in our proposal, maybe we can combine both approaches
to get rid at all of our local project. It would be great if we could find a common solution.
If useful, we could adapt our approach to be added to the regular vSphere cloud controller
manager. (We originally decided to create an own controller manager, because it requires
NSX-T, while the vSphere cloud controller manager is purely based on vSphere.)

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Dec 9, 2019
@andrewsykim
Copy link
Member Author

Hi @mandelsoft! Sorry for the late reply, I was on paternity leave for the past few weeks! I agree it would be great if we can consolidate all implementations here so users only have to deploy 1 cloud controller manager. I think we should discuss this in the next vSphere subproject meeting. @frapposelli @dvonthenen @maplain can we add this to the next meeting agenda?

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 7, 2020
@davidvonthenen
Copy link
Contributor

Hi @mandelsoft! Sorry for the late reply, I was on paternity leave for the past few weeks! I agree it would be great if we can consolidate all implementations here so users only have to deploy 1 cloud controller manager. I think we should discuss this in the next vSphere subproject meeting. @frapposelli @dvonthenen @maplain can we add this to the next meeting agenda?

I added this to agenda for the Feb 5th meeting

@mandelsoft
Copy link
Contributor

Hi @andrewsykim , @dvonthenen, to not lose just another month, should we prepare a pull request until Feb, 5th?

@andrewsykim
Copy link
Member Author

@mandelsoft that sounds good, if you open a PR based on your work in https://github.com/MartinWeindel/nsxt-lb-provider we can close this one and collaborate there.

@mandelsoft
Copy link
Contributor

Hi @andrewsykim, I'm just working on the pull request and saw that we are still using an unmerged pull request vmware/go-vmware-nsxt#19 adding some missing rest calls for NSX-T required by our solution. May be you have the chance to trigger this internally.

@andrewsykim
Copy link
Member Author

andrewsykim commented Jan 14, 2020

@mandelsoft for now you can implement those API calls as internal functions (see listLoadBalancerVirtualServers in this PR for an example)

@davidvonthenen
Copy link
Contributor

FYI

Different protocols in the same Service definition with type=LoadBalancer - kubernetes/enhancements#1438

@fejta-bot
Copy link

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. labels Apr 14, 2020
@k8s-ci-robot
Copy link
Contributor

@andrewsykim: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andrewsykim
Copy link
Member Author

/close

@k8s-ci-robot
Copy link
Contributor

@andrewsykim: Closed this PR.

Details

In response to this:

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@andrewsykim
Copy link
Member Author

replaced by #292

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants