TCP health check removed from LB when scaling cluster #281

Open
vxav opened this issue Aug 10, 2023 · 10 comments
Assignees
Labels
bug Something isn't working

Comments

@vxav
vxav commented Aug 10, 2023

Describe the bug

When nodes are added or removed, the TCP health check is removed from the load balancer in VCD.

When the deprecated machine IP gets removed from the LB, the TCP check is removed at the same time and not recreated.

Reproduction steps

  1. Create a Service of type LoadBalancer
  2. Ensure it has the TCP health check enabled
  3. Add a replica to the machineDeployment
  4. Observe that the TCP check is no longer in the LB
    ...

Expected behavior

The TCP check should remain at all times.
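
To make the expectation concrete, here is a minimal Python sketch of the invariant. It is not the CPI's actual code: `LBPool`, `update_members`, the `health_monitors` field, the pool name, and the IP addresses are all hypothetical and chosen for illustration. The only point is that a member update should carry the existing health monitors over unchanged.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LBPool:
    """Hypothetical, simplified model of a load balancer pool."""
    name: str
    members: tuple = ()
    health_monitors: tuple = ("TCP",)


def update_members(pool: LBPool, member_ips) -> LBPool:
    """Return a copy of the pool with new members, keeping every other field."""
    # dataclasses.replace copies all fields that are not explicitly overridden,
    # so health_monitors survives the member update.
    return replace(pool, members=tuple(member_ips))


# Illustrative values only: one node is added to a two-member pool.
before = LBPool(name="example-pool", members=("10.0.0.11", "10.0.0.12"))
after = update_members(before, ["10.0.0.11", "10.0.0.12", "10.0.0.13"])
```

In this model, `after.health_monitors` is still `("TCP",)` once the third member is added, which is the behaviour this issue asks for.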

Additional context

No response

@ltimothy7
Contributor

Hi @vxav
Thank you for opening this. We have tried to reproduce this by resizing the worker nodes and control plane nodes separately, but the health check is still up. Have you been able to reproduce this issue?

@vxav
Author

vxav commented Sep 26, 2023

This is strange. I checked again and saw the same behaviour:

  • Create a Service of type LoadBalancer
  • Ensure it has the TCP health check enabled
  • Add a replica to the machineDeployment
  • Observe that the TCP check is no longer in the LB

What CPI version do you run? We're on 1.2.0

@ltimothy7
Contributor

We used CPI 1.4.0. Are you able to try 1.4.0 to see if this is fixed in the newer version?

If only 1.2.0 has the issue, we can document it as a known issue.

@ltimothy7
Contributor

@vxav Just following up on this. If there is no update, we can close the issue.

@vxav
Author

vxav commented Nov 13, 2023

@ltimothy7 Sorry for the delayed reply.
I tested with 1.4.1 and it is the same behaviour.

@vxav
Author

vxav commented Nov 13, 2023

I confirm the following behaviour:

  • Create svc type load balancer => Created with TCP health check automatically
  • Scale machineDeployment
  • VM provisioned and added to the LB Pool => TCP health check is removed.
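
One way to pin down exactly what changes is to capture the pool configuration before and after scaling and compare the two. The sketch below is a hedged illustration only: it assumes the pool settings have already been exported as plain dictionaries by whatever means is available, and the keys (`members`, `healthMonitors`) and values are made up to mirror the behaviour described above, not taken from the VCD API.

```python
def changed_fields(before: dict, after: dict) -> dict:
    """Map each key whose value differs to a (before, after) pair."""
    keys = before.keys() | after.keys()
    return {
        key: (before.get(key), after.get(key))
        for key in sorted(keys)
        if before.get(key) != after.get(key)
    }


# Made-up snapshots: a member is added and the health monitor disappears.
pool_before = {"members": ["10.0.0.11"], "healthMonitors": ["TCP"]}
pool_after = {"members": ["10.0.0.11", "10.0.0.12"], "healthMonitors": []}

diff = changed_fields(pool_before, pool_after)
print(diff)
```

A change in `members` is expected when scaling; any other key that appears in the diff, such as the health monitor list here, points at a setting that was dropped during the update.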

@ltimothy7
Contributor

Thank you @vxav
For clarity, would you please list the steps you follow to check the TCP health check? For example, are you checking at the LB pool level or at the virtual service level?

Also, are you scaling the machineDeployment by resizing a worker node pool in the Container UI plugin?

Thank you

@vxav
Author

vxav commented Nov 21, 2023

I scale the machineDeployment CR directly; we don't use the UI.

For the health check, I am indeed checking at the LB pool level.

@mnspodrska

We used the GUI and got the same result: the health check was removed after adding a new node.
Cloud Director 10.5
CSE 4.1.1a
TKG Product Version v2.2.0
Kubernetes Version v1.25.7+vmware.2
CAPVCD Version v1.1.1
CPI (Cloud Provider Interface) cloud-controller-manager 1.4.1

jjaferson pushed a commit to jjaferson/cloud-provider-for-cloud-director that referenced this issue Dec 11, 2023
Signed-off-by: Aniruddha Shamasundar <[email protected]>

@vxav
Author

vxav commented May 14, 2024

Hey @ltimothy7, any news on this?

Projects
None yet
Development

No branches or pull requests

3 participants