Skip to content

Conversation

@Karthik-K-N
Copy link
Member

@Karthik-K-N Karthik-K-N commented Mar 10, 2023

With the introduction of cluster-control-plane-machine-set-operator(CPMS operator) now we can replace the control-plane machines in openshift cluster.
Control-plance machines will be attached to two loadbalancer in case of public cluster during the cluster creation and it will be done by the installer component.
When there is change in cpms object, the cpms operator will trigger machine replacement and during this time we need to update the loadbalancer with newly created vm ip and remove the ip of machine that is going to be deleted.

The sample control-plane machine object looks like this

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
  creationTimestamp: null
  labels:
    machine.openshift.io/cluster-api-cluster: rdr-kn01032033-gjnlr
    machine.openshift.io/cluster-api-machine-role: master
    machine.openshift.io/cluster-api-machine-type: master
  name: karthik-master-1
  namespace: openshift-machine-api
spec:
  providerSpec:
    value:
      apiVersion: machine.openshift.io/v1
      credentialsSecret:
        name: powervs-credentials
      image:
        name: rhcos-rdr-kn08031125-vhwtg
        type: Name
      keyPairName: vhwtg-key
      kind: PowerVSMachineProviderConfig
      loadBalancers:
      - name: rdr-kn08031125-vhwtg-loadbalancer-int
        type: Application
      - name: rdr-kn08031125-vhwtg-loadbalancer
        type: Application
      memoryGiB: 32
      network:
        regex: ^DHCPSERVER.*rdr-kn08031125-vhwtg.*_Private$
        type: RegEx
      processorType: Shared
      processors: "0.5"
      serviceInstance:
        id: df1a3112-5067-4366-8a44-fdfcec
        type: ID
      systemType: s922
      userDataSecret:
        name: master-user-data

Proposed api changes: openshift/api#1415

Pending work

  • Unit testcases
  • Use paging mechanism

@openshift-ci openshift-ci bot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 10, 2023
@Karthik-K-N
Copy link
Member Author

@Prajyot-Parab @dharaneeshvrd Please take a look and provide your feedback.

@openshift-ci openshift-ci bot requested review from JoelSpeed and mkumatag March 10, 2023 07:52
@openshift-ci openshift-ci bot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 10, 2023
}
if !alreadyRegistered {
// make sure that LoadBalancer is in active state
loadBalancer, _, err := powerVSClient.GetLoadBalancer(&vpcv1.GetLoadBalancerOptions{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need another Get call here?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In our case a loadbalancer will have 2 pools, So after updating one pool, if we try to update second one it will fail as the loadbalancer wont be in active state, so instead of throwing ugly error added a check.

klog.Infof("Deleting IP %s from LoadBalancer %s ", internalIP, lbName)

// Fetch the LoadBalancer details
loadBalancer, _, err := powerVSClient.GetLoadBalancer(&vpcv1.GetLoadBalancerOptions{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as previous comment

@Karthik-K-N
Copy link
Member Author

@dharaneeshvrd @Prajyot-Parab Please take a look when you get a chance, If the approach looks good I will go ahead and fix CI errors and add UT.


internalIP := r.getMachineInternalIP()
if internalIP != "" {
klog.Infof("deregister ip %s of machine %s form LoadBalancer", internalIP, r.machine.Name)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor Nit:

Suggested change
klog.Infof("deregister ip %s of machine %s form LoadBalancer", internalIP, r.machine.Name)
klog.Infof("deregister ip %s of machine %s from LoadBalancer", internalIP, r.machine.Name)

Copy link
Contributor

@JoelSpeed JoelSpeed left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have any feedback here apart from styling nits, I'd suggested always wrapping errors with additional context where possible, it tends to help with the flow of errors when debugging IMO, I'll leave the call to this repos maintainers though

internalIP := r.getMachineInternalIP()
if internalIP != "" {
klog.Infof("deregister ip %s of machine %s form LoadBalancer", internalIP, r.machine.Name)
if err = r.removeFromApplicationLoadBalancers(internalIP); err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: no need to shadow the err var when you do this inline if style

Suggested change
if err = r.removeFromApplicationLoadBalancers(internalIP); err != nil {
if err := r.removeFromApplicationLoadBalancers(internalIP); err != nil {

Comment on lines 494 to 495
err := registerWithApplicationLoadBalancers(r.powerVSClient, applicationLoadBalancerNames, internalIP)
if err != nil {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: should inline this

Suggested change
err := registerWithApplicationLoadBalancers(r.powerVSClient, applicationLoadBalancerNames, internalIP)
if err != nil {
if err := registerWithApplicationLoadBalancers(r.powerVSClient, applicationLoadBalancerNames, internalIP); err != nil {

err := registerWithApplicationLoadBalancers(r.powerVSClient, applicationLoadBalancerNames, internalIP)
if err != nil {
klog.Errorf("%s: Failed to register application load balancers: %v", r.machine.Name, err)
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should wrap this error with additional context before returning

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Karthik-K-N please take care of this comment..

func registerWithApplicationLoadBalancers(powerVSClient client.Client, loadBalancerNames []string, internalIP string) error {
lbMap, err := getLoadBalancers(powerVSClient, loadBalancerNames)
if err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should wrap this error with additional context before returning

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

}

if err := utils.PagingHelper(f); err != nil {
return nil, err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should wrap the error

}

if loadBalancersList == nil {
return nil, fmt.Errorf("no loadbalancer is retrieved")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No format string here

Suggested change
return nil, fmt.Errorf("no loadbalancer is retrieved")
return nil, errors.New("no loadbalancer is retrieved")

ID: &lbOptions.id,
})
if err != nil {
return err
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should add context

})
response, _, err := powerVSClient.CreateLoadBalancerPoolMember(options)
if err != nil {
return errors.Wrapf(err, "error creating LoadBalacner pool member")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use fmt.Errorf for consistency with the rest of this PR

@mkumatag
Copy link
Member

mkumatag commented Apr 6, 2023

I don't have any feedback here apart from styling nits, I'd suggested always wrapping errors with additional context where possible, it tends to help with the flow of errors when debugging IMO, I'll leave the call to this repos maintainers though

Agree, @Karthik-K-N lets fix this wherever possible as a separate PR.

@Karthik-K-N Karthik-K-N force-pushed the loadbalancer-support branch from 2bc3202 to 24d105c Compare April 6, 2023 10:50
@Karthik-K-N Karthik-K-N changed the title [WIP][DNM] Loadbalancer integration support for control plane machines Loadbalancer integration support for control plane machines Apr 6, 2023
@openshift-ci openshift-ci bot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 6, 2023
}
}
if len(lbMap) != len(loadBalancerNames) {
return nil, fmt.Errorf("not able to find %s loadbalancer in cloud", loadBalancerNames)
Copy link
Member

@dharaneeshvrd dharaneeshvrd Apr 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a possibility that this will have load balancers that are exist too in case of more than one load balancer is on search.

Copy link
Member

@dharaneeshvrd dharaneeshvrd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks good to me!

Copy link
Member

@Prajyot-Parab Prajyot-Parab left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

c.ImageClient = instance.NewIBMPIImageClient(ctx, c.session, cloudInstanceID)
c.DHCPClient = instance.NewIBMPIDhcpClient(ctx, c.session, cloudInstanceID)

vpcZone, err := utils.VPCRegionForPowerVSRegion(c.region)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be a region

Suggested change
vpcZone, err := utils.VPCRegionForPowerVSRegion(c.region)
vpcRegion, err := utils.VPCRegionForPowerVSRegion(c.region)

go.mod Outdated
Comment on lines 30 to 33
require (
github.com/IBM/vpc-go-sdk v0.35.0
github.com/onsi/ginkgo/v2 v2.8.1
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any specific reason for creating a different block? if not - can this be merged with above block?

err := registerWithApplicationLoadBalancers(r.powerVSClient, applicationLoadBalancerNames, internalIP)
if err != nil {
klog.Errorf("%s: Failed to register application load balancers: %v", r.machine.Name, err)
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Karthik-K-N please take care of this comment..

func registerWithApplicationLoadBalancers(powerVSClient client.Client, loadBalancerNames []string, internalIP string) error {
lbMap, err := getLoadBalancers(powerVSClient, loadBalancerNames)
if err != nil {
return err
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

@JoelSpeed
Copy link
Contributor

@Karthik-K-N Just wanted to check if you need any additional assistance or review here from me

@Karthik-K-N
Copy link
Member Author

@Karthik-K-N Just wanted to check if you need any additional assistance or review here from me

Thank you, I tried to address all the review comments and its ready for next round of review.

@JoelSpeed
Copy link
Contributor

General structure LGTM, will defer to @mkumatag to apply the label when they're happy with this

@Karthik-K-N Karthik-K-N requested a review from mkumatag May 17, 2023 06:27
@mkumatag
Copy link
Member

@Karthik-K-N overall lgtm, can you squash the commits and split into, one for go mods and another for code changes?

@Karthik-K-N Karthik-K-N force-pushed the loadbalancer-support branch 2 times, most recently from 046fe75 to a642495 Compare May 18, 2023 05:25
@mkumatag
Copy link
Member

@Karthik-K-N but I see Add loadbalancer pool logic contains the go mod and the vendor files.

image

@Karthik-K-N Karthik-K-N force-pushed the loadbalancer-support branch from a642495 to c3d71ca Compare May 19, 2023 03:24
@mkumatag
Copy link
Member

mkumatag commented Jun 7, 2023

/lgtm

@Karthik-K-N lets create a jira ticket if doesn't exist and map to this issue

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Jun 7, 2023
@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 7, 2023

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: Karthik-K-N, mkumatag

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Details Needs approval from an approver in each of these files:
  • OWNERS [Karthik-K-N,mkumatag]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@Karthik-K-N Karthik-K-N changed the title Loadbalancer integration support for control plane machines MULTIARCH-3667: Loadbalancer integration support for control plane machines Jun 7, 2023
@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Jun 7, 2023
@openshift-ci-robot
Copy link

openshift-ci-robot commented Jun 7, 2023

@Karthik-K-N: This pull request references MULTIARCH-3667 which is a valid jira issue.

Details

In response to this:

With the introduction of cluster-control-plane-machine-set-operator(CPMS operator) now we can replace the control-plane machines in openshift cluster.
Control-plance machines will be attached to two loadbalancer in case of public cluster during the cluster creation and it will be done by the installer component.
When there is change in cpms object, the cpms operator will trigger machine replacement and during this time we need to update the loadbalancer with newly created vm ip and remove the ip of machine that is going to be deleted.

The sample control-plane machine object looks like this

apiVersion: machine.openshift.io/v1beta1
kind: Machine
metadata:
 creationTimestamp: null
 labels:
   machine.openshift.io/cluster-api-cluster: rdr-kn01032033-gjnlr
   machine.openshift.io/cluster-api-machine-role: master
   machine.openshift.io/cluster-api-machine-type: master
 name: karthik-master-1
 namespace: openshift-machine-api
spec:
 providerSpec:
   value:
     apiVersion: machine.openshift.io/v1
     credentialsSecret:
       name: powervs-credentials
     image:
       name: rhcos-rdr-kn08031125-vhwtg
       type: Name
     keyPairName: vhwtg-key
     kind: PowerVSMachineProviderConfig
     loadBalancers:
     - name: rdr-kn08031125-vhwtg-loadbalancer-int
       type: Application
     - name: rdr-kn08031125-vhwtg-loadbalancer
       type: Application
     memoryGiB: 32
     network:
       regex: ^DHCPSERVER.*rdr-kn08031125-vhwtg.*_Private$
       type: RegEx
     processorType: Shared
     processors: "0.5"
     serviceInstance:
       id: df1a3112-5067-4366-8a44-fdfcec
       type: ID
     systemType: s922
     userDataSecret:
       name: master-user-data

Proposed api changes: openshift/api#1415

Pending work

  • Unit testcases
  • Use paging mechanism

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@openshift-ci
Copy link
Contributor

openshift-ci bot commented Jun 19, 2023

@Karthik-K-N: all tests passed!

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. I understand the commands that are listed here.

@openshift-merge-robot openshift-merge-robot merged commit 68477f0 into openshift:main Jun 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

approved Indicates a PR has been approved by an approver from all required OWNERS files. jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

7 participants