HOSTEDCP-1984: Refactor capi logic out from NodePool controller #4795

Merged
merged 1 commit into from
Sep 26, 2024

enxebre
Member

@enxebre enxebre commented Sep 25, 2024

This moves the CAPI-related logic into its own file and adds test coverage, particularly for the Reconcile function. Further refactoring of the business logic itself is intentionally left out for now to contain the scope of the refactor and avoid backward-compatibility issues.

➜  hypershift git:(nodepool-refactor-capi) ✗ go tool cover -func=coverage.out | grep capi
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:45:			newCAPI						100.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:60:			Reconcile					65.2%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:175:			cleanupMachineTemplates				78.1%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:235:			deleteMachineDeployment				30.8%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:256:			pauseMachineDeployment				70.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:274:			deleteMachineSet				30.8%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:295:			Pause						60.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:309:			pauseMachineSet					70.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:329:			deleteMachineHealthCheck			30.8%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:350:			reconcileMachineDeployment			72.2%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:571:			taintsToJSON					75.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:580:			reconcileMachineHealthCheck			96.4%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:662:			setMachineDeploymentReplicas			100.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:695:			machineTemplateBuilders				26.3%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:852:			generateMachineTemplateName			100.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:859:			reconcileMachineSet				0.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1112:			machineSetInPlaceRolloutIsComplete		0.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1118:			setMachineSetReplicas				78.6%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1148:			getInPlaceMaxUnavailable			90.9%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1166:			machineDeployment				100.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1175:			machineSet					100.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1184:			machineHealthCheck				100.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1195:			generateName					100.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1204:			getName						66.7%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1230:			max						0.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1238:			min						0.0%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1245:			listMachineTemplates				59.3%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1306:			ensureMachineDeletion				66.7%
github.com/openshift/hypershift/hypershift-operator/controllers/nodepool/capi.go:1321:			getMachinesForNodePool				90.0%

What this PR does / why we need it:

Which issue(s) this PR fixes (optional, use fixes #<issue_number>(, fixes #<issue_number>, ...) format, where issue_number might be a GitHub issue, or a Jira story:
Fixes #

Checklist

  • Subject and description added to both, commit and PR.
  • Relevant issues have been referenced.
  • This change includes docs.
  • This change includes unit tests.

@openshift-ci openshift-ci bot added the area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release label Sep 25, 2024
Contributor

openshift-ci bot commented Sep 25, 2024

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: enxebre

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci openshift-ci bot added approved Indicates a PR has been approved by an approver from all required OWNERS files. and removed do-not-merge/needs-area labels Sep 25, 2024
if err != nil {
return err
}
if result, err := c.CreateOrUpdate(ctx, c.Client, template, func() error {
Contributor

@muraee muraee Sep 25, 2024

I think we are introducing tight coupling with the Token struct by using its client, createOrUpdate, nodePool, etc. fields.
Should we explicitly pass those options to the Reconcile() function or add them as fields to CAPI?

I would imagine the Token object we pass to CAPI to just be an interface exposing the functions CAPI needs.

Member Author

I agree, I'd expect that's exactly the end state (fields in capi + interface), and we could even consider separate packages so unexported fields are not accessible. I chose coupling for simplicity for now: the fields are accessible within the same package anyway, and otherwise you would end up with "duplicated" fields, which is also confusing. Any follow-up refinement in that direction can be done now with confidence since we have decent coverage.
Added a TODO
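
For illustration, the end state discussed in this thread (dependencies held as fields on the CAPI type, plus a narrow interface over the token) could look roughly like the sketch below. Every name here (`userDataProvider`, `capiReconciler`, `newCAPIReconciler`, `fakeToken`, and the two methods) is hypothetical and chosen for the example; none of it is the actual HyperShift code.

```go
package main

import (
	"errors"
	"fmt"
)

// userDataProvider is a hypothetical narrow interface that lists only what the
// CAPI logic needs from the token. The method names are assumptions.
type userDataProvider interface {
	UserDataSecretName() string
	Version() string
}

// capiReconciler (hypothetical) keeps its dependencies as its own fields
// instead of reaching into the token's unexported fields.
type capiReconciler struct {
	provider    userDataProvider
	clusterName string
}

// newCAPIReconciler validates its inputs and wires the dependencies.
func newCAPIReconciler(p userDataProvider, clusterName string) (*capiReconciler, error) {
	if p == nil {
		return nil, errors.New("user data provider is required")
	}
	return &capiReconciler{provider: p, clusterName: clusterName}, nil
}

// describe only touches the provider through the interface.
func (c *capiReconciler) describe() string {
	return fmt.Sprintf("cluster=%s secret=%s version=%s",
		c.clusterName, c.provider.UserDataSecretName(), c.provider.Version())
}

// fakeToken is a test double that satisfies userDataProvider.
type fakeToken struct{ secret, version string }

func (f fakeToken) UserDataSecretName() string { return f.secret }
func (f fakeToken) Version() string            { return f.version }

func main() {
	c, err := newCAPIReconciler(fakeToken{secret: "user-data-abc", version: "4.18.0"}, "example")
	if err != nil {
		panic(err)
	}
	fmt.Println(c.describe())
}
```

With this shape, a unit test can pass a small fake instead of building a full token, which is the decoupling the reviewer is asking about.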

}
return fmt.Errorf("error getting MachineHealthCheck: %w", err)
}
if mhc.DeletionTimestamp != nil {
Contributor

Why do we need to fetch the object and check the DeletionTimestamp first? Isn't Delete a no-op if the object is already deleting?

Member Author

@enxebre enxebre Sep 25, 2024

This is just the business logic of the original function. I'd prefer to keep the scope of this PR small to reduce the risk of regression and let any follow-up iteration be test-driven now that we have decent coverage.
Added a TODO

// TODO(Alberto): drop this and rely on core in-place propagation once CAPI 1.4.0 https://github.com/kubernetes-sigs/cluster-api/releases comes through the payload.
// https://issues.redhat.com/browse/HOSTEDCP-971
machineList := &capiv1.MachineList{}
if err := c.List(context.TODO(), machineList, client.InNamespace(machineDeployment.Namespace)); err != nil {
Contributor

I think we should pass a proper context here instead of context.TODO()

machine.Labels = make(map[string]string)
}

if result, err := controllerutil.CreateOrPatch(context.TODO(), c.Client, &machine, func() error {
Contributor

@muraee muraee Sep 25, 2024

I think we should pass a proper context here instead of context.TODO()


netlify bot commented Sep 25, 2024

Deploy Preview for hypershift-docs ready!

Name Link
🔨 Latest commit ab740d1
🔍 Latest deploy log https://app.netlify.com/sites/hypershift-docs/deploys/66f3ec0db7f77a0008736074
😎 Deploy Preview https://deploy-preview-4795--hypershift-docs.netlify.app

if nodePool.Spec.Platform.AWS.AMI != "" {
ami = nodePool.Spec.Platform.AWS.AMI
} else {
// TODO: Should the region be included in the NodePool platform information?
Member

Do we still need this TODO?

Member Author

Yes, I don't want to couple dropping that with this PR. FWIW, this is just moving existing code around, intentionally modifying as little as possible in this PR and increasing coverage to an acceptable bar to enable further changes.

// Check if platform machine template needs to be updated.
targetMachineTemplate := template.GetName()
if isUpdatingMachineTemplate(nodePool, targetMachineTemplate) {
// TODO (alberto): decouple all conditions handling from this file into nodepool_controller.go dedicated function.
Member

We should tag Jira tickets to TODOs in the code so we don't lose track of items like this.

// Set defaults. These are normally set by the CAPI machinedeployment webhook.
// However, since we don't run the webhook, CAPI updates the machinedeployment
// after it has been created with defaults.
machineDeployment.Spec.MinReadySeconds = k8sutilspointer.Int32(0)
Member

k8sutilspointer.Int32 is deprecated. These should be updated to use ptr.To[int32]
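
To make the suggested replacement concrete: `ptr.To` in `k8s.io/utils/ptr` is a generic helper that returns a pointer to its argument. The sketch below defines a local `To` with the same shape so that it runs without the dependency. `machineDeploymentSpec` is a hypothetical stand-in for the real CAPI type.

```go
package main

import "fmt"

// To mirrors the shape of ptr.To from k8s.io/utils/ptr: it returns a pointer
// to a copy of v. It is defined locally only to keep the sketch self-contained.
func To[T any](v T) *T {
	return &v
}

// machineDeploymentSpec is a hypothetical stand-in with one optional field.
type machineDeploymentSpec struct {
	MinReadySeconds *int32
}

func main() {
	spec := machineDeploymentSpec{}
	// Equivalent in intent to k8sutilspointer.Int32(0); the explicit type
	// argument keeps the untyped constant from defaulting to int.
	spec.MinReadySeconds = To[int32](0)
	fmt.Println(*spec.MinReadySeconds)
}
```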

Kind: gvk.Kind,
APIVersion: gvk.GroupVersion().String(),
Namespace: machineTemplateCR.GetNamespace(),
// keep current tempalte name for later check.
Member

Suggested change
// keep current tempalte name for later check.
// keep current template name for later check.

targetVersion := c.Version()
targetConfigHash := c.HashWithoutVersion()
targetConfigVersionHash := c.Hash()
if userDataSecret.Name != k8sutilspointer.StringDeref(machineDeployment.Spec.Template.Spec.Bootstrap.DataSecretName, "") {
Member

StringDeref is also deprecated. Should use ptr.Deref
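
Likewise, `ptr.Deref` in `k8s.io/utils/ptr` returns the pointed-to value, or a default when the pointer is nil. Below is a local sketch of the same shape applied to a secret-name comparison like the one under review; `secretNameChanged` is a hypothetical helper introduced for the example.

```go
package main

import "fmt"

// Deref mirrors the shape of ptr.Deref from k8s.io/utils/ptr: it returns *p,
// or def when p is nil. It is defined locally to keep the sketch self-contained.
func Deref[T any](p *T, def T) T {
	if p != nil {
		return *p
	}
	return def
}

// secretNameChanged (hypothetical) reports whether the current secret name
// differs from the optionally recorded one. A nil pointer is read as "".
func secretNameChanged(current string, recorded *string) bool {
	return current != Deref(recorded, "")
}

func main() {
	recorded := "user-data-old"
	fmt.Println(secretNameChanged("user-data-new", &recorded))
	fmt.Println(secretNameChanged("user-data-new", nil))
}
```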

}

func deleteMachineSet(ctx context.Context, c client.Client, ms *capiv1.MachineSet) error {
// TODO(alberto): why do we need to fetch the object and check the DeletionTimestamp first?
Member

TODO should link to a Jira ticket

}

func deleteMachineHealthCheck(ctx context.Context, c client.Client, mhc *capiv1.MachineHealthCheck) error {
// TODO(alberto): why do we need to fetch the object and check the DeletionTimestamp first?
Member

TODO should link to a Jira ticket

}
}

// TODO (alberto) drop this deterministic naming logic and get the name for child MachineDeployment from the status/annotation/label?
Member

TODO should link to a Jira ticket

return filtered, nil
}

// TODO (alberto): Let the all the deletion logic be a capi func.
Member

TODO should link to a Jira ticket

}

func TestMachineTemplateBuildersPreexisting(t *testing.T) {
//RunTestMachineTemplateBuilders(t, true)
Member

Does this need to be uncommented?

@enxebre
Member Author

enxebre commented Sep 25, 2024

TODO should link to a Jira ticket

Fleshing out all the follow-ups for this series of refactors is a task of its own, not to be done in this PR. The TODOs are placeholders for that task.

@enxebre
Member Author

enxebre commented Sep 25, 2024

/retest

@muraee
Contributor

muraee commented Sep 25, 2024

/lgtm

@openshift-ci openshift-ci bot added the lgtm Indicates that a PR is ready to be merged. label Sep 25, 2024
@enxebre enxebre changed the title Refactor capi logic out from NodePool controller HOSTEDCP-1984: Refactor capi logic out from NodePool controller Sep 26, 2024
@openshift-ci-robot

openshift-ci-robot commented Sep 26, 2024

@enxebre: This pull request references HOSTEDCP-1984 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the story to target the "4.18.0" version, but no target version was set.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot openshift-ci-robot added the jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. label Sep 26, 2024
@enxebre
Member Author

enxebre commented Sep 26, 2024

/jira refresh

@openshift-ci-robot

/retest-required

Remaining retests: 0 against base HEAD 5ad3d23 and 2 for PR HEAD d965bf9 in total

Contributor

openshift-ci bot commented Sep 26, 2024

@enxebre: all tests passed!

Full PR test history. Your PR dashboard.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@openshift-merge-bot openshift-merge-bot bot merged commit 238c6ac into openshift:main Sep 26, 2024
15 checks passed
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/hypershift-operator Indicates the PR includes changes for the hypershift operator and API - outside an OCP release jira/valid-reference Indicates that this PR references a valid Jira ticket of any type. lgtm Indicates that a PR is ready to be merged.
4 participants