
Support for external infra clusters #39

Merged
k8s-ci-robot merged 5 commits into kubernetes-sigs:main from agradouski:main on Dec 20, 2021


@agradouski
Contributor

What this PR does / why we need it:
Currently, there is an assumption that both management resources and the underlying infrastructure resources (KubeVirt VMs) are created on a single cluster. In other words, the management cluster is the same as the infrastructure cluster.

This change adds support for external infrastructure clusters. The management cluster, which maintains only the management resources for tenant clusters, can now be decoupled from the cluster that hosts the actual infrastructure for those tenant clusters.

This is achieved by injecting an external cluster's kubeconfig into the management cluster and referencing it in the KubevirtCluster resource. If that kubeconfig is provided, the controller uses it and creates all infra resources (VMs) in the external cluster.

This change is backward compatible: if infraClusterSecretRef is not provided in the KubevirtCluster's spec, the management cluster itself is used for tenant cluster infra.

Note: in order to use an external cluster as infra, connectivity needs to be ensured between the management cluster and that infra cluster.
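
The fallback described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `SecretRef`, the trimmed `KubevirtClusterSpec`, `selectInfraKubeconfig`, the in-memory `secrets` map, and the namespace and secret names are all hypothetical stand-ins. Only the `infraClusterSecretRef` field name comes from the description.

```go
package main

import (
	"errors"
	"fmt"
)

// SecretRef is a hypothetical, trimmed-down reference to a Secret that
// holds the kubeconfig of an external infra cluster.
type SecretRef struct {
	Namespace string
	Name      string
}

// KubevirtClusterSpec mirrors only the field discussed here.
// A nil InfraClusterSecretRef means "use the management cluster".
type KubevirtClusterSpec struct {
	InfraClusterSecretRef *SecretRef `json:"infraClusterSecretRef,omitempty"`
}

// selectInfraKubeconfig returns the kubeconfig to use for infra resources and
// reports whether an external infra cluster was selected. The secrets map
// stands in for a Secret lookup against the management cluster.
func selectInfraKubeconfig(spec KubevirtClusterSpec, secrets map[SecretRef][]byte) ([]byte, bool, error) {
	if spec.InfraClusterSecretRef == nil {
		// Backward-compatible path: no reference, so infra lives on the same cluster.
		return nil, false, nil
	}
	data, ok := secrets[*spec.InfraClusterSecretRef]
	if !ok {
		return nil, false, errors.New("infra cluster secret not found")
	}
	return data, true, nil
}

func main() {
	// Placeholder names, chosen for illustration only.
	ref := SecretRef{Namespace: "capk-system", Name: "external-infra-kubeconfig"}
	secrets := map[SecretRef][]byte{ref: []byte("<kubeconfig bytes>")}

	_, external, _ := selectInfraKubeconfig(KubevirtClusterSpec{}, secrets)
	fmt.Println("external:", external) // prints "external: false"

	_, external, _ = selectInfraKubeconfig(KubevirtClusterSpec{InfraClusterSecretRef: &ref}, secrets)
	fmt.Println("external:", external) // prints "external: true"
}
```

Using a pointer for the reference keeps "not set" distinguishable from an empty value, which is what makes the same-cluster default possible.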

Which issue this PR fixes: fixes #32

Special notes for your reviewer:

  • a new example template shows how an external cluster can be referenced

Release notes:

@linux-foundation-easycla

linux-foundation-easycla bot commented Dec 10, 2021

CLA Signed

The committers are authorized under a signed CLA.

@k8s-ci-robot added the cncf-cla: yes label (indicates the PR's author has signed the CNCF CLA) on Dec 10, 2021
@k8s-ci-robot added the approved (indicates a PR has been approved by an approver from all required OWNERS files) and size/XL (denotes a PR that changes 500-999 lines, ignoring generated files) labels on Dec 10, 2021
- add InfraClusterSecretRef to KubevirtClusterSpec
- add example template
- add InfraCluster construct to encapsulate infra cluster client and context
- move cluster management resource cleanup logic to `reconcileDelete` func in controllers
- enhance logging

Signed-off-by: Alex Gradouski <agradouski@apple.com>
Contributor

@davidvossel left a comment


Looks pretty good. I have one high-level comment about how to differentiate the KubeVirtMachine's namespace from the underlying KubeVirt VM's namespace on the external cluster (a more detailed comment is in-line in the review).

The suggestion of using a VMITemplate in our API might also help us in the future if we ever need to set special annotations/labels on the VMI that are independent of the KubeVirtMachine.

Comment on lines +174 to +182
if ctx.KubevirtMachine.Spec.ProviderID != nil {
        ctx.Logger.Info("KubevirtMachine.Spec.ProviderID is set -- VM provisioning complete!")
        // ensure ready state is set.
        // This is required after move, because status is not moved to the target cluster.
        ctx.KubevirtMachine.Status.Ready = true
        conditions.MarkTrue(ctx.KubevirtMachine, infrav1.VMProvisionedCondition)
        return ctrl.Result{}, nil
} else {
        ctx.Logger.Info("KubevirtMachine.Spec.ProviderID is not set -- will work on bootstrapping machine...")
Contributor


This change shouldn't be necessary, right? I think ctx.KubevirtMachine.Status.Ready will only be true once providerId and the VMProvisionedCondition are set.

ctx.Logger.Info("Waiting for the Bootstrap provider controller to set bootstrap data")
infraClusterClient, infraClusterNamespace, err := r.InfraCluster.GenerateInfraClusterClient(ctx.ClusterContext())
if err != nil {
        ctx.Logger.Error(err, "Infra cluster client is NOT available.")
Contributor


Should this return an error in addition to logging the message?

I get that the next line will most likely cause the function to return when infraClusterClient == nil. It just catches my eye when we attempt to use the results of a function that returned an error.
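
A minimal sketch of the pattern being suggested: log the failure, then return the error before touching any other result of the call. Everything here is a hypothetical stand-in for the controller code: `Result` stands in for controller-runtime's `ctrl.Result`, and `infraClient`, `generateInfraClusterClient`, and `reconcileNormal` are invented for illustration.

```go
package main

import (
	"errors"
	"fmt"
	"log"
)

// Result is a stand-in for controller-runtime's ctrl.Result.
type Result struct{}

// infraClient is a hypothetical placeholder for a Kubernetes client.
type infraClient struct{ namespace string }

// generateInfraClusterClient is a hypothetical stand-in for the real call.
// It fails when fail is true so that both paths can be exercised.
func generateInfraClusterClient(fail bool) (*infraClient, string, error) {
	if fail {
		return nil, "", errors.New("kubeconfig secret not found")
	}
	return &infraClient{namespace: "tenant-vms"}, "tenant-vms", nil
}

// reconcileNormal logs and returns the error instead of falling through
// with a nil client and an empty namespace.
func reconcileNormal(fail bool) (Result, error) {
	client, namespace, err := generateInfraClusterClient(fail)
	if err != nil {
		log.Printf("infra cluster client is not available: %v", err)
		return Result{}, fmt.Errorf("generating infra cluster client: %w", err)
	}
	fmt.Printf("checking for VM in namespace %q (client namespace %q)\n", namespace, client.namespace)
	return Result{}, nil
}

func main() {
	if _, err := reconcileNormal(true); err != nil {
		fmt.Println("reconcile failed:", err)
	}
	if _, err := reconcileNormal(false); err == nil {
		fmt.Println("reconcile succeeded")
	}
}
```

With controller-runtime, a non-nil error returned from a reconciler requeues the request with backoff, so returning the error also gives a retry.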

ctx.Logger.Info("Checking if VM already exists...")
// Create a helper for managing the KubeVirt VM hosting the machine.
- externalMachine, err := kubevirthandler.NewMachine(ctx, r.Client)
+ externalMachine, err := kubevirthandler.NewMachine(ctx, infraClusterClient, infraClusterNamespace)
Contributor


Right now, the namespace for the external KubeVirt VMs is inferred by introspecting the kubeconfig data for the external config. I'm unsure about that.

I think the issue we're trying to solve here is how to express the location of the underlying KubeVirt VM independently of the KubeVirtMachine's namespace.

One way to do this is to modify our KubeVirtMachine API to use a VMI template instead of a VMI Spec.

So this:

// KubevirtMachineSpec defines the desired state of KubevirtMachine.
type KubevirtMachineSpec struct {
        VMSpec kubevirtv1.VirtualMachineInstanceSpec `json:"vmSpec,omitempty"`

Could change to this:

// KubevirtMachineSpec defines the desired state of KubevirtMachine.
type KubevirtMachineSpec struct {
        VMITemplate kubevirtv1.VirtualMachineInstanceTemplateSpec `json:"vmiTemplate,omitempty"`

The advantage here is that the VMITemplate object has an ObjectMeta field, where we can specify labels, annotations, and even the namespace of the VMI independently of the KubeVirtMachine.
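
To illustrate how template metadata could carry the VMI's namespace, here is a rough sketch. `ObjectMeta` and `VMITemplate` are trimmed stand-ins for `metav1.ObjectMeta` and `kubevirtv1.VirtualMachineInstanceTemplateSpec`. The `vmiNamespace` helper and its fallback to the KubevirtMachine's own namespace are assumptions made for this example, not something this PR or the API defines.

```go
package main

import "fmt"

// ObjectMeta is a trimmed stand-in for metav1.ObjectMeta.
type ObjectMeta struct {
	Namespace   string
	Labels      map[string]string
	Annotations map[string]string
}

// VMITemplate is a stand-in for kubevirtv1.VirtualMachineInstanceTemplateSpec,
// keeping only the metadata half (the VMI spec is omitted).
type VMITemplate struct {
	ObjectMeta ObjectMeta
}

// vmiNamespace is a hypothetical helper: prefer the namespace set on the
// template, otherwise fall back to the KubevirtMachine's own namespace.
func vmiNamespace(machineNamespace string, tmpl VMITemplate) string {
	if tmpl.ObjectMeta.Namespace != "" {
		return tmpl.ObjectMeta.Namespace
	}
	return machineNamespace
}

func main() {
	// No namespace on the template: the machine's namespace is used.
	fmt.Println(vmiNamespace("capi-mgmt", VMITemplate{})) // prints "capi-mgmt"

	// Namespace set on the template: the VMI lands there instead.
	tmpl := VMITemplate{ObjectMeta: ObjectMeta{Namespace: "tenant-vms"}}
	fmt.Println(vmiNamespace("capi-mgmt", tmpl)) // prints "tenant-vms"
}
```

With something like this, the namespace would no longer need to be inferred from the kubeconfig.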

func (r *KubevirtMachineReconciler) reconcileDelete(ctx *context.MachineContext) (ctrl.Result, error) {
        infraClusterClient, infraClusterNamespace, err := r.InfraCluster.GenerateInfraClusterClient(ctx.ClusterContext())
        if err != nil {
                ctx.Logger.Error(err, "Infra cluster client is not available.")
Contributor


Seems like we should return an error here.


ctx.Logger.Info("Deleting VM bootstrap secret...")
if err := r.deleteKubevirtBootstrapSecret(ctx, infraClusterClient, infraClusterNamespace); err != nil {
        ctx.Logger.Error(err, "Failed to delete bootstrap secret.")
Contributor


Another place where a return may be needed?

@agradouski
Contributor Author

Thanks for your review, @davidvossel!

Regarding the namespace for KubeVirt VMs: completely agree. This is a bit of a hack for the time being that I was looking to address in a future PR. In my mind it's blocked on #11, in line with your thinking. We need that PR to land in order to fix the namespace issue here, as well as another internal issue that requires VMI labels to be set via the template (today that's not possible without the VMI -> VM spec update).

Completely agree with the other comments regarding return statements; working on the fix.

- return on errors, with retries

Signed-off-by: Alex Gradouski <agradouski@apple.com>
@k8s-ci-robot added the needs-rebase label (indicates a PR cannot be merged because it has merge conflicts with HEAD) on Dec 11, 2021
@davidvossel
Contributor

> regarding the namespace for KubeVirt VMs, completely agree. this is a bit of a hack for the time being, that I was looking to address in a future PR. in my mind it's blocked on #11, aligned with your thinking. we need that PR to land, in order to fix the issue with the namespace here

@agradouski yep, makes total sense.

While #11 is being worked on, how do you feel about this external cluster PR? Do you think we should wait for #11 to land, or move forward with the external cluster work independently (sorting out the namespace issue later on)?

I think this PR is fine as long as we're proceeding with the understanding that the namespace of the external VMs will be impacted by #11 later on, likely in a backwards incompatible way.

@agradouski
Contributor Author

@davidvossel yes, I'm okay with proceeding with external cluster work independently of #11 for now, with the understanding that it will need to be updated with the follow-up PR.

Created an issue to address that in the future: #51

Signed-off-by: Alex Gradouski <agradouski@apple.com>
@k8s-ci-robot added the size/XXL label (denotes a PR that changes 1000+ lines, ignoring generated files) and removed the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) on Dec 18, 2021
@k8s-ci-robot added the size/XL label (denotes a PR that changes 500-999 lines, ignoring generated files) and removed the needs-rebase (indicates a PR cannot be merged because it has merge conflicts with HEAD) and size/XXL (denotes a PR that changes 1000+ lines, ignoring generated files) labels on Dec 18, 2021
Signed-off-by: Alex Gradouski <agradouski@apple.com>
Contributor

@cchengleo left a comment


/lgtm

@k8s-ci-robot added the lgtm label ("Looks good to me", indicates that a PR is ready to be merged) on Dec 20, 2021
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: agradouski, cchengleo

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:
  • OWNERS [agradouski,cchengleo]

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot merged commit 5dfba0e into kubernetes-sigs:main on Dec 20, 2021


Development

Successfully merging this pull request may close these issues.

Support for external infrastructure clusters
