Contributor

avishayt commented Dec 7, 2021

The Agent platform uses cluster-api-provider-agent to match existing Agent resources with Machines. This is a platform-agnostic way of adding workers.

netlify bot commented Dec 7, 2021

✔️ Deploy Preview for hypershift-docs ready!

🔨 Explore the source changes: 6a13985

🔍 Inspect the deploy log: https://app.netlify.com/sites/hypershift-docs/deploys/61b841b3e78c10000979fb44

😎 Browse the preview: https://deploy-preview-760--hypershift-docs.netlify.app

## explicit
github.com/docker/libtrust
# github.com/eranco74/cluster-api-provider-agent/api v0.0.0-20211202085429-b633556695e5
## explicit; go 1.16
Contributor

#703 bumped to 1.17 - does this need updating?

Contributor

To my understanding, this means that this cluster-api-provider-agent version specifies go 1.16 in its go.mod.
It shouldn't affect anything (see the other packages below with explicit older Go versions).
We will bump cluster-api-provider-agent to Go 1.17 and vendor it again once we have a functionality update.

</td>
</tr><tr><td><p>&#34;Agent&#34;</p></td>
<td><p>AgentPlatform represents user supplied insfrastructure booted with agents.</p>
</td>
Contributor

We probably need some docs somewhere that describe the management cluster dependencies that allow the agent provider to work (I'm not sure exactly where, but without prior knowledge this doc isn't self-explanatory like the cloud provider platforms)

Contributor Author

You're right, I opened a task on that

Contributor

alvaroaleman left a comment

At least some docs on how to use this and ideally an e2e test are also needed

return "", fmt.Errorf("unable to fetch node objects: %w", err)
}
if len(nodes.Items) < 1 {
return "", fmt.Errorf("no node objects found: %w", err)
Contributor

The err here is always nil, as you return before if it's non-nil.

Contributor Author

This code was moved from the None platform as-is for reuse by the Agent platform; I'd rather not change it in this PR.

Member

> The err here is always nil, as you return before if it's non-nil

Why, though? Would .List throw an error if nodes.Items has length 0? If so, +1 to addressing it separately and keeping this PR focused on the multi-platform structure.

},
},
}
specHash := hashStruct(agentMachineTemplate.Spec.Template.Spec)
Contributor

This means that if the spec ever changes (it doesn't happen today, but might in the future?), we will create a new template. Is that intended?

Contributor Author

I copied this from AWSMachineTemplate in the same file; I assumed this is how it was intended to work in HyperShift.

Member

Any change to the nodePool platform specifics is propagated and results in a new template being created, so that rolling upgrades can be supported. This is by design at the moment. Template cleanup needs to be implemented in a platform-agnostic way, separately from this PR: #760 (comment)
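The behaviour discussed in this thread can be sketched as follows. This is only an illustration under stated assumptions: templateSpec, hashSpec and templateName are hypothetical names, and the FNV-based hashing is an assumed implementation that may differ from the hashStruct helper used in the PR.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// templateSpec is a hypothetical, simplified stand-in for a machine
// template spec; the real spec has different fields.
type templateSpec struct {
	InstanceType string
	Image        string
}

// hashSpec is an assumed implementation of a spec hash: it hashes the
// printed representation of the struct with 32-bit FNV-1a.
func hashSpec(spec templateSpec) string {
	h := fnv.New32a()
	// Write on an FNV hash never returns an error.
	h.Write([]byte(fmt.Sprintf("%v", spec)))
	return fmt.Sprintf("%08x", h.Sum32())
}

// templateName derives the template name from the node pool name and
// the spec hash, so any spec change produces a different name.
func templateName(nodePool string, spec templateSpec) string {
	return fmt.Sprintf("%s-%s", nodePool, hashSpec(spec))
}

func main() {
	before := templateSpec{InstanceType: "m5.large", Image: "image-a"}
	after := templateSpec{InstanceType: "m5.large", Image: "image-b"}

	// Same node pool, different spec: two distinct template names.
	fmt.Println(templateName("workers", before))
	fmt.Println(templateName("workers", after))
}
```

Because the name changes with the spec, the existing template object stays untouched for machines that still reference it, and a new one is created for the rollout.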

span.AddEvent("reconciled awsmachinetemplate", trace.WithAttributes(attribute.String("name", machineTemplate.GetName())))
case hyperv1.AgentPlatform:
machineTemplate = AgentMachineTemplate(nodePool, controlPlaneNamespace)
if err := r.Get(ctx, client.ObjectKeyFromObject(machineTemplate), machineTemplate); err != nil {
Contributor

Why not use CreateOrUpdate? The current name-changing approach means that we orphan old templates on config change

Contributor Author

Again, I copied this from AWS; see reconcileAWSMachineTemplate() in the same file. It does a get followed by a create.

Member

enxebre commented Dec 9, 2021

Templates can't simply be changed in place, as that would break rolling upgrades. This needs to be solved in a provider-agnostic way, and it is tracked here: https://github.com/openshift/hypershift/blob/main/hypershift-operator/controllers/nodepool/nodepool_controller.go#L606

Let's tackle this separately from this PR.
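As a rough illustration of the get-then-create pattern debated in this thread, the sketch below uses an in-memory map in place of the API server. Everything in it (store, ensureTemplate, errNotFound, the template names) is hypothetical and is not the controller's actual code; it only shows why existing templates are never mutated and why old ones linger until a cleanup step exists.

```go
package main

import (
	"errors"
	"fmt"
)

// errNotFound is a hypothetical stand-in for a "not found" API error.
var errNotFound = errors.New("not found")

// store is a hypothetical in-memory stand-in for the API server,
// mapping template name to template spec.
type store map[string]string

func (s store) get(name string) (string, error) {
	spec, ok := s[name]
	if !ok {
		return "", errNotFound
	}
	return spec, nil
}

// ensureTemplate creates the template only when it does not exist yet.
// It never updates an existing template. It reports whether a create happened.
func ensureTemplate(s store, name, spec string) (bool, error) {
	_, err := s.get(name)
	if err == nil {
		return false, nil // already present: leave it untouched
	}
	if !errors.Is(err, errNotFound) {
		return false, err
	}
	s[name] = spec
	return true, nil
}

func main() {
	s := store{}
	steps := []struct{ name, spec string }{
		{"workers-aaaa", "spec-v1"},
		{"workers-aaaa", "spec-v1"}, // reconcile again: no-op
		{"workers-bbbb", "spec-v2"}, // config change: new name, new object
	}
	for _, st := range steps {
		created, err := ensureTemplate(s, st.name, st.spec)
		if err != nil {
			fmt.Println("error:", err)
			return
		}
		fmt.Println(st.name, "created:", created)
	}
	// Both templates now exist; the old one stays until something cleans it up.
	fmt.Println("templates stored:", len(s))
}
```

An update-in-place approach would avoid the leftover object, but, as noted above, it would change a template that existing machines may still reference during a rolling upgrade.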

Member

enxebre commented Dec 9, 2021

This will need to be consolidated with some of the logic that's in reconcileAWSMachineTemplate.

openshift-ci bot added the needs-rebase label Dec 10, 2021
openshift-ci bot removed the needs-rebase label Dec 13, 2021
nirarg mentioned this pull request Dec 13, 2021
@avishayt
Contributor Author

/hold all comments addressed, testing to validate

openshift-ci bot added the do-not-merge/hold label Dec 13, 2021
@enxebre
Member

enxebre commented Dec 13, 2021

github.com/openshift/hypershift/hypershift-operator/controllers/nodepool
hypershift-operator/controllers/nodepool/nodepool_controller.go:520:21: undefined: AgentMachineTemplate
make: *** [hypershift-operator] Error 2

Also, please squash/rephrase the "Fix review comments" commit.

avishayt force-pushed the main branch 2 times, most recently from 6a13985 to fcc4b2f December 14, 2021 09:38
go.mod Outdated
github.com/clarketm/json v1.14.1
github.com/coreos/ignition/v2 v2.10.1
github.com/docker/distribution v2.7.1+incompatible
github.com/eranco74/cluster-api-provider-agent/api v0.0.0-20211202085429-b633556695e5
Member

Contributor

Updated

@enxebre
Member

enxebre commented Dec 14, 2021

LGTM other than #760 (comment).
This is a good first step; let's update the repo ref and ship it. We can then do iterative follow-ups and make sure there are docs and an e2e test before we claim this has any level of support.

@eranco74
Contributor

/lgtm

openshift-ci bot added the lgtm label Dec 14, 2021
@enxebre
Member

enxebre commented Dec 14, 2021

As mentioned above, this is just an enabling development step towards eventually supporting an agent-based workflow, which will require docs and e2e testing.
This sets the basis for the agent-based workflow and enables us to keep figuring out and refining the right boundaries for multi-platform extensibility, along with #779.

/approve

@avishayt
Contributor Author

/cancel hold

@avishayt
Contributor Author

/hold cancel

openshift-ci bot removed the do-not-merge/hold label Dec 14, 2021
@openshift-ci
Contributor

openshift-ci bot commented Dec 14, 2021

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: avishayt, enxebre

openshift-ci bot added the approved label Dec 14, 2021
@openshift-bot

/retest-required

Please review the full test history for this PR and help us cut down flakes.

6 similar comments

@openshift-ci
Contributor

openshift-ci bot commented Dec 14, 2021

@avishayt: all tests passed!

openshift-merge-robot merged commit b558edc into openshift:main Dec 14, 2021