
Conversation

@oglok
Contributor

@oglok oglok commented Dec 13, 2021

This commit only describes the addition of new compute nodes to an existing
MicroShift cluster. Highly available control plane will be described in later
PRs.

Signed-off-by: Ricardo Noriega [email protected]

This enhancement proposal addresses part of the #460 epic.

@openshift-ci
Contributor

openshift-ci bot commented Dec 13, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from oglok after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cgwalters
Member

There's a lot going on here. Something I am struggling to wrap my head around is that to me, a key point of "OpenShift 4" is that the cluster manages the OS. That isn't handled by MicroShift today (right?).

Now one thing I was asked to comment on here is the relationship to OCP node join. What I would say is basically all of that logic lives in https://github.com/openshift/cluster-machine-approver

OK I just did openshift/cluster-machine-approver#150 - I hope that's helpful.

* The `bootstrap-kubeconfig` asset must be placed on the new nodes to allow them to join the MicroShift cluster (see the sketch after this list).
* New nodes will only be given the role `node`, using the existing `--roles` flag. (Currently, MicroShift supports only one control plane entity and multiple nodes.)
* MicroShift will handle certificate rotation on the new nodes for security reasons.
* In summary, the Kubelet on a new node that tries to join a MicroShift cluster will use the bootstrap kubeconfig file to get limited access to the Kube API server. The Kubelet will then create a CSR (Certificate Signing Request). MicroShift's controller manager is configured to approve this CSR automatically, after which the Kubelet retrieves the signed certificate and writes out a new set of assets (certs, kubeconfig).
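For concreteness, a minimal sketch of such a bootstrap kubeconfig, following the standard kubelet TLS-bootstrap layout, is shown below. The server address, CA path, and token value are illustrative placeholders, not values defined by this proposal:

```yaml
# Hypothetical bootstrap-kubeconfig placed on a joining MicroShift node.
# All paths, addresses, and the token value are illustrative placeholders.
apiVersion: v1
kind: Config
clusters:
- name: microshift
  cluster:
    certificate-authority: /etc/microshift/ca.crt  # assumed CA bundle location
    server: https://192.168.1.10:6443              # example control-plane address
contexts:
- name: bootstrap
  context:
    cluster: microshift
    user: kubelet-bootstrap
current-context: bootstrap
users:
- name: kubelet-bootstrap
  user:
    token: <bootstrap-token>  # short-lived shared secret; must be distributed securely
```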
Contributor

What makes my CSR eligible for automatic approval?

Contributor

If your node has the token and pushes the CSR, it will get automatically approved (this is my understanding, at least).
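For context on the exchange above: in the upstream Kubernetes TLS bootstrap flow, which this proposal appears to follow, the bootstrap token authenticates the kubelet as a member of the `system:bootstrappers` group, and the controller manager's built-in approver auto-approves CSRs for the `kubernetes.io/kube-apiserver-client-kubelet` signer when the requesting user holds the `nodeclient` permission. A CSR produced by such a kubelet looks roughly like this (values are placeholders):

```yaml
# Shape of a kubelet bootstrap CSR that qualifies for automatic approval.
# The base64 request payload and node name are placeholders.
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  generateName: csr-
spec:
  signerName: kubernetes.io/kube-apiserver-client-kubelet
  request: <base64 PKCS#10 request for CN=system:node:<node-name>, O=system:nodes>
  usages:
  - digital signature
  - client auth
```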

@mangelajo
Contributor

@cgwalters @stlaz thanks for looking at this; I'm having a look at the cluster-machine-approver.

@mangelajo
Contributor

And yes, @cgwalters, MicroShift doesn't manage the OS today; I'm not sure whether we will need to look at that at some point.

@mangelajo
Contributor

@fzdarsky

@mangelajo
Contributor

Hmm, @cgwalters, it looks like we were proposing the same mechanism used in OpenShift, but OpenShift has the cluster-machine-approver, which makes additional checks based on the OpenShift Machine API (which we wouldn't have in MicroShift) and other Node details.

It probably makes sense to start with the simpler kube-controller-manager approval. In the future it could make sense to extend the CSR created by the kubelet with TPM details (for example), and to have a specific MicroShift approver that can also check the CSR against a CA or, again, via the TPM hardware module on the masters.
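If the simpler kube-controller-manager route is taken, the conventional way to enable its built-in approval, per the upstream TLS bootstrapping documentation, is the pair of bindings below. The binding names are arbitrary; only the ClusterRoles are standard upstream objects:

```yaml
# Conventional RBAC enabling kube-controller-manager's built-in CSR approval.
# Initial kubelet client certificates for token-authenticated bootstrappers:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-bootstrap-csrs  # arbitrary name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:bootstrappers
---
# Certificate renewals (rotation) requested by already-joined nodes:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-csr-renewals  # arbitrary name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
```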

@cgwalters
Member

Assuming the bootstrap token is transferred and maintained securely, I don't think any additional checks against a CA add much value. TPMs, on the other hand, can be quite powerful, but they require investment and aren't available everywhere.

One problem in OCP today is that we transfer that token via served Ignition - xref openshift/machine-config-operator#736.
(Long story short, I think we want to move all secrets into the bootstrap Ignition, which is part of the cloud metadata.)

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale label May 18, 2022
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/rotten and needs-rebase labels and removed the lifecycle/stale label Jun 17, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jun 17, 2022

@oglok: PR needs rebase.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dhellmann
Contributor

I think we should close this, since we don't intend to support multi-node.

@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@dhellmann
Contributor

/kind design
/close

We've decided not to support multi-node.

@dhellmann dhellmann closed this Aug 21, 2022
@openshift-ci openshift-ci bot added the kind/design label Aug 21, 2022
@giannisalinetti

giannisalinetti commented Sep 2, 2022

Hi team, with full respect, I think this decision is far from strategic.
As a Red Hat solution architect, I work with customers who are comparing MicroShift with other existing, similar solutions. Some of them, such as k3s, offer the option of setting up a multi-node cluster.

Offering the option of setting up a minimal multi-node cluster (for example, adding a second HA master or multiple workers) could be a true game changer, even for edge scenarios where application high availability cannot be managed using existing tools and application HA features to support failover.

For this reason, I suggest reconsidering this RFE.

@derekwaynecarr
Member

@giannisalinetti can you provide more detail on which scenarios cannot be managed using existing tools and application HA features to support failover? If you want two control hosts to provide HA, I believe you will have less overall reliability than with two single-node solutions. If you scale out to additional workers/control hosts, that also demands extra workflow during upgrade scenarios and significantly increases resource consumption, to the point where it begins to look like traditional standalone OpenShift. At the moment, we are focusing on ensuring we can meet a minimal resource budget for single-node scenarios.

Any additional detail you can provide on where MicroShift would be a fit but standalone OpenShift would not is always appreciated. If it's entirely about resource consumption, keep in mind that as you grow clusters, consumption increases at all levels (control hosts and networking/SDN).

@giannisalinetti

giannisalinetti commented Sep 6, 2022

@derekwaynecarr I do not have a specific scenario to share right now, but I can assure you that at least one strategic customer of mine (I can't write the name for privacy reasons, but they are one of the main customers in Italy, and they are looking forward to the productized version) is comparing MicroShift's evolution with other similar products and has already asked me why this feature cannot be included when competitor alternatives like k3s offer it.

@dhellmann
Contributor

@giannisalinetti I would be interested in the customer's feedback to @derekwaynecarr's answer. Maybe we can do that privately (email, chat, etc.) so we can have all of the details.

@giannisalinetti

@dhellmann @derekwaynecarr sure!
We can discuss it by email if you agree: [email protected]

