
Conversation

@oglok
Contributor

@oglok oglok commented Dec 13, 2021

This commit only describes the addition of new compute nodes to an existing
MicroShift cluster. Highly available control plane will be described in later
PRs.

Signed-off-by: Ricardo Noriega [email protected]

This enhancement proposal addresses part of the #460 epic.

@openshift-ci
Contributor

openshift-ci bot commented Dec 13, 2021

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please ask for approval from oglok after the PR has been reviewed.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@cgwalters
Member

There's a lot going on here. Something I am struggling to wrap my head around is that to me, a key point of "OpenShift 4" is that the cluster manages the OS. That isn't handled by MicroShift today (right?).

Now one thing I was asked to comment on here is the relationship to OCP node join. What I would say is basically all of that logic lives in https://github.com/openshift/cluster-machine-approver

OK I just did openshift/cluster-machine-approver#150 - I hope that's helpful.

* The `bootstrap-kubeconfig` asset must be placed on the new nodes to allow them to join the MicroShift cluster (see the sketch after this list).
* New nodes will only be given the role `node`, using the existing `--roles` flag. (Currently, MicroShift supports only one control plane entity and multiple nodes.)
* MicroShift will handle certificate rotation on the new nodes for security reasons.
* In summary, the Kubelet on a new node that tries to join a MicroShift cluster will use the bootstrap kubeconfig file to get limited access to the Kube API server. The Kubelet will then create a CSR (Certificate Signing Request). MicroShift's controller manager is configured to approve this CSR automatically, after which the Kubelet retrieves the signed certificate and writes out a new set of assets (certs, kubeconfig).
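For concreteness, a minimal sketch of such a bootstrap kubeconfig, following the standard kubelet TLS-bootstrap layout, is shown below. The server address, CA path, and token value are illustrative placeholders, not values defined by this proposal:

```yaml
# Hypothetical bootstrap-kubeconfig placed on a joining MicroShift node.
# All paths, addresses, and the token value are illustrative placeholders.
apiVersion: v1
kind: Config
clusters:
- name: microshift
  cluster:
    certificate-authority: /etc/microshift/ca.crt  # assumed CA bundle location
    server: https://192.168.1.10:6443              # example control-plane address
contexts:
- name: bootstrap
  context:
    cluster: microshift
    user: kubelet-bootstrap
current-context: bootstrap
users:
- name: kubelet-bootstrap
  user:
    token: <bootstrap-token>  # short-lived shared secret; must be distributed securely
```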
Contributor

What makes my CSR eligible for automatic approval?

Contributor

If your node has the token and pushes the CSR, it will get automatically approved (this is my understanding, at least).
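For context on the exchange above: in the upstream Kubernetes TLS bootstrap flow, which this proposal appears to follow, the bootstrap token authenticates the kubelet as a member of the `system:bootstrappers` group, and the controller manager's built-in approver auto-approves CSRs for the `kubernetes.io/kube-apiserver-client-kubelet` signer when the requesting user holds the `nodeclient` permission. A CSR produced by such a kubelet looks roughly like this (values are placeholders):

```yaml
# Shape of a kubelet bootstrap CSR that qualifies for automatic approval.
# The base64 request payload and node name are placeholders.
apiVersion: certificates.k8s.io/v1
kind: CertificateSigningRequest
metadata:
  generateName: csr-
spec:
  signerName: kubernetes.io/kube-apiserver-client-kubelet
  request: <base64 PKCS#10 request for CN=system:node:<node-name>, O=system:nodes>
  usages:
  - digital signature
  - client auth
```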

@mangelajo
Contributor

@cgwalters @stlaz thanks for looking at this; I'm having a look at the cluster-machine-approver.

@mangelajo
Contributor

And yes, @cgwalters, MicroShift doesn't manage the OS today; I'm not sure whether we will need to look at that at some point.

@mangelajo
Contributor

@fzdarsky

@mangelajo
Contributor

Hmm, @cgwalters, it looks like we were proposing the same mechanism used in OpenShift, but OpenShift has the cluster-machine-approver, which makes additional checks based on the OpenShift Machine API (which we wouldn't have in MicroShift) and other Node details.

It probably makes sense to start with the simpler kube-controller-manager approval. In the future it could make sense to extend the CSR created by the kubelet with TPM details (for example), and to have a specific MicroShift approver that can also check the CSR against a CA or, again, via the TPM hardware module on the masters.
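If the simpler kube-controller-manager route is taken, the conventional way to enable its built-in approval, per the upstream TLS bootstrapping documentation, is the pair of bindings below. The binding names are arbitrary; only the ClusterRoles are standard upstream objects:

```yaml
# Conventional RBAC enabling kube-controller-manager's built-in CSR approval.
# Initial kubelet client certificates for token-authenticated bootstrappers:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-bootstrap-csrs  # arbitrary name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:nodeclient
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:bootstrappers
---
# Certificate renewals (rotation) requested by already-joined nodes:
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: auto-approve-csr-renewals  # arbitrary name
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:certificates.k8s.io:certificatesigningrequests:selfnodeclient
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:nodes
```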

@cgwalters
Member

Assuming the bootstrap token is transferred and maintained securely, I don't think any additional checks against a CA add much value. TPMs, on the other hand, can be quite powerful, but they require investment and aren't available everywhere.

One problem in OCP today is that we transfer that token via served Ignition - xref openshift/machine-config-operator#736.
(Long story short, I think we want to move all secrets into the bootstrap Ignition, which is part of the cloud metadata.)

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale label May 18, 2022
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/rotten and needs-rebase labels and removed the lifecycle/stale label Jun 17, 2022
@openshift-ci
Contributor

openshift-ci bot commented Jun 17, 2022

@oglok: PR needs rebase.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@dhellmann
Contributor

I think we should close this, since we don't intend to support multi-node.

@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@dhellmann
Contributor

/kind design
/close

We've decided not to support multi-node.

@dhellmann dhellmann closed this Aug 21, 2022
@openshift-ci openshift-ci bot added the kind/design label Aug 21, 2022
@giannisalinetti

giannisalinetti commented Sep 2, 2022

Hi team, with full respect, I think this decision is far from strategic.
As a Red Hat solution architect, I work with customers who are comparing MicroShift with other existing, similar solutions. Some of them, such as k3s, offer the option of setting up a multi-node cluster.

Offering the option of setting up a minimal multi-node cluster (for example, adding a second HA master or multiple workers) could be a true game changer, even for edge scenarios where application high availability cannot be managed using existing tools and application HA features to support failover.

For this reason, I suggest reconsidering this RFE.

@derekwaynecarr
Member

@giannisalinetti can you provide more detail on which scenarios cannot be managed using existing tools and application HA features to support failover? If you want two control hosts to provide HA, I believe you will have less overall reliability than with two single-node solutions. If you scale out to additional workers/control hosts, that also demands extra workflow during upgrade scenarios and significantly increases resource consumption, to the point where it begins to look like traditional standalone OpenShift. At the moment, we are focusing on ensuring we can meet a minimal resource budget for single-node scenarios.

Any additional detail you can provide on where MicroShift would be a fit but standalone OpenShift would not is always appreciated. If it's entirely about resource consumption, keep in mind that as you grow clusters, consumption increases at all levels (control hosts and networking/SDN).

@giannisalinetti

giannisalinetti commented Sep 6, 2022

@derekwaynecarr I do not have a specific scenario to share right now, but I can assure you that at least one strategic customer of mine (I can't write the name for privacy reasons, but they are one of the main customers in Italy, and they are looking forward to the productized version) is comparing MicroShift's evolution with other similar products and has already asked me why this feature cannot be included when competitor alternatives like k3s offer it.

@dhellmann
Contributor

@giannisalinetti I would be interested in the customer's feedback to @derekwaynecarr's answer. Maybe we can do that privately (email, chat, etc.) so we can have all of the details.

@giannisalinetti

@dhellmann @derekwaynecarr sure!
We can discuss it by email if you agree: [email protected]

