Conversation

@deads2k
Contributor

@deads2k deads2k commented May 1, 2020

/hold

This is only a thought experiment. I started down the path of seeing if I could quickly build this, but the smart way to build it requires a few foundational refactors, plus fake client libraries and fake indexer techniques that haven't been proven.

Let's talk about

  1. whether we're interested enough to invest at least a week before seeing fruit
  2. whether we agree that the approach results in a supportable static pod
  3. whether we think that the general idea of executing loops in order (with a cycle or two) is maintainable in the long run.

@smarterclayton @mfojtik @sttts @soltysh @hexfusion

@openshift-ci-robot openshift-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label May 1, 2020
@openshift-ci-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: deads2k

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot openshift-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label May 1, 2020
Comment on lines +56 to +82
We can create a new kind of render command which takes existing inputs *and* config.openshift.io resources.
Similar to how we built the original disaster recovery for certificates, we can factor the command to run the various
control loops "in order".
We can initialize our control loops using fake clients and wire listeners to synthetically update indexers backing fake
listers.
This is like we do for unit tests, only wired into the update reactors for the client.
If we separate the reactive bits of the control loops, the informer watch triggers adn the like, from the data input bits
(I think this is possible), we can have very high fidelity.
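
As a rough sketch of that fake-client wiring (illustrative only; the namespace, object names, and structure here are assumptions, not the operator's actual code):

```go
package main

import (
	"context"
	"fmt"

	corev1 "k8s.io/api/core/v1"
	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/apimachinery/pkg/labels"
	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/kubernetes/fake"
	corelisters "k8s.io/client-go/listers/core/v1"
	k8stesting "k8s.io/client-go/testing"
	"k8s.io/client-go/tools/cache"
)

func main() {
	client := fake.NewSimpleClientset()

	// The indexer that backs the fake lister, keyed the same way an informer
	// cache keys objects.
	indexer := cache.NewIndexer(cache.MetaNamespaceKeyFunc, cache.Indexers{})
	secretLister := corelisters.NewSecretLister(indexer)

	// Mirror every write that goes through the fake client into the indexer,
	// so the lister observes writes synchronously instead of via a watch.
	mirror := func(action k8stesting.Action) (bool, runtime.Object, error) {
		// CreateAction and UpdateAction both expose GetObject().
		obj := action.(k8stesting.CreateAction).GetObject()
		if err := indexer.Add(obj); err != nil {
			return true, nil, err
		}
		return false, nil, nil // fall through to the fake client's tracker
	}
	client.PrependReactor("create", "secrets", mirror)
	client.PrependReactor("update", "secrets", mirror)

	_, _ = client.CoreV1().Secrets("openshift-kube-apiserver").Create(
		context.TODO(),
		&corev1.Secret{ObjectMeta: metav1.ObjectMeta{
			Name: "serving-cert", Namespace: "openshift-kube-apiserver",
		}},
		metav1.CreateOptions{},
	)

	secrets, _ := secretLister.Secrets("openshift-kube-apiserver").List(labels.Everything())
	fmt.Println(len(secrets)) // 1 - the lister sees the write immediately
}
```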
In the kube-apiserver, the ordering would like this for instance:
1. cert-rotation - we need to create certs
2. encryption - this would need a special mode to say: just encrypt it right away
3. bound tokens - this creates some secrets for us
4. static-resources - this creates targets, SAs, and stuff
5. config observation - we need to set the operator observed config to be able to generate the final config.
6. target config - writes the kube-apiserver configmap
7. resource sync - copies bits from A to B
8. loop through config observation, target config, resource sync one more time (yeah, cycles)
9. revision controller
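
A hedged sketch of what driving the loops in that order could look like (none of these names exist in the operator; the reactive informer triggers are assumed to have been separated out as described above):

```go
package render // hypothetical package

import (
	"context"
	"fmt"
)

// syncFunc is one control loop's sync method with its informer/watch trigger
// stripped off, so the render command can drive it directly.
type syncFunc struct {
	name string
	run  func(ctx context.Context) error
}

// runInOrder drives each loop exactly once, in the order given.
func runInOrder(ctx context.Context, steps []syncFunc) error {
	for _, s := range steps {
		if err := s.run(ctx); err != nil {
			return fmt.Errorf("step %q failed: %w", s.name, err)
		}
	}
	return nil
}

// Usage, mirroring the ordering above (all step values illustrative):
//
//	steps := []syncFunc{certRotation, encryption, boundTokens, staticResources,
//		configObservation, targetConfig, resourceSync}
//	steps = append(steps, steps[4], steps[5], steps[6]) // one more cycle
//	steps = append(steps, revisionController)
//	err := runInOrder(ctx, steps)
```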

Now we do a couple neat things:
1. Export all content from the fake clients to produce resource manifests that will be created bootkube style against
the kube-apiserver.
Someone will have grown a dependency, and we know for sure that the next operator will require input from the previous one.
2. Wire up the fake clients to our installer command.
In theory, this command will create an exact copy of the "normal" kube-apiserver static pod that we create.

This leaves us supporting only one shape of static pod, which makes support of these static pods much easier.
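
For the export step, a minimal sketch of dumping the fake client's accumulated state as manifests (Secrets only; real code would iterate every resource type the loops write):

```go
package main

import (
	"context"
	"fmt"
	"os"

	metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
	"k8s.io/client-go/kubernetes/fake"
	"sigs.k8s.io/yaml"
)

// exportSecretManifests writes every Secret accumulated in the fake client to
// dir as a YAML manifest, bootkube style.
func exportSecretManifests(ctx context.Context, client *fake.Clientset, dir string) error {
	secrets, err := client.CoreV1().Secrets(metav1.NamespaceAll).List(ctx, metav1.ListOptions{})
	if err != nil {
		return err
	}
	for i := range secrets.Items {
		s := &secrets.Items[i]
		s.APIVersion, s.Kind = "v1", "Secret" // fake objects carry no TypeMeta
		data, err := yaml.Marshal(s)
		if err != nil {
			return err
		}
		path := fmt.Sprintf("%s/secret-%s-%s.yaml", dir, s.Namespace, s.Name)
		if err := os.WriteFile(path, data, 0o600); err != nil {
			return err
		}
	}
	return nil
}
```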
Contributor

Is this in theory supportable, so that during a 4.y release cycle we could define static versions that people can templatize (minimally) with things like on-disk certs? I.e. would this "shape" be roughly supportable, with limited flexibility to change, without having to change the existing operator dramatically?

Contributor Author

I think our first attempt would be to get people to run this installer with an input that looks exactly like what they would use in a "real" cluster. So we would accept a manifest containing their serving cert and a manifest containing their apiserver.config.openshift.io that says how to use it.

This would allow us...

  1. to have one set of code managing the user input
  2. to have a single external interface to the world instead of promising the shape of a static pod
  3. to let the cluster-admin test/confirm their changes in a real cluster and take those settings as input for producing a single-node cluster.

If we start trying to allow injection of on-disk certs, our on-disk static pods become an API we need to support.
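
A sketch of what seeding the fake clients from those user-supplied manifests could look like (hypothetical; decoding config.openshift.io types would additionally require registering the OpenShift API scheme, which client-go's default scheme does not include):

```go
package main

import (
	"os"

	"k8s.io/apimachinery/pkg/runtime"
	"k8s.io/client-go/kubernetes/fake"
	"k8s.io/client-go/kubernetes/scheme"
)

// seedFromInputs decodes the user's input manifests and pre-populates a fake
// clientset with them, so every control loop sees the same input it would see
// in a real cluster.
func seedFromInputs(paths []string) (*fake.Clientset, error) {
	var objs []runtime.Object
	decode := scheme.Codecs.UniversalDeserializer().Decode
	for _, p := range paths {
		data, err := os.ReadFile(p)
		if err != nil {
			return nil, err
		}
		obj, _, err := decode(data, nil, nil)
		if err != nil {
			return nil, err
		}
		objs = append(objs, obj)
	}
	return fake.NewSimpleClientset(objs...), nil
}
```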


## Open Questions [optional]

1. Do we even have a use-case for spending the time building this thought experiment?
Member

@mrguitar @imcleod do you have any?


The ILT is working on collecting use cases for single node clusters. Progress has been slow on my side due to other commitments, but we'll have this put together soon. In the absence of this, people keep wanting to take a single physical server and create a 3-node cluster with three VMs on the single node. The overhead of that is insane IMO. Regardless, single node clusters are on the roadmap, so please spend the cycles on this.

@ashcrow
Member

ashcrow commented May 1, 2020

/cc @arithx @LorbusChris

We can initialize our control loops using fake clients and wire listeners to synthetically update indexers backing fake
listers.
This is like we do for unit tests, only wired into the update reactors for the client.
If we separate the reactive bits of the control loops, the informer watch triggers adn the like, from the data input bits

adn the like -> and the like

This is like we do for unit tests, only wired into the update reactors for the client.
If we separate the reactive bits of the control loops, the informer watch triggers adn the like, from the data input bits
(I think this is possible), we can have very high fidelity.
In the kube-apiserver, the ordering would like this for instance:

would like this -> would be like this

A while back, Seth Jennings had a cool idea for trying to create a single node cluster using ignition.
The cluster would be non-configurable after "creation", non-upgradable, non-HA.
The cluster would only have etcd, kube-apiserver, kube-controller-manager, kube-scheduler.
This is a description of how we could generate supportable static pods.

What would be the outcome? I'm not really understanding 'static pod' and 'supportability' in this context. The summary and motivation lack a justification and a why...


## Motivation

Documenting a thought experiment about single node clusters.
Contributor

Single node Kubernetes clusters. Not OpenShift clusters.


Right, I'd be curious to know why we care about having a k8s cluster that is somewhat similar to an OpenShift cluster, but only somewhat. Is the goal to provide a k8s cluster whose control plane is managed similarly to the OpenShift control plane, or is there something else that I'm missing here?

Member

Will there be a way to start core operators of OpenShift?

@smarterclayton
Contributor

I will be referencing this from a broader discussion document that covers an approach to minimal edge clusters using basic control plane, no-operator deployments.

Contributor

@tnozicka tnozicka left a comment

@deads2k What is the benefit of creating a single node cluster that isn't OpenShift, doesn't have operators and isn't configurable? To me this feels like a different product but maybe I just missed the use case.

I like the idea of bootstrapping via static pods and ignition, but ideally to make a real cluster. That would cost more time, though.


### Restrictions
Some things become impractical once we cannot reconfigure the kube-apiserver; they include...
1. short lifespan of kcm and ksch client certificates - we can no longer rotate these
Contributor

Some would be rotated after expiry, as the recovery cert rotation is embedded.

@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 23, 2020
@romfreiman

/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Dec 24, 2020
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci-robot openshift-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 24, 2021
@romfreiman

/remove-lifecycle stale

@openshift-ci-robot openshift-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Mar 25, 2021
@openshift-bot

Issues go stale after 90d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle stale

@openshift-ci openshift-ci bot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jun 23, 2021
@openshift-bot

Stale issues rot after 30d of inactivity.

Mark the issue as fresh by commenting /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.
Exclude this issue from closing by commenting /lifecycle frozen.

If this issue is safe to close now please do so with /close.

/lifecycle rotten
/remove-lifecycle stale

@openshift-ci openshift-ci bot added lifecycle/rotten Denotes an issue or PR that has aged beyond stale and will be auto-closed. and removed lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. labels Jul 23, 2021
@openshift-bot

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

@openshift-ci openshift-ci bot closed this Aug 22, 2021
@openshift-ci
Contributor

openshift-ci bot commented Aug 22, 2021

@openshift-bot: Closed this PR.


In response to this:

Rotten issues close after 30d of inactivity.

Reopen the issue by commenting /reopen.
Mark the issue as fresh by commenting /remove-lifecycle rotten.
Exclude this issue from closing again by commenting /lifecycle frozen.

/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.
