
Conversation

@mdbooth
Contributor

@mdbooth mdbooth commented Feb 24, 2022

What this PR does / why we need it:

I have moved the following to #1191

The primary purpose of this PR is to clean up the interface of compute.CreateInstance and to better define the separation of concerns between the machine controller (for Machines), the cluster controller (for the Bastion), and compute (for actual server creation).

The Bastion host is represented in the cluster spec as an Instance. An OpenStackMachine is defined by context from both the OpenStackMachine and the OpenStackCluster. So while both create a server in the same way, they source the server's parameters in slightly different ways from different source objects.

At some point in history we also used Instance as the intermediate representation for an OpenStackMachine. That is, we combined parameters from an OpenStackMachine and an OpenStackCluster into an Instance, then passed that to CreateInstance. Instance is not ideal for this purpose as it is both Spec and Status. It contains fields which cannot be used as input parameters to CreateInstance. Therefore we refactored this into the internal-only InstanceSpec, which contains only spec fields.

This refactor takes this further. Firstly we ensure that the code previously contained directly in CreateBastion and CreateInstance is strictly data transformation and therefore very cheap. Anything expensive moves into the new ReconcileInstance. Secondly we move this code to the controller where it is relevant, and add new unit tests covering the transformation. This also reduces the scope of the former CreateInstance unit tests, making them simpler.
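For readers less familiar with the code, here is a rough, purely hypothetical sketch of the split being described. The type, field, and function names below are illustrative only and are not the actual CAPO code:

package compute

// InstanceSpec holds only the inputs needed to create a server. Building it
// from an OpenStackMachine (machine controller) or from the bastion definition
// (cluster controller) is pure data transformation and makes no API calls.
type InstanceSpec struct {
	Name           string
	Image          string
	Flavor         string
	SSHKeyName     string
	SecurityGroups []string
	Metadata       map[string]string
}

// InstanceStatus is what the reconciler reports back about an existing server.
type InstanceStatus struct {
	ID    string
	State string
}

// Service stands in for the compute service that owns the OpenStack clients.
type Service struct{}

// ReconcileInstance does the expensive, idempotent work against the OpenStack
// API. Returning (nil, nil) would signal "not yet complete": the controller
// should requeue and try again later.
func (s *Service) ReconcileInstance(spec *InstanceSpec) (*InstanceStatus, error) {
	// Look up the server by name and create it if it does not exist yet.
	// Omitted here; the real implementation calls the OpenStack APIs.
	return nil, nil
}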

The remainder of this PR, which will need to be reworked, is:

In refactoring bastion creation, we also take the opportunity to make all bastion errors non-fatal to the cluster. We also ensure we update the cluster status to Ready before creating the Bastion, which means that control plane creation can run in parallel with bastion creation.

Special notes for your reviewer:
I don't intend to squash these commits. They are intended to be independent logical steps. You may find it easier to review this PR by commit rather than as a single change.

/hold

@k8s-ci-robot k8s-ci-robot added do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Feb 24, 2022
@netlify

netlify bot commented Feb 24, 2022

Deploy Preview for kubernetes-sigs-cluster-api-openstack ready!

Latest commit: 34cbf11
Latest deploy log: https://app.netlify.com/sites/kubernetes-sigs-cluster-api-openstack/deploys/624450c36c71500009be6186
Deploy Preview: https://deploy-preview-1153--kubernetes-sigs-cluster-api-openstack.netlify.app

@k8s-ci-robot k8s-ci-robot added the size/XXL Denotes a PR that changes 1000+ lines, ignoring generated files. label Feb 24, 2022
@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: mdbooth

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Feb 24, 2022
Contributor

@seanschneeweiss seanschneeweiss left a comment


This looks very, very promising. Great work. It would probably be good for me to review this a second time, just to understand the changes even better ;)

@mdbooth
Contributor Author

mdbooth commented Mar 3, 2022

@seanschneeweiss Thanks for an excellent review! It's going to take me a while to go through it.

return reconcile.Result{}, err
}
if !deleted {
return reconcile.Result{RequeueAfter: 10 * time.Second}, nil
Member


Wouldn't it make sense here to use an exponential backoff (return reconcile.Result{}, nil) so as not to "DDoS" the API?

Contributor Author


I've updated everything to use the controller's rate limiter. I've also customised the rate limiter to have a higher base delay. Without writing my own exponential backoff rate limiter (which I wasn't keen on doing) we're stuck with an exponent of 2, so I've set the base delay at 2 seconds. We will back off at intervals of 2s, 4s, 8s... up to a max of 1000s.
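A minimal sketch of that kind of customisation, assuming it is wired into the controller builder via controller.Options{RateLimiter: ...}; this is illustrative only, not the exact code in the PR:

package controllers

import (
	"time"

	"golang.org/x/time/rate"
	"k8s.io/client-go/util/workqueue"
)

// customRateLimiter mirrors workqueue.DefaultControllerRateLimiter but with a
// higher per-item base delay: 2s, 4s, 8s, ... capped at 1000s, plus an overall
// bucket limit of 10 requeues per second.
func customRateLimiter() workqueue.RateLimiter {
	return workqueue.NewMaxOfRateLimiter(
		workqueue.NewItemExponentialFailureRateLimiter(2*time.Second, 1000*time.Second),
		&workqueue.BucketRateLimiter{Limiter: rate.NewLimiter(rate.Limit(10), 100)},
	)
}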

if !openStackCluster.Status.Ready {
openStackCluster.Status.Ready = true

// If we're setting Ready, return early to update status and
Member


That's a great move :-)

if instanceStatus == nil {
handleUpdateMachineError(logger, openStackMachine, errors.New("OpenStack instance cannot be found"))
return ctrl.Result{}, nil
return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
Member


Maybe, for exponential backoff:

Suggested change
return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
return ctrl.Result{}, nil

@chrischdi
Member

Great improvements :-)

@mdbooth
Contributor Author

mdbooth commented Mar 3, 2022

@chrischdi On exponential backoff, yes, absolutely! I'd added that as a placeholder for something cleverer and didn't change it. I must not commit this series without addressing that, so I'm adding a

/hold

In the meantime, if controller-runtime does exponential backoff it's probably good enough to just use that.

@mdbooth
Contributor Author

mdbooth commented Mar 3, 2022

The default backoff behaviour of controller-runtime is set here:

https://github.com/kubernetes-sigs/controller-runtime/blob/eb292e5d9bd6c59663fc6777ae2b99f901f386e4/pkg/controller/controller.go#L120-L122

and defined here:

https://github.com/kubernetes/client-go/blob/eb103e0abf6218751302bf984d01ac45aae2ac6b/util/workqueue/default_rate_limiters.go#L39-L45

To summarise, each individual object has a 5ms backoff growing with an exponent of 2 up to a maximum of 1000 seconds. There is also a global limit of 10 reconciles per second. So retries will happen with a backoff of:

0.005s
0.025s
0.125s
0.625s
3.125s
15.625s
78.125s
390.625s
1000s
...

I don't think this is great for us in practice, as most backoffs will be resource creation waits. We would likely almost always do the first 6 retries before success. We should probably customise it. E.g. a 10s backoff with an exponent of 1.5 up to 1 hour would look like:

10s
15s
22.5s
33.75s
50.625s
75.9375s
113.90625s
170.859375s
256.2890625s
384.4335938s
576.6503906s
864.9755859s
1297.463379s
1946.195068s
2919.292603s
3600s

I'd also be inclined to retain the global 10 reconciles/s limit.

However, for now I'm happy to use the default. We can revisit this.
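Since client-go's ItemExponentialFailureRateLimiter is fixed at an exponent of 2, the gentler 10s / factor-1.5 / 1-hour schedule listed above would need a small custom workqueue.RateLimiter. A hypothetical sketch, not part of this PR (the type name and defaults are illustrative):

package controllers

import (
	"math"
	"sync"
	"time"

	"k8s.io/client-go/util/workqueue"
)

// gentleBackoffLimiter sketches the "10s base, factor 1.5, capped at 1 hour"
// idea above. It is illustrative only; the PR itself keeps the defaults.
type gentleBackoffLimiter struct {
	mu       sync.Mutex
	failures map[interface{}]int

	baseDelay time.Duration
	factor    float64
	maxDelay  time.Duration
}

func newGentleBackoffLimiter() workqueue.RateLimiter {
	return &gentleBackoffLimiter{
		failures:  map[interface{}]int{},
		baseDelay: 10 * time.Second,
		factor:    1.5,
		maxDelay:  time.Hour,
	}
}

// When returns 10s, 15s, 22.5s, ... for successive failures of the same item.
func (l *gentleBackoffLimiter) When(item interface{}) time.Duration {
	l.mu.Lock()
	defer l.mu.Unlock()

	exp := l.failures[item]
	l.failures[item]++

	delay := time.Duration(float64(l.baseDelay) * math.Pow(l.factor, float64(exp)))
	if delay <= 0 || delay > l.maxDelay {
		return l.maxDelay
	}
	return delay
}

func (l *gentleBackoffLimiter) NumRequeues(item interface{}) int {
	l.mu.Lock()
	defer l.mu.Unlock()
	return l.failures[item]
}

// Forget resets the backoff for an item once it reconciles successfully.
func (l *gentleBackoffLimiter) Forget(item interface{}) {
	l.mu.Lock()
	defer l.mu.Unlock()
	delete(l.failures, item)
}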

@mdbooth mdbooth force-pushed the bastion branch 2 times, most recently from a5a8964 to ee6ed57 Compare March 11, 2022 10:37
Contributor

@seanschneeweiss seanschneeweiss left a comment


How would you feel about setting openStackMachine.Status.Ready = false after the following positions:


@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 12, 2022
main.go Outdated
}

func setupReconcilers(ctx context.Context, mgr ctrl.Manager) {
// Based on workqueue.DefaultControllerRateLimiter with a higher baseDelay
Member


Maybe to discuss this again here:

We don't only have failures due to OpenStack API calls, so I don't know if we really want to customize that.

When thinking about the "Happy Path" for CAPO I agree with you: it does not make sense.

But if we consider the existence of other controllers acting on the same resources as CAPO, the default is reasonable for doing retries to update the CR objects.

E.g. if the reconcile fails at the end (when CAPO wants to apply its patch) because someone updated the CR in the meantime and thereby increased the metadata.resourceVersion (maybe even only by adding a label), this would result in using the increased backoff instead of the default.

Also in the use-case of the PR / the discussion:

	if !deleted {
		return reconcile.Result{Requeue: true}, nil
	}

Would this even make use of the rate limiting, or would it result in an immediate retry (making the custom rate limiter obsolete for this use case), because it is not returning an error?

Maybe better suited to post that on the PR 😄

I did not find any other cloud provider using something other than the default for their controllers: https://cs.k8s.io/?q=RateLimiter&i=nope&files=&excludeFiles=vendor%2F.*&repos=kubernetes-sigs/cluster-api,kubernetes-sigs/cluster-api-operator,kubernetes-sigs/cluster-api-provider-aws,kubernetes-sigs/cluster-api-provider-azure,kubernetes-sigs/cluster-api-provider-digitalocean,kubernetes-sigs/cluster-api-provider-gcp,kubernetes-sigs/cluster-api-provider-ibmcloud,kubernetes-sigs/cluster-api-provider-kubemark,kubernetes-sigs/cluster-api-provider-kubevirt,kubernetes-sigs/cluster-api-provider-nested,kubernetes-sigs/cluster-api-provider-openstack,kubernetes-sigs/cluster-api-provider-packet,kubernetes-sigs/cluster-api-provider-vsphere

Note: I'm ok with the change (50:50), just wanted to raise some awareness about that :-)

Contributor Author


Couple of things.

Firstly I agree that we should consider this carefully. I need to rebase this PR anyway, and when I do I'm going to remove the rate limiting. I'll propose it again as a separate PR and we can examine it in more detail there. I think this will work acceptably well without modifying the rate limiter, although with a bit more load on the OpenStack API than is necessary.

Secondly, the controller-runtime documentation on this feature sucks, so I resorted to RTFS 🙄. AFAICT the rate limiter is used in exactly two cases:

  • Reconcile returns error
  • Reconcile returns Requeue: true

Specifically, AFAICT watches are not added to the queue with the rate limiting interface, which means they will be enqueued immediately. That is, if a controller touches a machine object, the machine controller will reconcile that machine object immediately without rate limiting. We could actually verify this manually by setting it to something ridiculously large and checking that Reconcile is called immediately, but let's do that on the new PR.

Also, if Reconcile returns RequeueAfter, that also doesn't use the rate limiter: it uses the value passed in.

So the rate limiter is only ever used when code in our controller returns one of the two values which use it.
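A hedged sketch of the three return shapes and how controller-runtime appears to treat them, per the reading above; the reconciler type and the helper predicates are hypothetical:

package controllers

import (
	"context"
	"time"

	ctrl "sigs.k8s.io/controller-runtime"
)

// ExampleReconciler is a stand-in used only to illustrate the return shapes;
// it is not part of this PR. Note that, per the reading above, watch events
// are enqueued without rate limiting at all.
type ExampleReconciler struct{}

func (r *ExampleReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
	err := doSomethingAgainstOpenStack()
	switch {
	case err != nil:
		// Returning an error re-adds the request through the rate limiter,
		// so retries back off and count towards the global limit.
		return ctrl.Result{}, err
	case stillWaitingForResource():
		// Requeue: true with a nil error also goes through the rate limiter.
		return ctrl.Result{Requeue: true}, nil
	case wantFixedDelay():
		// RequeueAfter bypasses the rate limiter: the request is requeued
		// after exactly this delay, with no backoff.
		return ctrl.Result{RequeueAfter: 10 * time.Second}, nil
	default:
		// Plain success: the item is "forgotten", which resets its backoff.
		return ctrl.Result{}, nil
	}
}

// Placeholders so the sketch compiles; they stand in for real checks.
func doSomethingAgainstOpenStack() error { return nil }
func stillWaitingForResource() bool      { return false }
func wantFixedDelay() bool               { return false }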

@MPV

MPV commented Mar 14, 2022

Will this change mean that a cluster that (for whatever reason) can't reconcile, can still reconcile its bastion?
Imagine in a case where you have a cluster created without a bastion, then the cluster fails, and you want CAPO to add a bastion so you can reach the cluster for troubleshooting. Curious about your thoughts on this.

@mdbooth
Contributor Author

mdbooth commented Mar 14, 2022

How would you feel about setting openStackMachine.Status.Ready = false after the following positions:

I thought Ready was a one-way gate, i.e. we're not allowed to unset it once set? If that's not correct, though, I'm very interested in that.

I actually have a draft doc which I need to clean up and push somewhere with my thoughts on the failed state. I do think we need a status marker for non-terminal failure. These would be examples of non-terminal failure. If we could use Ready that way it would be very interesting, but without knowing for sure I suspect it would break assumptions. I think this is a CAPI discussion.

@mdbooth
Contributor Author

mdbooth commented Mar 14, 2022

Will this change mean that a cluster that (for whatever reason) can't reconcile, can still reconcile its bastion? Imagine in a case where you have a cluster created without a bastion, then the cluster fails, and you want CAPO to add a bastion so you can reach the cluster for troubleshooting. Curious about your thoughts on this.

No. It won't start reconciling the bastion until the cluster is up. If the cluster never comes up it will never create the bastion.

There's no fundamental reason we couldn't do this, though, but it would take a fair amount of refactoring. This is essentially the same direction I'm trying to take the reconciliation of instances, i.e. independent resources can be reconciled concurrently.

@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 14, 2022
@MPV

MPV commented Mar 16, 2022

Will this change mean that a cluster that (for whatever reason) can't reconcile, can still reconcile its bastion? Imagine in a case where you have a cluster created without a bastion, then the cluster fails, and you want CAPO to add a bastion so you can reach the cluster for troubleshooting. Curious about your thoughts on this.

No. It won't start reconciling the bastion until the cluster is up. If the cluster never comes up it will never create the bastion.

I'm referring to another scenario though, where a cluster has already been created successfully (but configured without a bastion). Then one might decide to add a bastion (for easier troubleshooting). But if, at that point, for whatever reason, the cluster can't reconcile, would the bastion still reconcile successfully on its own (so it's added and we can troubleshoot the cluster)?

There's no fundamental reason we couldn't do this, though, but it would take a fair amount of refactoring. This is essentially the same direction I'm trying to take the reconciliation of instances, i.e. independent resources can be reconciled concurrently.

Sounds promising. 🙏

@mdbooth
Contributor Author

mdbooth commented Mar 16, 2022

Will this change mean that a cluster that (for whatever reason) can't reconcile, can still reconcile its bastion? Imagine in a case where you have a cluster created without a bastion, then the cluster fails, and you want CAPO to add a bastion so you can reach the cluster for troubleshooting. Curious about your thoughts on this.

No. It won't start reconciling the bastion until the cluster is up. If the cluster never comes up it will never create the bastion.

I'm referring to another scenario though, where a cluster has already been created successfully (but configured without a bastion). Then one might decide to add a bastion (for easier troubleshooting). But if, at that point, for whatever reason, the cluster can't reconcile, would the bastion still reconcile successfully on its own (so it's added and we can troubleshoot the cluster)?

No, this isn't going to help in that case. That said, the bastion requires most of the cluster infrastructure to be up already, so I'm not sure how early we could usefully move it in the cluster reconciliation process anyway.

What's the use case, btw? I'd have thought that the bastion was most useful for debugging machines rather than the cluster? If we're failing to create networks/routers/security groups/load balancers that's surely all going to be debugged via object status/events/OpenStack API. Does the bastion help with any of that?

@dulek
Contributor

dulek commented Mar 24, 2022

I tested this on my env and it seems to work just fine.

openStackCluster.Status.BastionSecurityGroup = nil

return nil
return true, nil
Contributor


Looks like deleted is as simple as if err == nil? I guess you just want to be explicit?

Contributor Author


Currently yes, but for the same reason as in reconcile instance I want to be able to return an incomplete result, i.e. there was no error, but the delete is not finished yet.
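A hedged sketch of the (deleted, err) contract being described here, with hypothetical helper and type names rather than the actual CAPO code:

package compute

// Illustrative only: three distinct outcomes rather than deleted simply
// mirroring err == nil.
func deleteBastionServer() (deleted bool, err error) {
	srv, err := getServerByName("bastion")
	if err != nil {
		return false, err // something went wrong; retry with backoff
	}
	if srv == nil {
		return true, nil // already gone: delete is complete
	}
	// Delete has been requested but the server still exists: no error, but
	// not finished yet, so the caller requeues and checks again later.
	return false, requestServerDelete(srv)
}

// Hypothetical helpers so the sketch is self-contained.
type server struct{ ID string }

func getServerByName(name string) (*server, error) { return nil, nil }
func requestServerDelete(s *server) error          { return nil }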

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 28, 2022
mdbooth added 5 commits March 30, 2022 13:44
Common network and security group handling between CreateBastion() and
CreateInstance().

A principal advantage of this refactor is that it makes the marshalling
of OpenStackMachineSpec and Instance respectively into an InstanceSpec a
cheap operation which makes no API calls.
Allow CreateBastion and CreateInstance to be called as reconcilers.
Specifically they become idempotent and can additionally return a 'not
yet complete' status by returning a nil InstanceStatus. This new status
is handled in the controller by rescheduling reconciliation.

To reflect this change we rename the methods to ReconcileBastion and
ReconcileInstance.

In making this change we also make some opportunistic changes to
reconciliation of the Bastion:

* Bastion reconciliation errors no longer put the cluster in a failed
  state
* We mark the cluster Ready before creating the Bastion
* We handle the case where the Bastion floating IP is already associated

Apart from the Bastion changes, this is almost entirely code motion as
can be seen in the unit tests. While we permit the Reconcile methods to
return an incomplete state, nothing yet returns it. The only change in
the unit tests is due to moving the GetInstanceStatusByName check which
is common to the bastion and machines into reconcileInstance.
Refactor instance creation in machine controller and cluster
controller (for the bastion) to call compute.ReconcileInstance() with an
InstanceSpec.
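A hedged sketch of how a controller might consume the new "not yet complete" result these commit messages describe; the names and the requeue mechanism are illustrative, not the actual CAPO code:

package controllers

import (
	ctrl "sigs.k8s.io/controller-runtime"
)

// instanceStatus and reconcileInstance are illustrative stand-ins: a nil
// status with a nil error models the new "not yet complete" result.
type instanceStatus struct{ ID string }

func reconcileInstance() (*instanceStatus, error) { return nil, nil }

func reconcileNormal() (ctrl.Result, error) {
	status, err := reconcileInstance()
	if err != nil {
		// Real errors go back through the controller's rate limiter.
		return ctrl.Result{}, err
	}
	if status == nil {
		// Not complete yet: reschedule reconciliation instead of treating
		// this as a failure.
		return ctrl.Result{Requeue: true}, nil
	}
	// The instance exists; carry on with the rest of the reconcile.
	_ = status.ID
	return ctrl.Result{}, nil
}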
@k8s-ci-robot k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Mar 30, 2022
@mdbooth mdbooth marked this pull request as draft April 1, 2022 05:35
@k8s-ci-robot k8s-ci-robot added the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Apr 1, 2022
@mdbooth mdbooth changed the title from "✨ Refactor CreateInstance and CreateBastion" to "✨ WIP: Refactor CreateInstance and CreateBastion" Apr 1, 2022
@k8s-ci-robot
Contributor

@mdbooth: PR needs rebase.


Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Apr 2, 2022
@seanschneeweiss
Contributor

I thought Ready was a one-way gate, i.e. we're not allowed to unset it once set? If that's not correct, though, I'm very interested in that.

I actually have a draft doc which I need to clean up and push somewhere with my thoughts on the failed state. I do think we need a status marker for non-terminal failure. These would be examples of non-terminal failure. If we could use Ready that way it would be very interesting, but without knowing for sure I suspect it would break assumptions. I think this is a CAPI discussion.

@mdbooth I had a look at CAPZ, CAPA, and CAPIBM, which do so too.
CAPG doesn't change the ready status.

From the CRD and cluster-api docs I can't really tell whether to see this as a one-way gate or not. Definitely a question for CAPI. Personally I'd like to use it for non-terminal failure.

@mdbooth
Contributor Author

mdbooth commented Jun 21, 2022

I'll create a new PR when I'm working on this again.

@mdbooth mdbooth closed this Jun 21, 2022
@mdbooth mdbooth deleted the bastion branch December 13, 2022 10:01