machine pool surge, max unavailable and instance delete by devigned · Pull Request #1105 · kubernetes-sigs/cluster-api-provider-azure

devigned · 2021-01-06T15:21:44Z

What type of PR is this?
/kind feature

What this PR does / why we need it:
This PR introduces API changes to AzureMachinePool to enable users of CAPZ to perform safe, fast rolling upgrades building off of #1067. In this change set MaxSurge and MaxUnavailable fields are introduced to the AzureMachinePool spec.

MaxSurge lets a machine pool over-provision during an upgrade, that is, temporarily run more machines than the desired count. This allows faster upgrades at the cost of Azure compute core quota.
MaxUnavailable specifies how many nodes may be unavailable during a rolling upgrade.
Instance delete lets a user delete individual Azure Machine Pool Machines and enables controller-initiated node drain. To support instance delete / node drain, it is advantageous to track state on each AzureMachinePoolMachine individually. That is why this PR removes the AzureMachinePool.Status.Instances array in favor of AzureMachinePoolMachine resources.
Node drain will be completed in a subsequent PR.
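
As a rough illustration of how the two fields bound a rolling upgrade, here is a minimal sketch. The `rolloutBudget` type and `budgetFor` function are hypothetical names invented for this example and are not part of the CAPZ API; the sketch also assumes both fields are absolute counts.

```go
package main

import "fmt"

// rolloutBudget is a hypothetical helper type, not part of the CAPZ API.
type rolloutBudget struct {
	MaxTotal int // most machines that may exist at once during the upgrade
	MinReady int // fewest machines that must stay ready during the upgrade
}

// budgetFor derives both bounds from the desired replica count.
// MaxSurge raises the ceiling; MaxUnavailable lowers the floor.
func budgetFor(desired, maxSurge, maxUnavailable int) rolloutBudget {
	minReady := desired - maxUnavailable
	if minReady < 0 {
		minReady = 0 // the floor cannot be negative
	}
	return rolloutBudget{MaxTotal: desired + maxSurge, MinReady: minReady}
}

func main() {
	b := budgetFor(5, 2, 1)
	fmt.Printf("max total: %d, min ready: %d\n", b.MaxTotal, b.MinReady)
}
```

With 5 desired machines, a surge of 2 and 1 unavailable, the pool may grow to 7 machines while at least 4 stay ready.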

⚠️ Breaking out AzureMachinePool instances into individual AzureMachinePoolMachine resources will be a breaking change in the experimental API.

This PR is work in progress. Please provide early feedback.

Which issue(s) this PR fixes (optional, in fixes #<issue number>(, fixes #<issue_number>, ...) format, will close the issue(s) when PR gets merged):
Fixes #819

Special notes for your reviewer:

Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.

TODOs:

squashed commits
includes documentation
adds unit tests

Release note:

Introduce maxUnavailable and maxSurge to `AzureMachinePool` and remove `AzureMachinePool.Status.Instances` in favor of representing machine pool machines as individual resources of type `AzureMachinePoolMachine`. This is a breaking change to the experimental API.

k8s-ci-robot · 2021-01-06T15:22:07Z

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
To complete the pull request process, please assign devigned after the PR has been reviewed.
You can assign the PR to them by writing /assign @devigned in a comment when ready.

The full list of commands accepted by this bot can be found here.

Details

Needs approval from an approver in each of these files:

OWNERS

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

devigned · 2021-02-12T14:15:31Z

        "manager",
        cmd = 'mkdir -p .tiltbuild;CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags \'-extldflags "-static"\' -o .tiltbuild/manager',
-        deps = ["api", "cloud", "config", "controllers", "exp", "feature", "pkg", "go.mod", "go.sum", "main.go"]
+        deps = ["api", "azure", "config", "controllers", "exp", "feature", "pkg", "go.mod", "go.sum", "main.go"]


I should probably pull this out into another PR, but what's 1 line on 4k+ 😞

devigned · 2021-02-12T14:22:57Z

+// NewCoalescingReconciler returns a reconcile wrapper that will delay new reconcile.Requests
+// after the cache expiry of the request string key.
+// A successful reconciliation is defined as one where no error is returned
+func NewCoalescingReconciler(upstream reconcile.Reconciler, cache *CoalescingRequestCache, log logr.Logger) reconcile.Reconciler {


This wraps the AzureMachinePool and AzureMachinePoolMachine reconcilers so that they debounce: the wrapped reconcilers rate limit incoming events, handling only so many within a window of time so as not to overwhelm Azure API limits.

There is no good way to do this in controller-runtime. I'll open a proposal over there for middleware or incoming rate limiting.

devigned · 2021-02-12T14:25:36Z

+	// inform the controller that if the parent MachinePool.Spec.Template.Spec.Version field is updated, this image
+	// will be updated with the corresponding default image. If Defaulted is set to false, the controller will not
+	// update the image reference when the MachinePool.Spec.Template.Spec.Version changes.
+	AzureDefaultingImage struct {


Thought we should be explicit about image defaulting. This enables users to use default image versions for K8s versions specified on MachinePools while being safe to specify their own without updates to MachinePools overriding their image reference.

This sounds like a good idea. Should these AzureDefaultingImage changes be part of a different PR instead?
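
To make the defaulting behavior concrete, here is a small sketch. The `imageSpec` type, the `defaultImageFor` lookup and the image names are all invented for illustration; only the `Defaulted` flag and its meaning come from the diff above.

```go
package main

import "fmt"

// imageSpec is a hypothetical, trimmed-down stand-in for the image reference.
type imageSpec struct {
	Reference string
	Defaulted bool // true when the controller chose the image
}

// defaultImageFor is a made-up mapping from Kubernetes version to image name.
func defaultImageFor(version string) string {
	return "capi-ubuntu-" + version
}

// reconcileImage applies the rule described above: a defaulted image follows
// the MachinePool version, a user-supplied image is never overwritten.
func reconcileImage(img imageSpec, version string) imageSpec {
	if img.Reference == "" {
		return imageSpec{Reference: defaultImageFor(version), Defaulted: true}
	}
	if img.Defaulted {
		img.Reference = defaultImageFor(version)
	}
	return img
}

func main() {
	img := reconcileImage(imageSpec{}, "v1.20.2") // no image given: defaulted
	img = reconcileImage(img, "v1.21.0")          // version bump: image follows
	fmt.Println(img.Reference, img.Defaulted)

	custom := reconcileImage(imageSpec{Reference: "my-image"}, "v1.21.0")
	fmt.Println(custom.Reference, custom.Defaulted) // user image is left alone
}
```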

devigned · 2021-02-12T14:35:53Z

+// 2) over-provisioned machines prioritized by out of date models first
+// 3) over-provisioned ready machines
+// 4) ready machines within budget which have out of date models
+func (m *MachinePoolScope) selectMachinesToDelete(machinesByProviderID map[string]infrav1exp.AzureMachinePoolMachine) map[string]infrav1exp.AzureMachinePoolMachine {


This contains the logic for selecting machines to delete when over-provisioned or upgrading. The AzureMachinePool informs the AzureMachinePoolMachine to delete and the AzureMachinePoolMachine is expected to safely remove itself from the pool.
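
A simplified sketch of this kind of selection follows. It is not the PR's `selectMachinesToDelete`: the `machine` type and `selectToDelete` function are hypothetical, the sketch covers only the priorities visible in the quoted comment, and it assumes an absolute MaxUnavailable.

```go
package main

import (
	"fmt"
	"sort"
)

// machine is a hypothetical summary of an AzureMachinePoolMachine.
type machine struct {
	ID          string
	Ready       bool
	LatestModel bool
}

func ids(ms []machine) []string {
	out := make([]string, 0, len(ms))
	for _, m := range ms {
		out = append(out, m.ID)
	}
	return out
}

// selectToDelete picks machines to remove for one reconcile pass.
func selectToDelete(machines []machine, desired, maxUnavailable int) []string {
	// Sort a copy so that machines with out-of-date models come first.
	sorted := append([]machine(nil), machines...)
	sort.SliceStable(sorted, func(i, j int) bool {
		return !sorted[i].LatestModel && sorted[j].LatestModel
	})

	// Over-provisioned: trim the surplus, out-of-date models first.
	if surplus := len(sorted) - desired; surplus > 0 {
		return ids(sorted[:surplus])
	}

	// Within budget: remove ready, out-of-date machines only while at least
	// desired-maxUnavailable machines would remain ready.
	ready := 0
	for _, m := range sorted {
		if m.Ready {
			ready++
		}
	}
	allowed := ready - (desired - maxUnavailable)
	var picked []machine
	for _, m := range sorted {
		if allowed <= 0 {
			break
		}
		if m.Ready && !m.LatestModel {
			picked = append(picked, m)
			allowed--
		}
	}
	return ids(picked)
}

func main() {
	pool := []machine{
		{ID: "vm-0", Ready: true, LatestModel: false},
		{ID: "vm-1", Ready: true, LatestModel: true},
		{ID: "vm-2", Ready: true, LatestModel: false},
	}
	fmt.Println(selectToDelete(pool, 2, 1)) // over-provisioned by one
	fmt.Println(selectToDelete(pool, 3, 1)) // at desired count, upgrading
}
```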

devigned · 2021-02-12T14:38:48Z

+	return ampml.Items, nil
+}
+
+func (m *MachinePoolScope) applyAzureMachinePoolMachines(ctx context.Context) error {


This contains the logic to compare the state of the Azure VMSS with the state of the AzureMachinePool(Machine)s. If a machine exists in Azure, an AzureMachinePoolMachine is created. If an AzureMachinePoolMachine exists but doesn't have an Azure counterpart, it is asked to delete itself. At the end, the upgrade / over-provision state is evaluated and machines can be deleted if the state requires it.
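
The comparison step amounts to a two-way set difference keyed by provider ID. A minimal sketch, assuming provider IDs are plain strings; `diffInstances` is a hypothetical name, not a function from the PR:

```go
package main

import "fmt"

// diffInstances compares the provider IDs reported by Azure with the provider
// IDs that already have an AzureMachinePoolMachine resource.
func diffInstances(azureIDs, resourceIDs []string) (toCreate, toDelete []string) {
	inAzure := make(map[string]bool, len(azureIDs))
	for _, id := range azureIDs {
		inAzure[id] = true
	}
	hasResource := make(map[string]bool, len(resourceIDs))
	for _, id := range resourceIDs {
		hasResource[id] = true
	}
	// In Azure without a resource: create the resource.
	for _, id := range azureIDs {
		if !hasResource[id] {
			toCreate = append(toCreate, id)
		}
	}
	// Resource without an Azure counterpart: ask it to delete itself.
	for _, id := range resourceIDs {
		if !inAzure[id] {
			toDelete = append(toDelete, id)
		}
	}
	return toCreate, toDelete
}

func main() {
	create, del := diffInstances(
		[]string{"vm-0", "vm-1", "vm-2"},
		[]string{"vm-1", "vm-3"},
	)
	fmt.Println("create:", create, "delete:", del)
}
```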

devigned · 2021-02-14T14:34:09Z

@nader-ziada and @CecileRobertMichon I believe this PR is now ready for review. I'm so sorry for the enormity of the change set. With that in mind, I'm going to introduce AzureMachinePoolMachine cordon and drain functionality in a subsequent PR.

One design aspect I've vacillated on is how to represent MaxSurge and MaxUnavailable. Right now, they are on the top level spec for AzureMachinePool. Would it be better to use a rollout strategy structure similar to (or the same type as) kubernetes-sigs/cluster-api#4073?

CecileRobertMichon · 2021-02-17T21:40:10Z

@devigned I've been thinking about this and I think we need a proposal to explain what this does, why, what future work is planned and what alternatives were considered (and why they don't work). This is a pretty big PR and it's not obvious why we're doing upgrade this way, so we should document it for the record.

Also, it can serve as developer-facing documentation once the feature is merged. What do you think?

nader-ziada

I went through the changes and it all makes sense, but I have a comment about the structure of the code. The main logic that figures out which machines to create/delete is in the scope instead of in the controller, even setting the conditions, which is different from how we have done things in other places. Was this by design?

devigned · 2021-02-22T15:20:18Z

I went through the changes and it all makes sense, but I have a comment about the structure of the code. The main logic that figures out which machines to create/delete is in the scope instead of in the controller, even setting the conditions, which is different from how we have done things in other places. Was this by design?

It was a conscious decision. It was a bit of an experiment to see how it would turn out.

Scope should be responsible for updating the K8s state upon closing the scope.
Controller / Reconciler is responsible for creating the Scope, calling services, responding to errors in reconciliation, and closing the Scope.
Services are responsible for manipulating external state and updating the Scope with the external state.

That was a perceptive review. Thank you. WDYT?
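
The division of responsibilities listed above can be sketched as follows. Every type here (`scope`, `service`, `fakeVMSS`) is a hypothetical stand-in; the real scopes patch Kubernetes objects on close and the real services call Azure.

```go
package main

import "fmt"

// scope holds observed state and persists it when closed.
type scope struct {
	observed []string
	patched  bool
}

func (s *scope) SetObserved(ids []string) { s.observed = ids }

// Close is where the real scope would patch the Kubernetes resources.
func (s *scope) Close() error {
	s.patched = true
	return nil
}

// service manipulates external state and reports it back to the scope.
type service interface {
	Reconcile(s *scope) error
}

type fakeVMSS struct{}

func (fakeVMSS) Reconcile(s *scope) error {
	s.SetObserved([]string{"vm-0"}) // pretend this came from Azure
	return nil
}

// reconcile plays the controller: create the scope, call the services,
// respond to errors, and always close the scope.
func reconcile(services ...service) (s *scope, err error) {
	s = &scope{}
	defer func() {
		if cerr := s.Close(); cerr != nil && err == nil {
			err = cerr
		}
	}()
	for _, svc := range services {
		if err = svc.Reconcile(s); err != nil {
			return s, err
		}
	}
	return s, nil
}

func main() {
	s, err := reconcile(fakeVMSS{})
	fmt.Println(s.observed, s.patched, err)
}
```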

k8s-ci-robot · 2021-02-22T16:46:34Z

@devigned: PR needs rebase.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

nader-ziada · 2021-02-22T21:30:43Z

Scope should be responsible for updating the K8s state upon closing the scope.

Controller / Reconciler is responsible for creating the Scope, calling services, responding to errors in reconciliation, and closing the Scope.

Services are responsible for manipulating external state and updating the Scope with the external state.

That was a perceptive review. Thank you. WDYT?

I don't feel strongly about it, but I expected the controller / reconciler to be responsible for also updating the status on the k8s resources, which would include setting the condition.

fiunchinho · 2021-03-05T11:10:21Z

+
+// Reconcile idempotently gets, creates, and updates a scale set.
+func (s *Service) Reconcile(ctx context.Context) error {
+	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Reconcile")


Suggested change

-	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Reconcile")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.Service.Reconcile")

fiunchinho · 2021-03-05T11:17:30Z

+	}
+}
+
+// Reconcile idempotently gets, creates, and updates a scale set.


This needs to be changed. Maybe something like "Reconcile fetches the latest data about the scaleset instance".

fiunchinho · 2021-03-05T11:17:56Z

+)
+
+type (
+	// ScaleSetVMScope defines the scope interface for a scale sets service.


Suggested change

-	// ScaleSetVMScope defines the scope interface for a scale sets service.
+	// ScaleSetVMScope defines the scope interface for a scaleset vms service.

fiunchinho · 2021-03-05T11:18:22Z

+
+// Delete deletes a scaleset instance asynchronously returning a future which encapsulates the long running operation.
+func (s *Service) Delete(ctx context.Context) error {
+	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Delete")


Suggested change

-	ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Delete")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.Service.Delete")

fiunchinho · 2021-03-05T11:44:17Z

 	}
 }
+
+// AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to it's child AzureMachinePoolMachines through


Suggested change

-// AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to it's child AzureMachinePoolMachines through
+// AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to its child AzureMachinePoolMachines through

fiunchinho · 2021-03-05T11:51:46Z

+		// Defaulted informs the controller that the image reference was defaulted so that it can be updated by changes
+		// to the MachinePool.Spec.Template.Spec.Version field by default.
+		// +optional
+		Defaulted bool `json:"defaulted,omitempty"`


In this case Defaulted means not only that the value was defaulted but that the Image will be managed by the controller. What do you think about calling this field Managed instead? IMHO the word better conveys the behavior. I believe managed is also how we refer to resources managed by the controllers in the code base.

fiunchinho · 2021-03-05T12:00:43Z

+	c, err := ctrl.NewControllerManagedBy(mgr).
+		WithOptions(options.Options).
+		For(&infrav1exp.AzureMachinePoolMachine{}).
+		WithEventFilter(predicates.ResourceNotPaused(log)). // don't queue reconcile if resource is paused


Suggested change

-		WithEventFilter(predicates.ResourceNotPaused(log)). // don't queue reconcile if resource is paused
+		WithEventFilter(predicates.ResourceNotPausedAndHasFilterLabel(ctrl.LoggerFrom(ctx), r.WatchFilterValue)).

fiunchinho · 2021-03-05T12:02:25Z

+func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration) *AzureMachinePoolMachineController {
+	return &AzureMachinePoolMachineController{
+		Client:            c,
+		Log:               log,
+		Recorder:          recorder,
+		ReconcileTimeout:  reconcileTimeout,
+		reconcilerFactory: newAzureMachinePoolMachineReconciler,
+	}
+}


Suggested change

-func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration) *AzureMachinePoolMachineController {
+func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration, watchFilterValue string) *AzureMachinePoolMachineController {
 	return &AzureMachinePoolMachineController{
 		Client:            c,
 		Log:               log,
 		Recorder:          recorder,
 		ReconcileTimeout:  reconcileTimeout,
 		reconcilerFactory: newAzureMachinePoolMachineReconciler,
+		WatchFilterValue:  watchFilterValue,
 	}
 }

fiunchinho · 2021-03-05T12:03:11Z

+		Log               logr.Logger
+		Scheme            *runtime.Scheme
+		Recorder          record.EventRecorder
+		ReconcileTimeout  time.Duration


Suggested change

 		ReconcileTimeout  time.Duration
+		WatchFilterValue  string

fiunchinho · 2021-03-05T12:09:22Z

+		machineScope.SetFailureReason(capierrors.UpdateMachineError)
+		machineScope.SetFailureMessage(errors.Errorf("Azure VM state is %s", state))
+	case infrav1.VMStateDeleting:
+		// for some reason, the


seems like the comment got truncated

fiunchinho · 2021-03-05T14:49:50Z

              location:
                description: Location is the Azure region location e.g. westus2
                type: string
+              maxSurge:


The maxSurge field in the k8s Deployment object accepts an absolute number but also a percentage. Should we support percentage as well? I believe it improves the UX as the user doesn't have to adapt the maxSurge value when scaling up/down their VMSS. If we decide to support percentages, that would mean changing the type of the field.

Oh actually the proposal talks about both absolute numbers and percentages.
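
For reference, resolving an absolute-or-percentage value against the current replica count could look like the sketch below. The `resolve` function is hypothetical and uses plain strings for simplicity. Kubernetes Deployments round a percentage maxSurge up and a percentage maxUnavailable down, and the `roundUp` flag mirrors that; the apimachinery `intstr` package offers `GetScaledValueFromIntOrPercent` for the same purpose on `IntOrString` values.

```go
package main

import (
	"fmt"
	"math"
	"strconv"
	"strings"
)

// resolve converts a value such as "3" or "25%" into an absolute machine
// count for a pool with the given number of replicas.
func resolve(value string, replicas int, roundUp bool) (int, error) {
	if !strings.HasSuffix(value, "%") {
		return strconv.Atoi(value) // plain absolute number
	}
	pct, err := strconv.Atoi(strings.TrimSuffix(value, "%"))
	if err != nil {
		return 0, fmt.Errorf("invalid percentage %q: %w", value, err)
	}
	scaled := float64(pct) * float64(replicas) / 100
	if roundUp {
		return int(math.Ceil(scaled)), nil
	}
	return int(math.Floor(scaled)), nil
}

func main() {
	surge, _ := resolve("25%", 10, true)        // surge rounds up
	unavailable, _ := resolve("25%", 10, false) // unavailable rounds down
	fmt.Println(surge, unavailable)
}
```

With a percentage, the resolved value tracks the replica count, so users do not have to adjust it when scaling the VMSS up or down.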

fiunchinho · 2021-03-05T15:32:20Z

+	return vm, nil
+}
+
+// DeleteAsync is the operation to delete a virtual machine scale set asynchronously. DeleteAsync sends a DELETE


Suggested change

-// DeleteAsync is the operation to delete a virtual machine scale set asynchronously. DeleteAsync sends a DELETE
+// DeleteAsync is the operation to delete a virtual machine scale set vm asynchronously. DeleteAsync sends a DELETE

fiunchinho · 2021-03-05T15:33:20Z

+//
+// Parameters:
+//   resourceGroupName - the name of the resource group.
+//   vmssName - the name of the VM scale set to create or update. parameters - the scale set object.


Suggested change

 //   vmssName - the name of the VM scale set to create or update. parameters - the scale set object.
+//   instanceID - the ID of the VM scale set VM.

fiunchinho · 2021-03-05T15:33:34Z

+//   resourceGroupName - the name of the resource group.
+//   vmssName - the name of the VM scale set to create or update. parameters - the scale set object.
+func (ac *azureClient) DeleteAsync(ctx context.Context, resourceGroupName, vmssName, instanceID string) (*infrav1.Future, error) {
+	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.DeleteAsync")


Suggested change

-	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.DeleteAsync")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.AzureClient.DeleteAsync")

fiunchinho · 2021-03-05T15:35:07Z

+
+// Get retrieves the Virtual Machine Scale Set Virtual Machine
+func (ac *azureClient) Get(ctx context.Context, resourceGroupName, vmssName, instanceID string) (compute.VirtualMachineScaleSetVM, error) {
+	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.Get")


Suggested change

-	ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.Get")
+	ctx, span := tele.Tracer().Start(ctx, "scalesetvms.AzureClient.Get")

devigned · 2021-03-27T11:26:05Z

I'm going to close this PR and open a new PR with an updated branch based off of the proposal in #1191. I think it will help us to apply fresh eyes and a clean slate.

/close

k8s-ci-robot · 2021-03-27T11:26:11Z

@devigned: Closed this PR.

Details

In response to this:

I'm going to close this PR and open a new PR with an updated branch based off of the proposal in #1191. I think it will help us to apply fresh eyes and a clean slate.

/close


k8s-ci-robot requested review from awesomenix and juan-lee January 6, 2021 15:22

k8s-ci-robot added area/provider/azure Issues or PRs related to azure provider sig/cluster-lifecycle Categorizes an issue or PR as relevant to SIG Cluster Lifecycle. labels Jan 6, 2021

devigned changed the title ~~[WIP] machine pool surge, max unavailable and instance delete~~ [WIP] ⚠️ machine pool surge, max unavailable and instance delete Jan 6, 2021

devigned changed the title ~~[WIP] ⚠️ machine pool surge, max unavailable and instance delete~~ [WIP] machine pool surge, max unavailable and instance delete Jan 6, 2021

devigned force-pushed the surge branch from f7f74f7 to d291b36 Compare January 23, 2021 17:21

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Jan 23, 2021

devigned force-pushed the surge branch from d291b36 to 54ba516 Compare February 3, 2021 16:27

k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 4, 2021

devigned force-pushed the surge branch from ab69c3b to aa1e0cd Compare February 4, 2021 21:06

k8s-ci-robot removed the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 4, 2021

devigned force-pushed the surge branch 4 times, most recently from 4a05556 to 25ae039 Compare February 4, 2021 22:32

devigned force-pushed the surge branch 2 times, most recently from bb0c14e to 07e5a09 Compare February 12, 2021 14:02

devigned commented Feb 12, 2021


devigned mentioned this pull request Feb 17, 2021

Windows VMSS E2E flakes #1182

Closed

nader-ziada reviewed Feb 19, 2021


k8s-ci-robot added the needs-rebase Indicates a PR cannot be merged because it has merge conflicts with HEAD. label Feb 22, 2021

devigned mentioned this pull request Feb 23, 2021

add azure machine pool machine proposal #1191

Merged

3 tasks

fiunchinho reviewed Mar 5, 2021


CecileRobertMichon mentioned this pull request Mar 5, 2021

🐛 Ensure VM and VMSS extensions are applied once #1217

Merged

3 tasks

k8s-ci-robot closed this Mar 27, 2021

devigned mentioned this pull request Apr 22, 2021

add Azure machine pool rolling upgrades with MaxSurge, MaxUnavailable and DeletePolicy #1332

Merged

3 tasks

devigned deleted the surge branch June 10, 2021 15:38

5 participants
