machine pool surge, max unavailable and instance delete#1105
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here. DetailsNeeds approval from an approver in each of these files:Approvers can indicate their approval by writing |
4a05556 to
25ae039
Compare
bb0c14e to
07e5a09
Compare
| "manager", | ||
| cmd = 'mkdir -p .tiltbuild;CGO_ENABLED=0 GOOS=linux GOARCH=amd64 go build -ldflags \'-extldflags "-static"\' -o .tiltbuild/manager', | ||
| deps = ["api", "cloud", "config", "controllers", "exp", "feature", "pkg", "go.mod", "go.sum", "main.go"] | ||
| deps = ["api", "azure", "config", "controllers", "exp", "feature", "pkg", "go.mod", "go.sum", "main.go"] |
There was a problem hiding this comment.
I should probably pull this out into another PR, but what's 1 line on 4k+ 😞
| // NewCoalescingReconciler returns a reconcile wrapper that will delay new reconcile.Requests | ||
| // after the cache expiry of the request string key. | ||
| // A successful reconciliation is defined as as one where no error is returned | ||
| func NewCoalescingReconciler(upstream reconcile.Reconciler, cache *CoalescingRequestCache, log logr.Logger) reconcile.Reconciler { |
There was a problem hiding this comment.
This wraps the AzureMachinePool and AzureMachinePoolMachine reconcilers so that they debounce, the reconcilers rate limit the incoming events so they only do so many within a window of time to not overwhelm Azure API limits.
There is no good way to do this in controller-runtime. I'll intro a proposal over there for middleware or incoming rate limiting.
| // inform the controller that if the parent MachinePool.Spec.Template.Spec.Version field is updated, this image | ||
| // will be updated with the corresponding default image. If Defaulted is set to false, the controller will not | ||
| // update the image reference when the MachinePool.Spec.Template.Spec.Version changes. | ||
| AzureDefaultingImage struct { |
There was a problem hiding this comment.
Thought we should be explicit about image defaulting. This enables users to use default image versions for K8s versions specified on MachinePools while being safe to specify their own without updates to MachinePools overriding their image reference.
There was a problem hiding this comment.
This sounds like a good idea. Should these AzureDefaultingImage changes be part of a different PR instead?
| // 2) over-provisioned machines prioritized by out of date models first | ||
| // 3) over-provisioned ready machines | ||
| // 4) ready machines within budget which have out of date models | ||
| func (m *MachinePoolScope) selectMachinesToDelete(machinesByProviderID map[string]infrav1exp.AzureMachinePoolMachine) map[string]infrav1exp.AzureMachinePoolMachine { |
There was a problem hiding this comment.
This contains the logic for selecting machines to delete when over-provisioned or upgrading. The AzureMachinePool informs the AzureMachinePoolMachine to delete and the AzureMachinePoolMachine is expected to safely remove itself from the pool.
| return ampml.Items, nil | ||
| } | ||
|
|
||
| func (m *MachinePoolScope) applyAzureMachinePoolMachines(ctx context.Context) error { |
There was a problem hiding this comment.
This contains the logic to compare the state of Azure VMSS with the state of AzureMachinePool(Machine)s. If a machine exists in Azure, a AzureMachinePoolMachine is created. If an AzureMachinePoolMachine exists, but doesn't have an Azure counterpart, it is asked to delete. At the end, the upgrade / over-provision state is evaluated and machines can be deleted if the state requires.
|
@nader-ziada and @CecileRobertMichon I believe this PR is now ready for review. I'm so sorry of the enormity of the change set. With that in mind, I'm going to introduce One design aspect I've vacillated on is how to represent |
|
@devigned I've been thinking about this and I think we need a proposal to explain what this does, why, what future work is planned and what alternatives were considered (and why they don't work). This is a pretty big PR and it's not obvious why we're doing upgrade this way, so we should document it for the record. Also, it can serve as developer-facing documentation once the feature is merged. What do you think? |
nader-ziada
left a comment
There was a problem hiding this comment.
I went through the changes and it all makes sense, but I have a comment about the structure of the code, the main logic that figures out what machines to create/delete is in the scope instead of in the controller, even setting the conditions, which is different that how we have done things in other places, was this by design?
It was a conscious decision. It was a bit of an experiment to see how it would turn out.
That was a perceptive review. Thank you. WDYT? |
|
@devigned: PR needs rebase. DetailsInstructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
I don't feel strongly about it, but I expected the controller / reconciler to be responsible for also updating the status on the k8s resources, which would include setting the condition. |
|
|
||
| // Reconcile idempotently gets, creates, and updates a scale set. | ||
| func (s *Service) Reconcile(ctx context.Context) error { | ||
| ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Reconcile") |
There was a problem hiding this comment.
| ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Reconcile") | |
| ctx, span := tele.Tracer().Start(ctx, "scalesetvms.Service.Reconcile") |
| } | ||
| } | ||
|
|
||
| // Reconcile idempotently gets, creates, and updates a scale set. |
There was a problem hiding this comment.
This needs to be changed. Maybe something like "Reconcile fetches the latest data about the scaleset instance".
| ) | ||
|
|
||
| type ( | ||
| // ScaleSetVMScope defines the scope interface for a scale sets service. |
There was a problem hiding this comment.
| // ScaleSetVMScope defines the scope interface for a scale sets service. | |
| // ScaleSetVMScope defines the scope interface for a scaleset vms service. |
|
|
||
| // Delete deletes a scaleset instance asynchronously returning a future which encapsulates the long running operation. | ||
| func (s *Service) Delete(ctx context.Context) error { | ||
| ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Delete") |
There was a problem hiding this comment.
| ctx, span := tele.Tracer().Start(ctx, "scalesets.Service.Delete") | |
| ctx, span := tele.Tracer().Start(ctx, "scalesetvms.Service.Delete") |
| } | ||
| } | ||
|
|
||
| // AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to it's child AzureMachinePoolMachines through |
There was a problem hiding this comment.
| // AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to it's child AzureMachinePoolMachines through | |
| // AzureMachinePoolToAzureMachinePoolMachines maps an AzureMachinePool to its child AzureMachinePoolMachines through |
| // Defaulted informs the controller that the image reference was defaulted so that it can be updated by changes | ||
| // to the MachinePool.Spec.Template.Spec.Version field by default. | ||
| // +optional | ||
| Defaulted bool `json:"defaulted,omitempty"` |
There was a problem hiding this comment.
In this case Defaulted means not only that the value was defaulted but that the Image will be managed by the controller. What do you think about calling this field Managed instead? IMHO the word better conveys the behavior. I believe managed is also how we refer to resources managed by the controllers in the code base.
| c, err := ctrl.NewControllerManagedBy(mgr). | ||
| WithOptions(options.Options). | ||
| For(&infrav1exp.AzureMachinePoolMachine{}). | ||
| WithEventFilter(predicates.ResourceNotPaused(log)). // don't queue reconcile if resource is paused |
There was a problem hiding this comment.
| WithEventFilter(predicates.ResourceNotPaused(log)). // don't queue reconcile if resource is paused | |
| WithEventFilter(predicates.ResourceNotPausedAndHasFilterLabel(ctrl.LoggerFrom(ctx), r.WatchFilterValue)). |
| func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration) *AzureMachinePoolMachineController { | ||
| return &AzureMachinePoolMachineController{ | ||
| Client: c, | ||
| Log: log, | ||
| Recorder: recorder, | ||
| ReconcileTimeout: reconcileTimeout, | ||
| reconcilerFactory: newAzureMachinePoolMachineReconciler, | ||
| } | ||
| } |
There was a problem hiding this comment.
| func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration) *AzureMachinePoolMachineController { | |
| return &AzureMachinePoolMachineController{ | |
| Client: c, | |
| Log: log, | |
| Recorder: recorder, | |
| ReconcileTimeout: reconcileTimeout, | |
| reconcilerFactory: newAzureMachinePoolMachineReconciler, | |
| } | |
| } | |
| func NewAzureMachinePoolMachineController(c client.Client, log logr.Logger, recorder record.EventRecorder, reconcileTimeout time.Duration, watchFilterValue string) *AzureMachinePoolMachineController { | |
| return &AzureMachinePoolMachineController{ | |
| Client: c, | |
| Log: log, | |
| Recorder: recorder, | |
| ReconcileTimeout: reconcileTimeout, | |
| reconcilerFactory: newAzureMachinePoolMachineReconciler, | |
| WatchFilterValue: watchFilterValue, | |
| } | |
| } |
| Log logr.Logger | ||
| Scheme *runtime.Scheme | ||
| Recorder record.EventRecorder | ||
| ReconcileTimeout time.Duration |
There was a problem hiding this comment.
| ReconcileTimeout time.Duration | |
| ReconcileTimeout time.Duration | |
| WatchFilterValue string |
| machineScope.SetFailureReason(capierrors.UpdateMachineError) | ||
| machineScope.SetFailureMessage(errors.Errorf("Azure VM state is %s", state)) | ||
| case infrav1.VMStateDeleting: | ||
| // for some reason, the |
There was a problem hiding this comment.
seems like the comment got truncated
| location: | ||
| description: Location is the Azure region location e.g. westus2 | ||
| type: string | ||
| maxSurge: |
There was a problem hiding this comment.
The maxSurge field in the k8s Deployment object accepts an absolute number but also a percentage. Should we support percentage as well? I believe it improves the UX as the user doesn't have to adapt the maxSurge value when scaling up/down their VMSS. If we decide to support percentages, that would mean changing the type of the field.
There was a problem hiding this comment.
| return vm, nil | ||
| } | ||
|
|
||
| // DeleteAsync is the operation to delete a virtual machine scale set asynchronously. DeleteAsync sends a DELETE |
There was a problem hiding this comment.
| // DeleteAsync is the operation to delete a virtual machine scale set asynchronously. DeleteAsync sends a DELETE | |
| // DeleteAsync is the operation to delete a virtual machine scale set vm asynchronously. DeleteAsync sends a DELETE |
| // | ||
| // Parameters: | ||
| // resourceGroupName - the name of the resource group. | ||
| // vmssName - the name of the VM scale set to create or update. parameters - the scale set object. |
There was a problem hiding this comment.
| // vmssName - the name of the VM scale set to create or update. parameters - the scale set object. | |
| // vmssName - the name of the VM scale set to create or update. parameters - the scale set object. | |
| // instanceID - the ID of the VM scale set VM. |
| // resourceGroupName - the name of the resource group. | ||
| // vmssName - the name of the VM scale set to create or update. parameters - the scale set object. | ||
| func (ac *azureClient) DeleteAsync(ctx context.Context, resourceGroupName, vmssName, instanceID string) (*infrav1.Future, error) { | ||
| ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.DeleteAsync") |
There was a problem hiding this comment.
| ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.DeleteAsync") | |
| ctx, span := tele.Tracer().Start(ctx, "scalesetvms.AzureClient.DeleteAsync") |
|
|
||
| // Get retrieves the Virtual Machine Scale Set Virtual Machine | ||
| func (ac *azureClient) Get(ctx context.Context, resourceGroupName, vmssName, instanceID string) (compute.VirtualMachineScaleSetVM, error) { | ||
| ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.Get") |
There was a problem hiding this comment.
| ctx, span := tele.Tracer().Start(ctx, "scalesets.AzureClient.Get") | |
| ctx, span := tele.Tracer().Start(ctx, "scalesetvms.AzureClient.Get") |
|
I'm going to close this PR and open a new PR with an updated branch based off or the proposal in #1191. I think it will help us to apply fresh eyes and a clean slate. /close |
|
@devigned: Closed this PR. DetailsIn response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
This PR introduces API changes to AzureMachinePool to enable users of CAPZ to perform safe, fast rolling upgrades building off of #1067. In this change set
MaxSurgeandMaxUnavailablefields are introduced to theAzureMachinePoolspec.MaxSurgeenables machine pools to over-provision machines, increase the number of machines above the desired count, during an upgrade, which would allow faster upgrades at the cost of Azure compute core quota.MaxUnavailableenables one to specify how many nodes must be available during a rolling upgrade.Instance deleteenables an individual to delete Azure Machine Pool Machines individually and controller initiated node drain. As part of the instance delete / node drain state tracking forAzureMachinePoolMachines, it is advantageous to track state on these resources individually. That is why in the PR, theAzureMachinePool.Status.Instancesarray is removed in favor ofAzureMachinePoolMachineresources.AzureMachinePoolinstances into individualAzureMachinePoolMachineresources will be a breaking change in the experimental API.This PR is work in progress. Please provide early feedback.
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)format, will close the issue(s) when PR gets merged):Fixes #819
Special notes for your reviewer:
Please confirm that if this PR changes any image versions, then that's the sole change this PR makes.
TODOs:
Release note: