feat: Supports upper-layer modification of the InstanceSet's UpdateStrategy #7939
base: main
Conversation
// Partition are updated. All pods from ordinal Partition-1 to 0 remain untouched.
// This is helpful in being able to do a canary based deployment. The default value is 0.
// +optional
Partition *int32 `json:"partition,omitempty"`
Do we need `partition`? @free6om

After introducing instance templates, the naming of Pods managed by an InstanceSet follows this pattern: "comp-0, comp-1, comp-tpl0-0, comp-tpl1-0, comp-tpl1-1". Unlike before, the Pods no longer have a linear ordinal numbering scheme. This makes specifying a `partition` much more challenging.
Yes, I feel the same way. Currently, the `partition` feature allows Pods to be updated in a rolling fashion based on dictionary order from largest to smallest, but it seems there is no way to perform a rolling update on a specific template at the moment.
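To make the dictionary-order behaviour concrete, here is a minimal, hypothetical Go sketch (not the PR's implementation) of how a partition count could be applied to the non-linear pod names: sort all names, take only the names at index >= partition, and roll them largest first.

```go
package main

import (
	"fmt"
	"sort"
)

// selectForUpdate sorts all pod names in dictionary order and returns the
// names at index >= partition, largest first, i.e. the pods a partition-style
// rolling update would touch. Illustration only, not the PR's code.
func selectForUpdate(podNames []string, partition int) []string {
	sorted := append([]string(nil), podNames...)
	sort.Strings(sorted)
	if partition < 0 || partition > len(sorted) {
		return nil
	}
	toUpdate := sorted[partition:]
	// Roll from the largest name down, mirroring StatefulSet's largest-ordinal-first order.
	sort.Sort(sort.Reverse(sort.StringSlice(toUpdate)))
	return toUpdate
}

func main() {
	pods := []string{"comp-0", "comp-1", "comp-tpl0-0", "comp-tpl1-0", "comp-tpl1-1"}
	fmt.Println(selectForUpdate(pods, 3)) // [comp-tpl1-1 comp-tpl1-0]
}
```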
How do we do a `RollingUpdate` then?
// That means if there is any unavailable pod in the range 0 to Replicas-1,
// it will be counted towards MaxUnavailable.
// +optional
MaxUnavailable *intstr.IntOrString `json:"maxUnavailable,omitempty"`
The term 'maxUnavailable' is misleading. In the original StatefulSet, updates always involved restarting pods, and any update to a pod would cause it to become unavailable. Therefore, controlling the number of maxUnavailable pods represented the level of concurrency.
However, in InstanceSet, if we continue to use 'maxUnavailable', strictly speaking, all non-restarting updates (i.e., those that don't cause the pod to become unavailable) would not be controlled by this parameter.
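For reference, a small sketch of the StatefulSet-style semantics being discussed: resolve maxUnavailable (absolute value or percentage) against the replica count, then subtract pods that are already unavailable to get the number of pods that may still be restarted. The helper name and the clamp to at least one pod are assumptions for illustration, not the PR's code.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/intstr"
)

// updateBudget resolves maxUnavailable against the replica count and subtracts
// pods that are already unavailable, yielding how many more pods may be
// restarted in this reconciliation loop.
func updateBudget(maxUnavailable intstr.IntOrString, replicas, currentlyUnavailable int) (int, error) {
	allowed, err := intstr.GetScaledValueFromIntOrPercent(&maxUnavailable, replicas, false)
	if err != nil {
		return 0, err
	}
	if allowed < 1 {
		allowed = 1 // never block the rollout entirely (assumption)
	}
	budget := allowed - currentlyUnavailable
	if budget < 0 {
		budget = 0
	}
	return budget, nil
}

func main() {
	// 25% of 10 replicas allows 2 unavailable pods; 1 is already down, so 1 more may restart.
	budget, _ := updateBudget(intstr.FromString("25%"), 10, 1)
	fmt.Println(budget) // 1
}
```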
Why should we control the non-restarting updates, given that they take effect instantly once the Pods are updated? Doing all the non-restarting updates in one reconciliation loop seems no different from doing them across several reconciliation loops.
Because a 'non-restarting update' is still a change to the production environment, it should be possible to control it with a gradual upgrade strategy. For example, it is reasonable to let users apply changes to a small number of replicas first and, after verification, gradually roll them out to more replicas.

For instance, a 'non-restarting update' might involve parameter modifications, which could lead to issues such as performance degradation, or an IP whitelist change, which could inadvertently block legitimate traffic from applications.
To summarize, incorrect 'non-restarting updates' can potentially harm business continuity. Therefore, users may require a gradual update process. This is something we need to consider.
@weicao I think, for the 'non-restarting update', it seems that we don't have a way to automate the control. How about we use `partition` to manually control it instead?
> @weicao I think, for the 'non-restarting update', it seems that we don't have a way to automate the control. How about we use `partition` to manually control it instead?

When you have multiple instance templates, how do you plan to use partitions?
What I mean is, say if you have 3 templates and a total of 10 replicas, when you set the partition to 7 (upgrading the 8th, 9th, and 10th replicas), it becomes difficult to determine which templates these 8th, 9th, and 10th replicas belong to. This makes it challenging to verify whether the updated replicas meet the expected outcomes.
> What I mean is, say if you have 3 templates and a total of 10 replicas, when you set the partition to 7 (upgrading the 8th, 9th, and 10th replicas), it becomes difficult to determine which templates these 8th, 9th, and 10th replicas belong to. This makes it challenging to verify whether the updated replicas meet the expected outcomes.

@weicao We have also considered this issue. However, based on our current requirements, we do not need to specify a separate partition for gray upgrades in multiple templates. Nevertheless, I understand the need for multi-template partitioning, and we can discuss and design a solution for this.
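To make the ambiguity concrete, here is a hypothetical layout (the component name and the replica split across templates are invented for illustration): with 10 replicas and a partition of 7, the three highest-sorting names can straddle templates, so the partition value alone does not say which templates get canaried.

```go
package main

import (
	"fmt"
	"sort"
)

func main() {
	// Hypothetical InstanceSet "comp" with 10 replicas: 4 from the default
	// template, 4 from template "tpl0" and 2 from template "tpl1".
	pods := []string{
		"comp-0", "comp-1", "comp-2", "comp-3",
		"comp-tpl0-0", "comp-tpl0-1", "comp-tpl0-2", "comp-tpl0-3",
		"comp-tpl1-0", "comp-tpl1-1",
	}
	sort.Strings(pods)
	partition := 7
	// The three highest-sorting names span two templates, so it is hard to
	// verify which templates the updated replicas belong to.
	fmt.Println(pods[partition:]) // [comp-tpl0-3 comp-tpl1-0 comp-tpl1-1]
}
```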
Currently, our requirement is only for the partition to be able to perform a global update in alphabetical order.
//
// +kubebuilder:validation:Enum={Serial,BestEffortParallel,Parallel}
// +optional
MemberUpdateStrategy *MemberUpdateStrategy `json:"memberUpdateStrategy,omitempty"`
'memberUpdateStrategy' and 'maxUnavailable' are not orthogonal.
- When 'memberUpdateStrategy' is set to 'serial', 'maxUnavailable' has no effect, right?
- When 'memberUpdateStrategy' is set to 'bestEffortParallel', the concurrency is calculated based on quorum, so 'maxUnavailable' should not have an effect either, right?
- Therefore, does 'maxUnavailable' only take effect when 'memberUpdateStrategy' is set to 'parallel'?
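A sketch of the interplay described in these bullets; the quorum formula and the behaviour of maxUnavailable under each strategy are assumptions made for illustration, not KubeBlocks' implementation.

```go
package main

import (
	"fmt"

	"k8s.io/apimachinery/pkg/util/intstr"
)

type MemberUpdateStrategy string

const (
	SerialStrategy             MemberUpdateStrategy = "Serial"
	BestEffortParallelStrategy MemberUpdateStrategy = "BestEffortParallel"
	ParallelStrategy           MemberUpdateStrategy = "Parallel"
)

// concurrency returns how many pods may be updated at once under each
// strategy: Serial is always one, BestEffortParallel keeps a quorum alive,
// and only Parallel is bounded purely by maxUnavailable.
func concurrency(strategy MemberUpdateStrategy, replicas int, maxUnavailable *intstr.IntOrString) (int, error) {
	switch strategy {
	case SerialStrategy:
		return 1, nil // maxUnavailable has no effect
	case BestEffortParallelStrategy:
		quorum := replicas/2 + 1
		return replicas - quorum, nil // keep a majority available
	case ParallelStrategy:
		if maxUnavailable == nil {
			return replicas, nil
		}
		return intstr.GetScaledValueFromIntOrPercent(maxUnavailable, replicas, false)
	default:
		return 0, fmt.Errorf("unknown strategy %q", strategy)
	}
}

func main() {
	mu := intstr.FromString("40%")
	for _, s := range []MemberUpdateStrategy{SerialStrategy, BestEffortParallelStrategy, ParallelStrategy} {
		n, _ := concurrency(s, 5, &mu)
		fmt.Println(s, n) // Serial 1, BestEffortParallel 2, Parallel 2
	}
}
```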
Yes, but `MaxUnavailable` means that no more than the total number of Pods defined by `MaxUnavailable` should be unavailable. It's an upper bound.
So 'maxUnavailable' takes effect when 'memberUpdateStrategy' is set to either 'bestEffortParallel' or 'parallel'?
resolve #7913