@@ -508,8 +508,7 @@ enhancement:
CRI or CNI may require updating that component before the kubelet.
-->

If the kube-apiserver is at version `V+1` (where `V` is the first Kubernetes version that supports this feature) and the kube-scheduler is at version `V`,
the scheduler will simply ignore `NodeInclusionPolicy`, with no side effects.
Since kube-scheduler generally runs the same version as kube-apiserver, no dedicated version skew strategy is needed.

## Production Readiness Review Questionnaire

@@ -620,7 +619,79 @@ Longer term, we may want to require automated upgrade/rollback tests, but we
are missing a bunch of machinery and tooling and can't do that now.
-->

Not yet, but it will be tested manually prior to upgrade, following the steps below:
**Member:** The scenario LGTM - please update the PRR once you run it (fine to do this after feature freeze, but please do that before graduating the feature to beta in k/k).

**Member Author:** Then let's merge this PR first, as I may be away for several days. I'll update this section afterwards.

**Member:** SG


1. Install a Kubernetes v1.24 cluster with two worker nodes via an installation tool like Kind (see the shell sketch after this list).
2. Name these nodes node1 and node2; both are labelled with the key `kubernetes.io/hostname`.
3. Add a taint to node1, e.g. `foo=bar:NoSchedule`.
4. Apply a Deployment like the following:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      foo: bar
  template:
    metadata:
      labels:
        foo: bar
    spec:
      restartPolicy: Always
      containers:
      - name: nginx
        image: nginx:1.14.2
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        labelSelector:
          matchLabels:
            foo: bar
```

5. One pod will be Pending: with the feature disabled, the tainted node1 still counts for spreading, so `maxSkew: 1` forces one pod onto node1, where the taint blocks it.
6. Delete the Deployment via `kubectl delete -f`.
7. Configure the api-server with the feature gate `NodeInclusionPolicyInPodTopologySpread` enabled.
8. Redeploy the Deployment with `nodeTaintsPolicy: Honor`:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      foo: bar
  template:
    metadata:
      labels:
        foo: bar
    spec:
      restartPolicy: Always
      containers:
      - name: nginx
        image: nginx:1.14.2
      topologySpreadConstraints:
      - maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
        nodeTaintsPolicy: Honor
        labelSelector:
          matchLabels:
            foo: bar
```

9. Both pods will be scheduled successfully, since the tainted node1 is now excluded from the skew calculation.
10. Delete the Deployment.
11. Disable the feature gate and restart the api-server.
12. Apply the Deployment a third time; one pod will be Pending again.
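
For reference, here is a minimal shell sketch of steps 1, 3, and 4-7, assuming Kind is used. The cluster config, image tag, and node names are illustrative assumptions (Kind names its workers `kind-worker`/`kind-worker2` rather than `node1`/`node2`), and the feature gate can only be enabled on a node image that actually ships this feature:

```sh
# Steps 1 and 7 (sketch): create a two-worker cluster. The featureGates
# stanza covers step 7; omit it for the first run (steps 1-6), then
# recreate the cluster with it enabled.
cat <<EOF | kind create cluster --image kindest/node:v1.24.0 --config=-
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
featureGates:
  NodeInclusionPolicyInPodTopologySpread: true
nodes:
- role: control-plane
- role: worker   # "node1" in the steps above
- role: worker   # "node2" in the steps above
EOF

# Step 3: taint node1 (substitute the actual node name, e.g. kind-worker).
kubectl taint nodes kind-worker foo=bar:NoSchedule

# Steps 4-5: apply the manifest and check for the Pending pod.
kubectl apply -f deployment.yaml
kubectl get pods -l foo=bar -o wide

# Step 6: clean up before changing the feature gate.
kubectl delete -f deployment.yaml
```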

###### Is the rollout accompanied by any deprecations and/or removals of features, APIs, fields of API types, flags, etc.?
No.
@@ -666,69 +737,8 @@ Recall that end users cannot usually observe component logs or access metrics.
- Details: -->

- [x] Other (treat as last resort)
- Details: We can only observe the behavior through pod scheduling results.
  We can follow the steps described in [Upgrade / Downgrade Strategy](#upgrade--downgrade-strategy) and verify the outcome as sketched below.
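
As a rough illustration (the label selector and Deployment come from the example above; exact event messages vary by release), the scheduling results can be checked like this:

```sh
# A Pending pod with NODE shown as <none> indicates the spread constraint
# (or the taint) prevented scheduling.
kubectl get pods -l foo=bar -o wide

# The pod's events explain why scheduling failed, e.g. a FailedScheduling
# event mentioning pod topology spread constraints.
kubectl describe pod -l foo=bar
```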
**Member:** How is this related? I would just remove this line.

**Member Author:** It's here in case someone wants to know the details of how to reproduce this, but I'll remove it.


###### What are the reasonable SLOs (Service Level Objectives) for the enhancement?

@@ -779,7 +789,8 @@ Describe the metrics themselves and the reasons why they weren't added (e.g., cost,
implementation difficulties, etc.).
-->

Yes, we plan to improve observability via metrics ([here](https://github.com/kubernetes/kubernetes/issues/110643)),
but that work is still in progress.

### Dependencies

@@ -920,7 +931,9 @@ Configuration errors are logged to stderr.

###### What steps should be taken if SLOs are not being met to determine the problem?

If we observe an obvious performance degradation or a rising error rate with this feature gate enabled,
we should check the metrics to confirm this feature is the cause, then disable the gate as soon as
possible and restart the api-server. If only a few workloads are affected, we can instead disable the
policy in their `PodTopologySpread` constraints one by one as an emergency measure.
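
As a concrete sketch of both mitigations (the feature-gate flag syntax is the standard one for control-plane components; the `nginx` Deployment name and the constraint index `0` are placeholders from the example earlier in this KEP):

```sh
# Cluster-wide mitigation: restart kube-apiserver and kube-scheduler with
# the gate turned off, e.g.:
#   --feature-gates=NodeInclusionPolicyInPodTopologySpread=false

# Per-workload mitigation: remove the policy field from the spread
# constraint so it falls back to the default behavior.
kubectl patch deployment nginx --type=json -p='[
  {"op": "remove",
   "path": "/spec/template/spec/topologySpreadConstraints/0/nodeTaintsPolicy"}
]'
```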

## Implementation History
