Updated doc for ScheduleDaemonSetPods alpha feature.
Signed-off-by: Da K. Ma <[email protected]>
k82cn committed Jun 1, 2018
1 parent 6029cc3 commit 0923e72
Showing 3 changed files with 47 additions and 5 deletions.
@@ -210,6 +210,7 @@ currently include:
* `node.kubernetes.io/memory-pressure`: Node has memory pressure.
* `node.kubernetes.io/disk-pressure`: Node has disk pressure.
* `node.kubernetes.io/network-unavailable`: Node's network is unavailable.
* `node.kubernetes.io/unschedulable`: Node is unschedulable. (1.10 or later)
* `node.cloudprovider.kubernetes.io/uninitialized`: When kubelet is started
with "external" cloud provider, it sets this taint on a node to mark it
as unusable. When a controller from the cloud-controller-manager initializes
@@ -275,6 +276,7 @@ To make sure that turning on this feature doesn't break DaemonSets, starting in
* `node.kubernetes.io/memory-pressure`
* `node.kubernetes.io/disk-pressure`
* `node.kubernetes.io/out-of-disk` (*only for critical pods*)
* `node.kubernetes.io/unschedulable` (1.10 or later)

The above settings ensure backward compatibility, but we understand they may not fit all users' needs, which is why
cluster admins may choose to add arbitrary tolerations to DaemonSets.
49 changes: 44 additions & 5 deletions content/en/docs/concepts/workloads/controllers/daemonset.md
@@ -97,7 +97,9 @@ If you do not specify either, then the DaemonSet controller will create Pods on

## How Daemon Pods are Scheduled

Normally, the machine that a Pod runs on is selected by the Kubernetes scheduler. However, Pods
### Scheduled by DaemonSet controller (default)

Normally, the machine that a Pod runs on is selected by the Kubernetes scheduler. However, Pods
created by the DaemonSet controller have the machine already selected (`.spec.nodeName` is specified
when the Pod is created, so it is ignored by the scheduler). Therefore:

@@ -106,6 +108,39 @@ when the Pod is created, so it is ignored by the scheduler). Therefore:
- The DaemonSet controller can make Pods even when the scheduler has not been started, which can help cluster
bootstrap.
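
As a rough illustration of the default behavior described above, a Pod created by the DaemonSet controller might look like the fragment below. The Pod name, container name, and image are hypothetical placeholders; only the pre-set `.spec.nodeName` comes from the text.

```yaml
# Illustrative fragment only; names and image are placeholders.
apiVersion: v1
kind: Pod
metadata:
  name: example-daemon-abcde      # hypothetical generated name
spec:
  nodeName: target-host-name      # set at creation, so the scheduler ignores this Pod
  containers:
  - name: example-daemon          # hypothetical container name
    image: example.com/daemon:1.0 # placeholder image
```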

### Scheduled by default scheduler (with ScheduleDaemonSetPods alpha feature)

In version 1.11, `ScheduleDaemonSetPods` was introduced as an alpha feature. With this feature enabled,
DaemonSet Pods are scheduled by the default scheduler instead of the DaemonSet controller.

When this feature is enabled, the DaemonSet controller adds a `NodeAffinity` term to each DaemonSet Pod instead of
setting `.spec.nodeName`, as in the following example. This enables the default scheduler to bind the Pod to the target
host. If the DaemonSet Pod already has node affinity, it is replaced. The DaemonSet controller performs these operations
only when creating DaemonSet Pods, and they modify only the Pods of the DaemonSet; no changes are made to the
`.spec.template` of the DaemonSet.

```
nodeAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    nodeSelectorTerms:
    - matchFields:
      - key: metadata.name
        operator: In
        values:
        - target-host-name
```

When this feature is enabled, a `node.kubernetes.io/unschedulable:NoSchedule` toleration is added automatically to
DaemonSet Pods. The DaemonSet controller ignores a Node's `unschedulable` setting when scheduling DaemonSet Pods,
so it places them on `unschedulable` Nodes as well. To ensure that the default scheduler keeps the same behavior and
schedules DaemonSet Pods on `unschedulable` Nodes, `TaintNodesByCondition` must also be enabled.

When this feature and `TaintNodesByCondition` are enabled together, DaemonSet Pods that use the host network
require a `node.kubernetes.io/network-unavailable:NoSchedule` toleration.
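
The two tolerations discussed above could be sketched as follows in a DaemonSet Pod spec. The taint keys and the `NoSchedule` effect come from the text; `operator: Exists` is an assumption made for this illustration, not something the text specifies.

```yaml
# Sketch only; assumes ScheduleDaemonSetPods and TaintNodesByCondition are both enabled.
tolerations:
# Added automatically to DaemonSet Pods when ScheduleDaemonSetPods is enabled.
- key: node.kubernetes.io/unschedulable
  operator: Exists   # assumed operator
  effect: NoSchedule
# Needed for DaemonSet Pods that use the host network.
- key: node.kubernetes.io/network-unavailable
  operator: Exists   # assumed operator
  effect: NoSchedule
```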


### Taints and Tolerations

Daemon Pods do respect [taints and tolerations](/docs/concepts/configuration/taint-and-toleration),
but they are created with `NoExecute` tolerations for the following taints with no `tolerationSeconds`:

@@ -117,18 +152,22 @@ they will not be evicted when there are node problems such as a network partitio
`TaintBasedEvictions` feature is not enabled, they are also not evicted in these scenarios, but
due to hard-coded behavior of the NodeController rather than due to tolerations).

They also tolerate following `NoSchedule` taints:
They also tolerate following `NoSchedule` taints:

- `node.kubernetes.io/memory-pressure`
- `node.kubernetes.io/disk-pressure`
- `node.kubernetes.io/memory-pressure`
- `node.kubernetes.io/unschedulable` (after version 1.10)

When support for critical pods is enabled and the pods in a DaemonSet are
labeled as critical, the Daemon pods are created with an additional
`NoSchedule` toleration for the `node.kubernetes.io/out-of-disk` taint.

Note that all above `NoSchedule` taints above are created only in version 1.8 or later if the alpha feature `TaintNodesByCondition` is enabled.
Note that all above `NoSchedule` taints are created only in version 1.8 or later
if the alpha feature `TaintNodesByCondition` is enabled.

Also note that the `node-role.kubernetes.io/master` `NoSchedule` toleration specified in the above example
is needed in version 1.6 or later to schedule on *master* nodes, because this is not a default toleration.

Also note that the `node-role.kubernetes.io/master` `NoSchedule` toleration specified in the above example is needed on 1.6 or later to schedule on *master* nodes as this is not a default toleration.
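
The example that this note refers to is not part of this diff. As a minimal sketch, a toleration for the master taint could be written in the DaemonSet's Pod template as shown below; the nesting under `spec.template.spec` is how a DaemonSet manifest is normally laid out.

```yaml
# Fragment of a DaemonSet manifest; all other fields are omitted.
spec:
  template:
    spec:
      tolerations:
      # Not a default toleration, so it must be listed explicitly.
      - key: node-role.kubernetes.io/master
        effect: NoSchedule
```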

## Communicating with Daemon Pods

@@ -93,6 +93,7 @@ different Kubernetes components.
| `TokenRequest` | `false` | Alpha | 1.10 | |
| `VolumeScheduling` | `false` | Alpha | 1.9 | 1.9 |
| `VolumeScheduling` | `true` | Beta | 1.10 | |
| `ScheduleDaemonSetPods` | `false` | Alpha | 1.11 | |
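
As a hedged sketch, the new gate could be turned on together with `TaintNodesByCondition` through the `--feature-gates` flag. The fragment below assumes the component runs from a static Pod manifest and uses kube-controller-manager as the example; this diff does not state which components need the gates, so treat the choice of component as an assumption.

```yaml
# Fragment of a hypothetical static Pod manifest; all other fields are omitted.
spec:
  containers:
  - name: kube-controller-manager
    command:
    - kube-controller-manager
    # Comma-separated key=value pairs; both gates default to false as alpha features.
    - --feature-gates=ScheduleDaemonSetPods=true,TaintNodesByCondition=true
```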

## Using a Feature
