-
Notifications
You must be signed in to change notification settings - Fork 533
CONVENTIONS: Add taints and tolerations #344
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -45,7 +45,7 @@ components that make up the project, but exceptions are allowed when necessary. | |
|
|
||
| The conventions are intended to help with consistency across the project. Users | ||
| of the platform expect consistency in the experience and operation of the cluster. | ||
| Developers of the platform expect consistency in the code to quickly identify issuses | ||
| Developers of the platform expect consistency in the code to quickly identify issues | ||
| across codebases. Consistency enables shared understanding, simplifies explanations, | ||
| and reduces accidental complexity. | ||
|
|
||
|
|
@@ -85,3 +85,86 @@ metal3-io](https://github.com/metal3-io/metal3-docs/blob/master/design/bare-meta | |
| ### API | ||
|
|
||
| OpenShift APIs follow the [Kubernetes API conventions](https://github.com/kubernetes/community/blob/master/contributors/devel/sig-architecture/api-conventions.md). | ||
|
|
||
| ### Operators | ||
|
|
||
| The OpenShift project is an early adopter of, and makes extensive use of, [the | ||
| operator pattern](https://kubernetes.io/docs/concepts/extend-kubernetes/operator/), | ||
| and so it is incumbent on us to establish some conventions around operators. | ||
|
|
||
| #### Taints and Tolerations | ||
|
|
||
| An operator deployed by the CVO should run on master nodes and therefore should | ||
| tolerate the following taint: | ||
|
|
||
| * `node-role.kubernetes.io/master` | ||
|
|
||
| For example: | ||
|
|
||
| ```yaml | ||
| spec: | ||
| template: | ||
| spec: | ||
| tolerations: | ||
| - key: node-role.kubernetes.io/master | ||
| operator: Exists | ||
| effect: NoSchedule | ||
| ``` | ||
|
|
||
| Tolerating this taint should suffice for the vast majority of core OpenShift | ||
| operators. In exceptional cases, an operator may tolerate one or more of the | ||
| following taints if doing so is necessary to form a functional Kubernetes node: | ||
|
|
||
| * `node.kubernetes.io/disk-pressure` | ||
| * `node.kubernetes.io/memory-pressure` | ||
| * `node.kubernetes.io/network-unavailable` | ||
| * `node.kubernetes.io/not-ready` | ||
| * `node.kubernetes.io/pid-pressure` | ||
| * `node.kubernetes.io/unreachable` | ||
|
|
||
| Operators should not specify tolerations in their manifests for any of the taints in | ||
| the above list without an explicit and credible justification. | ||
|
|
||
| When an operator configures its operand, the operator likewise may specify | ||
| tolerations for the aforementioned taints but should do so only as necessary and only | ||
| with explicit justification. | ||
|
|
||
| Note that the DefaultTolerationSeconds and PodTolerationRestriction admission plugins | ||
| may add time-bound tolerations to an operator or operand in addition to the | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I remember @sjenning getting worked up over this, but I have no memory of what the details were |
||
| tolerations that the operator has specified. | ||
|
|
||
| If appropriate, a CRD that corresponds to an operand may provide an API to allow | ||
| users to specify a custom list of tolerations for that operand. For examples, see | ||
| the | ||
| [imagepruners.imageregistry.operator.openshift.io/v1](https://github.com/openshift/api/blob/34f54f12813aaed8822bb5bc56e97cbbfa92171d/imageregistry/v1/types_imagepruner.go#L67-L69), | ||
| [configs.imageregistry.operator.openshift.io/v1](https://github.com/openshift/api/blob/34f54f12813aaed8822bb5bc56e97cbbfa92171d/imageregistry/v1/types.go#L82-L84), | ||
| [builds.config.openshift.io/v1](https://github.com/openshift/api/blob/34f54f12813aaed8822bb5bc56e97cbbfa92171d/config/v1/types_build.go#L96-L99), | ||
| and | ||
| [ingresscontrollers.operator.openshift.io/v1](https://github.com/openshift/api/blob/34f54f12813aaed8822bb5bc56e97cbbfa92171d/operator/v1/types_ingress.go#L183-L191) | ||
| APIs. | ||
|
|
||
| In exceptional cases, an operand may tolerate all taints: | ||
|
|
||
| * if the operand is required to form a functional Kubernetes node, or | ||
|
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Apologies for being late on reviewing this PR. To me this doesn't warrant operand to have blanket tolerations. This means we should be explicit on the tolerations that need to matched for the pod to keep running. Some of the side effects that we noticed with this blanket toleration are
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Contributor
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I agree with @ravisantoshgudimetla. Having an operand tolerate all taints will definitely cause issues on clusters that have enabled Windows workloads.
Contributor
Author
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Users can set arbitrary taints, so prohibiting operands from specifying blanket tolerations would require either (a) requiring users to specify operands' tolerations too or (b) restricting the user to some set of predefined taints.
That Bugzilla report seems to be about operators, not about operands that need to run on each node. For example, cluster-dns-operator only needs to run on some node somewhere, but the dns-node-resolver container in its daemonset operand must run any given node in order for that node to be fully functional; does specifying a blanket toleration on that operand cause a problem?
Why do you need both a node selector and a taint? Why isn't a node selector sufficient? For example, cluster-dns-operator creates a daemonset that tolerates all taints but also specifies a node selector: |
||
| * if the operand is required to support workloads sourced from an internal or external registry that core components depend upon, | ||
|
|
||
| then the operand should tolerate all taints: | ||
|
|
||
| ```yaml | ||
| spec: | ||
| template: | ||
| spec: | ||
| tolerations: | ||
| - operator: Exists | ||
| ``` | ||
|
|
||
| An example of an operand that matches the first case is kube-proxy, which is required | ||
| for services to work. An example of an operand that matches the second case is the | ||
| DNS node resolver, which adds an entry to the `/etc/hosts` file on all node hosts so | ||
| that the container runtime is able to resolve the name of the cluster image registry; | ||
| absent this entry in `/etc/hosts`, upgrades could fail to pull images of core | ||
| components. | ||
|
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I think following what's written here, we need to include the MCO as it's responsible for the overall node lifecycle and thus, the node cannot be properly configured otherwise (kubelet among others) https://bugzilla.redhat.com/show_bug.cgi?id=1780318 |
||
|
|
||
| If an operand meets neither of the two conditions listed above, it must not tolerate | ||
| all taints. This constraint is enforced by [a CI test | ||
| job](https://github.com/openshift/origin/blob/7d07adcf518a846b898cd0958b85f2daf624476a/test/extended/operators/tolerations.go). | ||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@sjenning You worked a lot on tolerations as I recall. ptal