diff --git a/_data/concepts.yml b/_data/concepts.yml index f165490883ad8..bf6f8b53e842e 100644 --- a/_data/concepts.yml +++ b/_data/concepts.yml @@ -43,7 +43,6 @@ toc: section: - docs/concepts/cluster-administration/network-plugins.md - docs/concepts/cluster-administration/device-plugins.md - - docs/concepts/cluster-administration/sysctl-cluster.md - docs/concepts/service-catalog/index.md - title: Containers diff --git a/_data/tasks.yml b/_data/tasks.yml index e9210213bc90b..d7c03e36ee370 100644 --- a/_data/tasks.yml +++ b/_data/tasks.yml @@ -143,6 +143,7 @@ toc: - docs/tasks/administer-cluster/access-cluster-api.md - docs/tasks/administer-cluster/access-cluster-services.md - docs/tasks/administer-cluster/securing-a-cluster.md + - docs/tasks/administer-cluster/sysctl-cluster.md - docs/tasks/administer-cluster/encrypt-data.md - docs/tasks/administer-cluster/configure-upgrade-etcd.md - docs/tasks/administer-cluster/static-pod.md diff --git a/_redirects b/_redirects index 3d5c08fe03db3..48d26e389267d 100644 --- a/_redirects +++ b/_redirects @@ -50,7 +50,7 @@ /docs/admin/resourcequota/limitstorageconsumption/ /docs/tasks/administer-cluster/limit-storage-consumption/ 301 /docs/admin/resourcequota/walkthrough/ /docs/tasks/administer-cluster/quota-api-object/ 301 /docs/admin/static-pods/ /docs/tasks/administer-cluster/static-pod/ 301 -/docs/admin/sysctls/ /docs/concepts/cluster-administration/sysctl-cluster/ 301 +/docs/admin/sysctls/ /docs/tasks/administer-cluster/sysctl-cluster/ 301 /docs/admin/upgrade-1-6/ /docs/tasks/administer-cluster/upgrade-1-6/ 301 /docs/admin/resource-quota/ /docs/concepts/policy/resource-quotas/ 301 @@ -97,6 +97,7 @@ /docs/concepts/cluster-administration/multiple-clusters/ /docs/concepts/cluster-administration/federation/ 301 /docs/concepts/cluster-administration/out-of-resource/ /docs/tasks/administer-cluster/out-of-resource/ 301 /docs/concepts/cluster-administration/resource-usage-monitoring /docs/tasks/debug-application-cluster/resource-usage-monitoring/ 301 +/docs/concepts/cluster-administration/sysctl-cluster/ /docs/tasks/administer-cluster/sysctl-cluster/ 301 /docs/concepts/cluster-administration/static-pod/ /docs/tasks/administer-cluster/static-pod/ 301 /docs/concepts/clusters/logging/ /docs/concepts/cluster-administration/logging/ 301 /docs/concepts/configuration/container-command-arg/ /docs/tasks/inject-data-application/define-command-argument-container/ 301 diff --git a/docs/concepts/cluster-administration/sysctl-cluster.md b/docs/tasks/administer-cluster/sysctl-cluster.md similarity index 81% rename from docs/concepts/cluster-administration/sysctl-cluster.md rename to docs/tasks/administer-cluster/sysctl-cluster.md index 796c735bb2871..9c8a8fcc49b46 100644 --- a/docs/concepts/cluster-administration/sysctl-cluster.md +++ b/docs/tasks/administer-cluster/sysctl-cluster.md @@ -1,15 +1,24 @@ --- +title: Using Sysctls in a Kubernetes Cluster reviewers: - sttts -title: Using Sysctls in a Kubernetes Cluster --- -* TOC -{:toc} +{% capture overview %} This document describes how sysctls are used within a Kubernetes cluster. -## What is a Sysctl? +{% endcapture %} + +{% capture prerequisites %} + +{% include task-tutorial-prereqs.md %} + +{% endcapture %} + +{% capture steps %} + +## Listing all Sysctl Parameters In Linux, the sysctl interface allows an administrator to modify kernel parameters at runtime. Parameters are available via the `/proc/sys/` virtual @@ -23,35 +32,11 @@ process file system. The parameters cover various subsystems such as: To get a list of all parameters, you can run -``` +```shell $ sudo sysctl -a ``` -## Namespaced vs. Node-Level Sysctls - -A number of sysctls are _namespaced_ in today's Linux kernels. This means that -they can be set independently for each pod on a node. Being namespaced is a -requirement for sysctls to be accessible in a pod context within Kubernetes. - -The following sysctls are known to be _namespaced_: - -- `kernel.shm*`, -- `kernel.msg*`, -- `kernel.sem`, -- `fs.mqueue.*`, -- `net.*`. - -Sysctls which are not namespaced are called _node-level_ and must be set -manually by the cluster admin, either by means of the underlying Linux -distribution of the nodes (e.g. via `/etc/sysctls.conf`) or using a DaemonSet -with privileged containers. - -**Note**: it is good practice to consider nodes with special sysctl settings as -_tainted_ within a cluster, and only schedule pods onto them which need those -sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_ -feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this. - -## Safe vs. Unsafe Sysctls +## Enabling Unsafe Sysctls Sysctls are grouped into _safe_ and _unsafe_ sysctls. In addition to proper namespacing a _safe_ sysctl must be properly _isolated_ between pods on the same @@ -63,8 +48,7 @@ node. This means that setting a _safe_ sysctl for one pod of a pod. By far, most of the _namespaced_ sysctls are not necessarily considered _safe_. - -For Kubernetes 1.4, the following sysctls are supported in the _safe_ set: +The following sysctls are supported in the _safe_ set: - `kernel.shm_rmid_forced`, - `net.ipv4.ip_local_port_range`, @@ -82,31 +66,45 @@ All _unsafe_ sysctls are disabled by default and must be allowed manually by the cluster admin on a per-node basis. Pods with disabled unsafe sysctls will be scheduled, but will fail to launch. -**Warning**: Due to their nature of being _unsafe_, the use of _unsafe_ sysctls -is at-your-own-risk and can lead to severe problems like wrong behavior of -containers, resource shortage or complete breakage of a node. - -## Enabling Unsafe Sysctls - With the warning above in mind, the cluster admin can allow certain _unsafe_ sysctls for very special situations like e.g. high-performance or real-time application tuning. _Unsafe_ sysctls are enabled on a node-by-node basis with a flag of the kubelet, e.g.: ```shell -$ kubelet --experimental-allowed-unsafe-sysctls 'kernel.msg*,net.ipv4.route.min_pmtu' ... +$ kubelet --experimental-allowed-unsafe-sysctls \ + 'kernel.msg*,net.ipv4.route.min_pmtu' ... ``` + For minikube, this can be done via the `extra-config` flag: ```shell $ minikube start --extra-config="kubelet.AllowedUnsafeSysctls=kernel.msg*,net.ipv4.route.min_pmtu"... ``` + Only _namespaced_ sysctls can be enabled this way. ## Setting Sysctls for a Pod -The sysctl feature is an alpha API in Kubernetes 1.4. Therefore, sysctls are set -using annotations on pods. They apply to all containers in the same pod. +A number of sysctls are _namespaced_ in today's Linux kernels. This means that +they can be set independently for each pod on a node. Being namespaced is a +requirement for sysctls to be accessible in a pod context within Kubernetes. + +The following sysctls are known to be _namespaced_: + +- `kernel.shm*`, +- `kernel.msg*`, +- `kernel.sem`, +- `fs.mqueue.*`, +- `net.*`. + +Sysctls which are not namespaced are called _node-level_ and must be set +manually by the cluster admin, either by means of the underlying Linux +distribution of the nodes (e.g. via `/etc/sysctls.conf`) or using a DaemonSet +with privileged containers. + +The sysctl feature is an alpha API. Therefore, sysctls are set using annotations +on pods. They apply to all containers in the same pod. Here is an example, with different annotations for _safe_ and _unsafe_ sysctls: @@ -121,11 +119,25 @@ metadata: spec: ... ``` +{% endcapture %} + +{% capture discussion %} -**Note**: a pod with the _unsafe_ sysctls specified above will fail to launch on -any node which has not enabled those two _unsafe_ sysctls explicitly. As with -_node-level_ sysctls it is recommended to use [_taints and toleration_ -feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or [taints on nodes](/docs/concepts/configuration/taint-and-toleration/) +**Warning**: Due to their nature of being _unsafe_, the use of _unsafe_ sysctls +is at-your-own-risk and can lead to severe problems like wrong behavior of +containers, resource shortage or complete breakage of a node. +{: .warning} + +It is good practice to consider nodes with special sysctl settings as +_tainted_ within a cluster, and only schedule pods onto them which need those +sysctl settings. It is suggested to use the Kubernetes [_taints and toleration_ +feature](/docs/user-guide/kubectl/{{page.version}}/#taint) to implement this. + +A pod with the _unsafe_ sysctls will fail to launch on any node which has not +enabled those two _unsafe_ sysctls explicitly. As with _node-level_ sysctls it +is recommended to use +[_taints and toleration_ feature](/docs/user-guide/kubectl/{{page.version}}/#taint) or +[taints on nodes](/docs/concepts/configuration/taint-and-toleration/) to schedule those pods onto the right nodes. ## PodSecurityPolicy Annotations @@ -148,3 +160,7 @@ metadata: spec: ... ``` + +{% endcapture %} + +{% include templates/task.md %}