@@ -220,13 +220,108 @@ Since swap provisioning is out of scope of this proposal, this enhancement poses
220220
221221## Design Details
222222
223- \[ In progress\]
224-
225- Need to add specifics here for:
226-
227- - Changes to ` --fail-on-swap ` flag
228- - CRI config details
229- - Where changes will need to be made so that dockershim and the CRI are consistent with swap control
223+ ### TL;DR
224+
225+ In a nutshell, the following changes are planned for Memory Swap Support
226+ in 1.22 GKE alpha:
227+
228+ 1. Add a feature gate `SupportNodeMemorySwap` to guard the memory swap
229+ support feature
230+ 2. Keep the default value of the kubelet flag `--fail-on-swap` as `true` in order
231+ to minimize the blast radius
232+ 3. Introduce two new kubelet configuration parameters, `MemorySwapLimit` and `MemorySwappiness`
233+ 4. Introduce two new CRI parameters, `memory_swap_limit_in_bytes` and `memory_swappiness`
234+ 5. Wire the values end to end, from the kubelet config file to the CRI
235+
236+ ### Expected User Behaviour
237+
238+ For alpha, the feature gate `SupportNodeMemorySwap` is disabled by default, and
239+ the `--fail-on-swap` flag value is the same as in 1.21. Therefore, from a Kubernetes
240+ user’s perspective, there are no behavior changes out of the box.
241+
242+ Users who are ready to explore the Memory Swap feature in 1.22 Alpha
243+ will need to complete the following steps:
244+
245+ 1. provision swap and enable the `SupportNodeMemorySwap` feature gate, AND
246+ 2. set the `--fail-on-swap` flag to `false`
247+
248+ Then, the user can start experimenting with and fine-tuning the kubelet configuration
249+ `MemorySwapLimit` and/or `MemorySwappiness` and observe the changes.
250+
251+ ### New Kubelet Configuration
252+
253+ We will be introducing two new parameters to the `KubeletConfiguration` struct
254+ defined in
255+ [https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/types.go](https://github.com/kubernetes/kubernetes/blob/master/pkg/kubelet/apis/config/types.go).
256+ These two configurations, if set, will apply to every container on the Node
257+ where the kubelet is running.
258+
259+ | Name| Description| Default Value| Feature Gate|
260+ | --- | --- | --- | --- |
261+ | MemorySwapLimit| This parameter sets the total memory limit (memory + swap), which bounds the amount of memory a container is allowed to swap to disk.| -2, which disables swap| SupportNodeMemorySwap|
262+ | MemorySwappiness| This configuration sets how aggressively the kernel will swap memory pages. By default, the host kernel can swap out a percentage of anonymous pages used by a container. Users can set a value between 0 and 100 to tune this percentage.| Unset, which will use the host value| SupportNodeMemorySwap|
263+
264+ #### MemorySwapLimit details
265+
266+ The `MemorySwapLimit` kubelet configuration only takes
267+ effect on a container that has a memory limit set, either
268+ explicitly from
269+ [PodSpec](https://kubernetes.io/docs/concepts/configuration/manage-resources-containers/#requests-and-limits)
270+ or implicitly from
271+ [Resource Quota](https://kubernetes.io/docs/concepts/policy/resource-quotas/).
272+
273+ For a container with a memory limit set, the MemorySwapLimit setting has the
274+ following effects, [similar to
275+ docker](https://docs.docker.com/config/containers/resource_constraints/#--memory-swap-details):
276+
277+ * If MemorySwapLimit is set to a positive integer,
278+   * If the memory limit of the container is greater than or equal to
279+     MemorySwapLimit, then no swap is allowed; the container does not have
280+     access to swap.
281+   * If the memory limit of the container is less than MemorySwapLimit, then
282+     MemorySwapLimit represents the total amount of memory and swap that can be
283+     used. For example, for a container with memory limit set to 300m, and
284+     `MemorySwapLimit` set to 1g, the container can use 300m of memory and
285+     700m (1g minus 300m) of swap.
286+ * If MemorySwapLimit is set to 0, a container with a memory limit set can
287+   use as much swap as its memory limit, if the host has swap memory
288+   configured. For instance, if a container requests memory="300m" and
289+   MemorySwapLimit is set to 0, the container can use 600m in total of memory
290+   and swap.
291+ * If MemorySwapLimit is explicitly set to -1, the container is allowed to use
292+ unlimited swap, up to the amount available on the host system.
293+ * If MemorySwapLimit is explicitly set to -2, the container does not have
294+ access to swap. This value effectively prevents a container from using swap.
295+
296+ In summary, for users experimenting with this feature:
297+
298+ | MemorySwapLimit| container memory limit (explicit or implicit)| Expected Behavior| Comment|
299+ | --- | --- | --- | --- |
300+ | Any| not set| N/A| Same as docker|
301+ | -2| N| no swap allowed, this is the default value||
302+ | -1| N| unlimited swap| Same as docker|
303+ | 0| N| container can use up to N swap (ie: 2N memory+swap)| Same as docker|
304+ | X where X > 0| N where N < X| container can use up to X-N swap (ie: X memory+swap)| Same as docker|
305+ | X where X > 0| N where N >= X| no swap allowed (ie: N memory only)| Same as docker|
306+
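The rules in the summary table can be sketched as a small Go function. This is an illustrative sketch only, not the kubelet implementation: the names `swapMode`, `swapAllowance`, and `mb` are hypothetical, a memory limit of 0 or less is assumed to mean "not set", and negative values other than -1 and -2 (which the proposal does not define) are assumed to behave like "no swap".

```go
package main

import "fmt"

// mb uses decimal megabytes so the numbers match the 300m / 1g example.
const mb = int64(1000 * 1000)

// swapMode describes the outcome of the MemorySwapLimit rules for one container.
type swapMode int

const (
	notApplicable swapMode = iota // container has no memory limit
	noSwap                        // container may not use swap
	limitedSwap                   // container may use a bounded amount of swap
	unlimitedSwap                 // container may use all swap available on the host
)

// swapAllowance applies the summary table. Both arguments are in bytes.
// memoryLimit <= 0 stands for "not set" (an assumption of this sketch).
// The returned byte count is only meaningful for limitedSwap.
func swapAllowance(memorySwapLimit, memoryLimit int64) (swapMode, int64) {
	switch {
	case memoryLimit <= 0:
		return notApplicable, 0
	case memorySwapLimit == -1:
		return unlimitedSwap, 0
	case memorySwapLimit == 0:
		// Swap equal to the memory limit, so 2N of memory+swap in total.
		return limitedSwap, memoryLimit
	case memorySwapLimit > memoryLimit:
		// X > N: the container may use X-N bytes of swap.
		return limitedSwap, memorySwapLimit - memoryLimit
	default:
		// Covers -2 (the default), positive values <= memoryLimit, and, as an
		// assumption, any other negative value the proposal does not define.
		return noSwap, 0
	}
}

func main() {
	mode, swap := swapAllowance(1000*mb, 300*mb)
	fmt.Println(mode == limitedSwap, swap/mb) // true 700
}
```

Returning a mode alongside the byte count keeps "no swap" and "unlimited swap" distinct from a numeric allowance, which a single integer return value would blur.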
307+ #### MemorySwappiness details
308+
309+ * A value of 0 turns off anonymous page swapping.
310+ * A value of 100 sets all anonymous pages as swappable.
311+ * By default, if you do not set MemorySwappiness, the value is inherited from
312+ the host machine.
313+
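A minimal sketch of how the 0 to 100 range and the "unset" case might be checked, in Go. The function name `validateSwappiness` and the use of a nil pointer to represent "unset" are assumptions made for illustration; the proposal does not specify the field type.

```go
package main

import "fmt"

// validateSwappiness checks a hypothetical MemorySwappiness setting.
// A nil pointer stands for "unset", in which case the host value is inherited.
func validateSwappiness(v *int64) error {
	if v == nil {
		return nil
	}
	if *v < 0 || *v > 100 {
		return fmt.Errorf("MemorySwappiness must be between 0 and 100, got %d", *v)
	}
	return nil
}

func main() {
	ok, bad := int64(60), int64(150)
	fmt.Println(validateSwappiness(nil) == nil)  // true
	fmt.Println(validateSwappiness(&ok) == nil)  // true
	fmt.Println(validateSwappiness(&bad) == nil) // false
}
```

A pointer is used here because 0 is itself a meaningful value (it turns off anonymous page swapping), so it cannot double as the "unset" marker.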
314+ ### CRI Changes
315+
316+ We will be introducing the following two parameters,
317+ `memory_swap_limit_in_bytes` and `memory_swappiness`, to `message
318+ LinuxContainerResources` defined in
319+ [https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1/api.proto#L563-L580](https://github.com/kubernetes/kubernetes/blob/master/staging/src/k8s.io/cri-api/pkg/apis/runtime/v1/api.proto#L563-L580).
320+
321+ | Name| Type| Description| Default Value| Feature Gate|
322+ | --- | --- | --- | --- | --- |
323+ | ` memory_swap_limit_in_bytes ` | int64| set/show limit of memory+swap usage| Default 0, which is unspecified.| SupportNodeMemorySwap|
324+ | ` memory_swappiness ` | int64| set/show swappiness parameter| Default 0, which is unspecified.| SupportNodeMemorySwap|
230325
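One possible way to wire the kubelet-level `MemorySwapLimit` into `memory_swap_limit_in_bytes` is sketched below in Go. The proposal does not spell out this translation, so everything here is an assumption: the function name `criMemorySwapLimit` is hypothetical, the CRI value is taken to be the memory+swap total in bytes, 0 is left as "unspecified", and -1 is assumed to mean "unlimited", following the cgroup v1 convention.

```go
package main

import "fmt"

// criMemorySwapLimit derives a memory+swap total (in bytes) for the CRI from
// the kubelet MemorySwapLimit setting and the container memory limit.
// memoryLimit <= 0 stands for "not set" (an assumption of this sketch).
func criMemorySwapLimit(memorySwapLimit, memoryLimit int64) int64 {
	switch {
	case memoryLimit <= 0:
		return 0 // leave the CRI field unspecified
	case memorySwapLimit == -1:
		return -1 // assumed to mean "unlimited" on the runtime side
	case memorySwapLimit == 0:
		return 2 * memoryLimit // swap equal to the memory limit
	case memorySwapLimit > memoryLimit:
		return memorySwapLimit // already a memory+swap total
	default:
		// No swap: the total equals the memory limit itself.
		return memoryLimit
	}
}

func main() {
	const mb = int64(1000 * 1000)
	fmt.Println(criMemorySwapLimit(1000*mb, 300*mb) / mb) // 1000
	fmt.Println(criMemorySwapLimit(0, 300*mb) / mb)       // 600
	fmt.Println(criMemorySwapLimit(-2, 300*mb) / mb)      // 300
}
```

Expressing "no swap" as a total equal to the memory limit avoids needing a separate sentinel in the CRI field, at the cost of requiring the memory limit to be known when the value is computed.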
231326### Test Plan
232327