pid limiting documentation #13006

Closed
44 changes: 44 additions & 0 deletions content/en/docs/tasks/administer-cluster/pids.md
@@ -0,0 +1,44 @@
---
reviewers:
- derekwaynecarr
- dashpole
- RobertKrawitz
title: Pid Limiting
Contributor

Suggested change
- title: Pid Limiting
+ title: Limit processes available to a pod

Contributor

That's not what this is; the functionality of limiting processes available to a pod is pod-max-pids. This is about limiting the number of pids available to all pods collectively (specifically, reserving a certain number of pids for system and/or kubelet use).

Contributor

> If enabled, the kubelet argument for pod-max-pids will write out the configured

This does look like limiting the processes available to a pod.

@derekwaynecarr - how would you feel about having two smaller pages, specific to the tasks they're explaining, and then hyperlinking between the two?

  • one page about making sure a pod doesn't use too many process IDs
  • another page that describes how to ensure that there are process IDs free for kubelet / the rest of the OS?

Contributor

@RobertKrawitz reading the info as it is now, I came to the same understanding as @sftim.

However, your explanation in the comment does make sense, so would you or @derekwaynecarr mind adding that info?

I personally would like to see more detailed info rather than a high-level overview, as not everyone knows the difference you mentioned.

content_template: templates/concept
---

{{% capture overview %}}
{{< feature-state state="beta" >}}

This page explains how to configure pid limiting with the `kubelet`.
Contributor

"pid (process ID)" the first time. Hopefully nobody's confused, but...


Pids are a fundamental resource on Linux hosts. It is trivial to hit the task
Contributor

Should we suggest ways to make more process IDs available?

E.g., on Linux set kernel.pid_max via sysctl; modern kernels support 2^22-1 process IDs.

Contributor

Simply raising the process limit isn't necessarily the right answer either -- it probably means more time spent doing accounting, scanning the process table, etc. Not to mention the resources that are consumed by that many processes.

Before we go into that, I think we want to decide what level of discussion we want here.

Contributor

The aim I'd have here is to let a less experienced administrator realize that:

  • they can have kubelet & their container runtime limit how many pids are available to a pod
  • they can raise the system-wide pid ceiling from the default, which is typically quite low

and that making both changes together can work well.
(A high pid ceiling, system wide, helps avoid collisions when IDs are reused, and a low limit per pod protects the other pods and the rest of the system).

Contributor

The system-wide pid-max is typically 32K (bigger on machines with more than 32 CPU threads available). Whether that's too small on any given system depends, but I'm not convinced it would be a good idea for an inexperienced admin to simply raise the process limit without understanding the workload being run and analyzing the entire system's capacity.

The per-pod limit only protects the system to the extent that the number of pods is limited.

The node limit has nothing to do with collisions; it's simply a hard upper limit on the number of simultaneous tasks in existence on a node.

limit without hitting any other resource limits and cause instability on a host
machine.

Administrators require mechanisms to ensure that user pods cannot induce pid
exhaustion that prevents host daemons (runtime, kubelet, etc.) from running. In
addition, it is important to limit pids among pods so that each pod has limited
impact on other workloads on the node.

{{% /capture %}}

{{% capture body %}}

## Pod to Pod Isolation of Pids

The `SupportPodPidsLimit` feature gate is *beta*.

If enabled, the `kubelet` writes the pid limit configured by its `pod-max-pids`
argument to the pod-level cgroup on Linux hosts. If the value is -1, the
`kubelet` defaults to the node allocatable pid capacity.
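
The fallback rule in the paragraph above can be modeled with a short sketch. This is only an illustration, not the kubelet's implementation: the function name `effective_pod_pid_limit` and its parameters are hypothetical, and rejecting other non-positive values is an assumption of this sketch rather than documented behavior.

```python
def effective_pod_pid_limit(pod_max_pids: int, node_allocatable_pids: int) -> int:
    """Model the per-pod pid limit described in the text.

    Hypothetical helper: -1 means no explicit per-pod limit was set, so
    the pod falls back to the node allocatable pid capacity.
    """
    if pod_max_pids == -1:
        return node_allocatable_pids
    # Assumption of this sketch: any other non-positive value is an error.
    if pod_max_pids <= 0:
        raise ValueError("pod_max_pids must be -1 or a positive integer")
    return pod_max_pids
```

For instance, `effective_pod_pid_limit(-1, 32768)` returns `32768`, while `effective_pod_pid_limit(1024, 32768)` returns `1024`.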

## Node to Pod Isolation of Pids

The `SupportNodePidsLimit` feature gate is *alpha*.

If enabled, the node allocatable feature is able to reserve a number of pids for
system components. The `pids` resource is supported when specifying `system-reserved`
Contributor

Resource name is pid, not pids; see kubernetes/kubernetes#73651 (comment)

Contributor

@RobertKrawitz I suspect this feature is related to #12932, correct?

If so, @derekwaynecarr would you mind cross-linking the page here, please?

Contributor

Yes, this is the documentation for #12932

and `kube-reserved` flags for the `kubelet`.
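
To make the reservation concrete, here is a minimal sketch of how reserved pids reduce what is left for pods. It rests on several assumptions: the flag values use the comma-separated `key=value` form, the resource key is `pid` (as the review comments on this change suggest, although the text above says `pids`), and the plain subtraction is a simplified model of node allocatable, not the kubelet's actual calculation. The function names are hypothetical.

```python
from typing import Dict


def parse_reserved(flag_value: str) -> Dict[str, str]:
    """Parse a 'key=value,key=value' resource list into a dict."""
    reserved: Dict[str, str] = {}
    for item in flag_value.split(","):
        item = item.strip()
        if not item:
            continue  # tolerate empty values and trailing commas
        key, sep, value = item.partition("=")
        if not sep or not key or not value:
            raise ValueError(f"malformed entry: {item!r}")
        reserved[key] = value
    return reserved


def pids_available_to_pods(node_pid_capacity: int,
                           system_reserved: str,
                           kube_reserved: str,
                           key: str = "pid") -> int:
    """Subtract both pid reservations from the node's pid capacity.

    Simplified model (assumption): capacity minus system-reserved minus
    kube-reserved. A missing key counts as zero reserved pids.
    """
    reserved = sum(int(parse_reserved(value).get(key, "0"))
                   for value in (system_reserved, kube_reserved))
    if reserved > node_pid_capacity:
        raise ValueError("reserved pids exceed node pid capacity")
    return node_pid_capacity - reserved
```

With a capacity of 32768, `pids_available_to_pods(32768, "cpu=100m,pid=1000", "pid=500")` returns `31268`.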

{{% /capture %}}
@@ -250,4 +250,7 @@ for `kube-reserved` and `system-reserved`.
As of Kubernetes version 1.8, the `storage` key name was changed to `ephemeral-storage`
for the alpha release.

As of Kubernetes version 1.14, the `kubelet` supports specifying `pids` as a resource
Contributor

s/pids/pid/

for `kube-reserved` and `system-reserved`.

{{% /capture %}}