
Change default inotify settings #1525

Closed
backjo opened this issue Apr 26, 2021 · 5 comments · Fixed by #2335
@backjo

backjo commented Apr 26, 2021

What I'd like:

In #371, there was discussion about adopting the learnings from the article From 30 to 230 docker containers per host. As part of that discussion, the inotify limits came up, but the existing default settings appear to have been mistaken for the values the article recommends changing them to (see the discrepancy between the settings here and the referenced article). Tuning these settings to the article's recommendations:

fs.inotify.max_user_instances = 4096
fs.inotify.max_user_watches = 32768

will allow more containers to run per node for workloads that are already hitting the inotify limits.
Any alternatives you've considered:
Building a custom image with these settings tweaked. Heavy users of inotify will always need to tune these, but raising the defaults would provide a better experience for the average operator.
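
For reference, a minimal sketch of applying the article's recommended values through Bottlerocket user data, using the settings.kernel.sysctl mechanism shown later in this thread (the values below are the article's recommendations, not current Bottlerocket defaults):

[settings.kernel.sysctl]
"fs.inotify.max_user_instances" = "4096"
"fs.inotify.max_user_watches" = "32768"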

@zmrow
Contributor

zmrow commented Apr 27, 2021

Hi @backjo! Thanks for bringing this to our attention; I barely remember discussing those things in the very early days of Bottlerocket. :)

Are you running into these limits or having to tweak them on your end? Can you provide some additional details about your use case?

@backjo
Author

backjo commented Apr 27, 2021

Our use case is more or less described by this post -> http://blog.travisgosselin.com/configured-user-limit-inotify-instances/.

Basically, we've got a bunch of .NET Core pods running across various K8S namespaces on our EKS cluster. Each of those pods shares a base container image and leverages the same user ('app'). If enough of these pods end up running on the same node, we start seeing failures due to hitting the inotify limit.

We can (and do) work around this in a few ways (per-app UIDs, fewer file watches, etc.), but I think a higher default value might make sense; the per-app UID approach is sketched below.

Note: max_user_watches was increased significantly in the AL2 EKS AMI a few months back - awslabs/amazon-eks-ami#589
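
For illustration, a minimal sketch of the per-app UID workaround mentioned above (names, image, and UID are hypothetical): the inotify limits are enforced per real UID, so giving each app its own runAsUser keeps one app's inotify instances from counting against another's.

# Hypothetical deployment for one app: run it under its own UID so apps
# on the same node don't share a single per-user inotify budget.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: dotnet-app-a            # hypothetical name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: dotnet-app-a
  template:
    metadata:
      labels:
        app: dotnet-app-a
    spec:
      securityContext:
        runAsUser: 10001        # unique UID per app, instead of a shared 'app' user
      containers:
      - name: app
        image: example.com/dotnet-app:latest   # hypothetical image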

@jhaynes added this to the backlog milestone May 3, 2021
@jhaynes added the area/core, priority/p2, and type/enhancement labels and removed the priority/p2 label May 3, 2021
@ehedei206

Hi, is there an update on this? Is there any way we can change fs.inotify.max_user_instances through the Bottlerocket API in the control container, or via userData TOML?

Thanks!

@duboisf

duboisf commented Aug 1, 2022

I've encountered an issue where the default value of fs.inotify.max_user_instances was too low (128 in my case).

To reproduce this in Kubernetes, I created a dedicated "troubleshoot" Karpenter provisioner that would only provision largish Bottlerocket nodes, something like:

# Karpenter Provisioner (fragment)
spec:
  # ...
  labels:
    dedicated: troubleshoot
  # ...
  requirements:
  - key: karpenter.k8s.aws/instance-size
    operator: In
    values:
    - 24xlarge

Then I deployed a simple app in an istio-enabled namespace, with 100 replicas and a node selector dedicated: troubleshoot. Not all of the istio proxy sidecars became ready; roughly a third crashed with a "too many open files" error. Increasing fs.inotify.max_user_instances to 1024 solved the issue here.
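
For context, a sketch of the kind of deployment used in the repro above (names and image are hypothetical; the namespace is assumed to have istio sidecar injection enabled):

apiVersion: apps/v1
kind: Deployment
metadata:
  name: inotify-repro           # hypothetical name
  namespace: repro              # assumed to carry the istio-injection: enabled label
spec:
  replicas: 100
  selector:
    matchLabels:
      app: inotify-repro
  template:
    metadata:
      labels:
        app: inotify-repro
    spec:
      nodeSelector:
        dedicated: troubleshoot   # lands the pods on the Karpenter-provisioned nodes
      containers:
      - name: app
        image: nginx              # any simple app works for the repro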


@ehedei206 to answer your question about whether there's a way to fix this using user data: yes, there is. I used the following user data for my Bottlerocket nodes in Kubernetes:

[settings.kernel.sysctl]
"fs.inotify.max_user_instances" = "1024"
"fs.inotify.max_user_watches" = "32768"

EDIT: typos

@kdaula modified the milestones: backlog, 1.10.0 Aug 11, 2022
@kdaula added the good first issue label Aug 11, 2022
@ehedei206

Thanks @duboisf!
