MySQL 5.7 container OOM kill / changing the NOFILE limit #1136
Comments
Thanks for putting in this issue! The AL2 EKS AMI sets Based on the comments in the containerd unit file, setting There is also an open Kubernetes issue to make limits configurable.
Hey @zmrow, thanks for the response! That makes sense, and it is what I suspected was going on. I agree with your position that changing the default in bottlerocket is not a good solution, though I wonder if providing it as a config option would at least make sense, so people coming from the AL2 EKS AMI have a way to mitigate this issue? It would be great to see this supported upstream, though it seems like an issue that has been open for a long time without much movement. As for the other part of my question, is there a recommended way to try and change this in bottlerocket? It seems like this type of thing might be possible by defining a custom container? Or would I definitely need a custom build? Any guidance you can give on the best way to do that would be appreciated.
I did a little digging in the upstream code and found what (I think) are the right spots to make the change. A good option could be to provide a reasonable and configurable default to We’ll take a look at the feasibility of carrying a short-term patch to our containerd package, while we investigate a proper upstream fix.
@zlangbert Just to follow up here - I created a local patch for
@zmrow We have switched back to bottlerocket and have had no trouble with 1.0.3. Thank you so much for your help!
@zlangbert Awesome, glad to hear it. I'll close this issue for now, but feel free to re-open or start a new issue should you see further problems. I've opened #1240 to track the upstream-ing of our
@zmrow we are also facing issues with rlimit on our elasticsearch clusters, and the current open files limit is shown as (-n) 65536. Is there a way to use the increased limit? The clusters are bootstrapped using eksctl.
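When comparing node images, it can help to read the open-file limit from inside the affected container rather than from the host. The following is a minimal sketch, assuming a Python interpreter is available in the container; the helper names nofile_limits and describe are illustrative only and are not part of bottlerocket or containerd.

```python
import resource


def nofile_limits():
    """Return the (soft, hard) RLIMIT_NOFILE pair for the current process."""
    return resource.getrlimit(resource.RLIMIT_NOFILE)


def describe(value):
    """Render a limit value, mapping RLIM_INFINITY to the word 'unlimited'."""
    return "unlimited" if value == resource.RLIM_INFINITY else str(value)


if __name__ == "__main__":
    soft, hard = nofile_limits()
    # The soft limit is what `ulimit -n` reports in a shell.
    print(f"soft={describe(soft)} hard={describe(hard)}")
```

Running this in a pod on each node type shows whether the limit the process actually inherits differs between the two AMIs.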
I am running a 1.17 EKS cluster that we recently tried to switch to bottlerocket nodes. We ran into an issue where MySQL 5.7 containers were being OOM killed. This does not happen on the latest AL2 EKS AMI.
After some searching I came across these issues / PRs:
docker-library/mysql#579
kubernetes-sigs/kind#760
containerd/containerd#3201
containerd/containerd#3202 (this one has since been reverted without addressing the reason it was changed in the first place)
I am having trouble actually testing this change on bottlerocket, but it appears that setting LimitNOFILE to something like 1048576 should fix this issue.
I also found that the mysql:8 image does not exhibit this issue.
So a couple of questions:
sudo sheltie
What I expected to happen:
I would expect the mysql container to start and use a reasonable amount of memory.
What actually happened:
The mysql container attempts to consume many gigabytes of RAM and is OOM killed when there are resource limits placed on it.
How to reproduce the problem:
Run an EKS cluster with bottlerocket nodes and apply this manifest. The mysql pod will go into a restart loop, being OOM killed over and over. If you remove the resource limits, you can see the pod take up to 16 GB of RAM while doing nothing.
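One way to probe whether the file descriptor limit is the trigger, without changing the node, is to start a process with a capped soft NOFILE limit and compare its behavior. The sketch below is an assumption-laden illustration, not the fix discussed in this issue: run_with_nofile and effective_soft are hypothetical helper names, and the command passed in is whatever you want to test.

```python
import resource
import subprocess
import sys


def effective_soft(soft_limit, hard):
    """Clamp the requested soft limit so it never exceeds the hard limit."""
    if hard == resource.RLIM_INFINITY:
        return soft_limit
    return min(soft_limit, hard)


def run_with_nofile(cmd, soft_limit):
    """Run cmd with its soft RLIMIT_NOFILE capped; the hard limit is left alone."""

    def cap():
        # Runs in the child between fork and exec, so only the child is affected.
        _, hard = resource.getrlimit(resource.RLIMIT_NOFILE)
        resource.setrlimit(
            resource.RLIMIT_NOFILE, (effective_soft(soft_limit, hard), hard)
        )

    return subprocess.run(
        cmd, preexec_fn=cap, capture_output=True, text=True, check=True
    )


if __name__ == "__main__":
    # Hypothetical check: the child prints the soft limit it actually received.
    probe = "import resource; print(resource.getrlimit(resource.RLIMIT_NOFILE)[0])"
    result = run_with_nofile([sys.executable, "-c", probe], 1024)
    print(result.stdout.strip())
```

Lowering a soft limit needs no extra privileges, so a wrapper like this can run as the container's normal user.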
Image I'm using:
bottlerocket 1.0.1 / x86_64 in us-west-2 - ami-0d222c28fd8edb9ff