MySQL 5.7 container OOM kill / changing the NOFILE limit #1136

Closed
zlangbert opened this issue Sep 30, 2020 · 7 comments
Labels
type/bug Something isn't working

@zlangbert

zlangbert commented Sep 30, 2020

I am running a 1.17 EKS cluster that we recently tried to switch to bottlerocket nodes. We ran into an issue where MySQL 5.7 containers were being OOM killed. This does not happen on the latest AL2 EKS AMI.

After some searching I came across these issues / PRs:
docker-library/mysql#579
kubernetes-sigs/kind#760
containerd/containerd#3201
containerd/containerd#3202 (this one has since been reverted without addressing the reason it was changed in the first place)

I am having trouble actually testing this change on bottlerocket, but it appears that setting LimitNOFILE to something like 1048576 should fix this issue.
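
For illustration, on a stock systemd host the change I have in mind would be a drop-in along these lines (a sketch only; the path is hypothetical, and whether anything like this is possible on bottlerocket is exactly what I'm asking below):

# /etc/systemd/system/containerd.service.d/limits.conf (hypothetical drop-in path)
[Service]
LimitNOFILE=1048576

followed by a systemctl daemon-reload and a restart of containerd.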

I also found that the mysql:8 image does not exhibit this issue.

So, a couple of questions:

  1. This doesn't necessarily seem like a bottlerocket bug, but I wanted to at least report the issue and see if it is something that could be addressed, especially given the behavior change compared to the AL2 EKS AMI.
  2. Is there any friendly way of making that kind of modification to the containerd unit file without rebuilding the image myself? Even on an already-running box I had trouble testing this because I couldn't find a way to edit the file after sudo sheltie.

What I expected to happen:
I would expect the mysql container to start and use a reasonable amount of memory.

What actually happened:
The mysql container attempts to consume many gigabytes of RAM and is OOM killed when resource limits are placed on it.

How to reproduce the problem:

Run an EKS cluster with bottlerocket nodes and apply the manifest below. The mysql pod will go into a restart loop, being OOM killed over and over. If you remove the resource limits, you can see the pod consume up to 16 GB of RAM while doing nothing.

apiVersion: v1
kind: Pod
metadata:
  name: mysql-test
  namespace: default
spec:
  containers:
    - name: mysql
      imagePullPolicy: Always
      image: 'mysql:5.7'
      env:
        - name: MYSQL_DATABASE
          value: test
        - name: MYSQL_ROOT_PASSWORD
          value: test
      resources:
        limits:
          cpu: 500m
          memory: 500Mi
        requests:
          cpu: 500m
          memory: 300Mi
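
To confirm which limit the container actually inherits, something like the following should work once the pod is running (a sketch, assuming the pod name from the manifest above; the exact numbers depend on the node's containerd unit):

kubectl exec mysql-test -- sh -c 'ulimit -Sn; ulimit -Hn'   # soft and hard NOFILE limits inside the container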

Image I'm using:
bottlerocket 1.0.1 / x86_64 in us-west-2 - ami-0d222c28fd8edb9ff

@zmrow
Contributor

zmrow commented Oct 15, 2020

Thanks for filing this issue! The AL2 EKS AMI sets LimitNOFILE=1048576, which is why you're not seeing the problem there. The issue stems from cri-containerd not providing a mechanism to override the default containerd limit for a container; the default limit is inherited from the containerd process itself (which, in this case, is configured via our unit file).
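
For anyone who wants to confirm this on a node, the inherited value can be checked from a host shell (sudo sheltie) with something like the following; this is a sketch and the exact output varies by release:

systemctl show containerd --property=LimitNOFILE

The "Max open files" row of /proc/<containerd pid>/limits shows the same limit from the process side, which is what container processes end up inheriting.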

Based on the comments in the containerd unit file, setting LimitNOFILE to non-zero can cause performance issues. Given this knowledge, we’d prefer not to change our default setting for containerd since the issue (as of now) looks to be scoped to MySQL and as you mentioned, appears to be fixed in a later version of MySQL. That being said, we’d like to work upstream to help provide better mechanisms to control these limits.

There is also an open Kubernetes issue to make limits configurable.

@zlangbert
Author

Hey @zmrow, thanks for the response! That makes sense and is what I suspected was going on. I agree with your position that changing the default in bottlerocket is not a good solution, though I wonder if providing it as a config option would at least make sense so that people coming from the AL2 EKS AMI have a way to mitigate this issue?

It would be great to see this supported upstream, though it seems like an issue that has been open for a long time without much movement.

As for the other part of my question, is there a recommended way to change this in bottlerocket? It seems like this type of thing might be possible by defining a custom container, or would I definitely need a custom build? Any guidance on the best way to do that would be appreciated.

@zmrow
Contributor

zmrow commented Oct 21, 2020

I did a little digging in the upstream code and found what I think are the right spots to make the change. A good option could be to provide a reasonable, configurable default of 1048576 in containerd-cri. The criService type has a criconfig.Config, which contains the PluginConfig. It seems reasonable to add the field to PluginConfig and then set it in the DefaultConfig() method.
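
Roughly, the shape of the change looks like the sketch below. The field and key names are made up for illustration and are not the actual cri source:

// Sketch only -- illustrative names, not the real containerd/cri fields.
package criconfig

// PluginConfig would gain a knob for the RLIMIT_NOFILE applied to containers,
// rather than containers inheriting whatever limit the containerd process has.
type PluginConfig struct {
    // ... existing fields elided ...

    // ProcessRLimitNoFile is the NOFILE limit to set on container processes
    // (hypothetical field name).
    ProcessRLimitNoFile uint64 `toml:"process_rlimit_no_file" json:"processRLimitNoFile"`
}

// DefaultConfig would then carry the 1048576 default mentioned above.
func DefaultConfig() PluginConfig {
    return PluginConfig{
        // ... existing defaults elided ...
        ProcessRLimitNoFile: 1048576,
    }
}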

We’ll take a look at the feasibility of carrying a short-term patch to our containerd package, while we investigate a proper upstream fix.

@zmrow
Contributor

zmrow commented Oct 23, 2020

@zlangbert Just to follow up here - I created a local patch for cri to be evaluated by the team. If accepted it should be in our next Bottlerocket release. The intent is to upstream this patch, but as mentioned in the PR, there is some additional work to be done.

@samuelkarp samuelkarp added the type/bug Something isn't working label Nov 13, 2020
@bcressey bcressey added the status/needs-triage Pending triage or re-evaluation label Dec 11, 2020
@jhaynes jhaynes removed the status/needs-triage Pending triage or re-evaluation label Dec 11, 2020
@zlangbert
Author

@zmrow We have switched back to bottlerocket and have had no trouble with 1.0.3. Thank you so much for your help!

@zmrow
Contributor

zmrow commented Dec 14, 2020

@zlangbert Awesome, glad to hear it. I'll close this issue for now, but feel free to re-open or start a new issue should you see further problems.

I've opened #1240 to track the upstream-ing of our cri patch.

@zmrow zmrow closed this as completed Dec 14, 2020
@rverma-dev

@zmrow we are also facing issues with the rlimit on our Elasticsearch clusters; the current open files limit is shown as (-n) 65536. Is there a way to use the increased limit? The clusters are bootstrapped using eksctl.
