Panic: runtime error. Unsure if resource requests/limits are causing it #214
Comments
Hey, thanks for this report @amalagaura. I think this is not caused by resource limits, but rather may be a more-or-less complicated race condition in the code that fetches tag metadata concurrently. The CPU throttling may alter execution speed just enough for the problem to surface. This is extremely hard to debug when it's not reproducible. Are you able to share more of the logs from when this panic happens? Does it happen regularly? Any more information would be greatly appreciated.
We just released v0.9.5 with a possible fix for the issue you are seeing. I will keep this issue open for 14 days pending feedback.
Thanks @jannfis. I am able to reproduce this very reliably with low CPU limits: 200m results in a crash within 60 seconds. Some of our image repositories have over 100 tags, so we are going to start some cleanup procedures. I don't remember why I set such a low limit, but it does cause the issue. I put 500m in the ticket because we did see one crash with that setting, though it has been reliable since. I will leave it at 200m, upgrade to 0.9.5, and let you know if there are still issues.
@amalagaura Any update on this issue from your side?
Will close for now - feel free to reopen if this still happens for you. |
Describe the bug
The image updater occasionally panics with a runtime error.
To Reproduce
I experimented with resource requests and limits, and the panic may be related to them, but I am unsure, since the pod does not appear to be hitting any of its limits.
We also applied resources and limits as follows:
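A minimal sketch of such a stanza, assuming the 200m CPU limit discussed in the comments and placeholder memory values:

```yaml
# Illustrative values only; 200m is the CPU limit reported to reproduce the
# crash quickly, while 500m was largely stable.
resources:
  requests:
    cpu: 100m      # assumed request value
    memory: 64Mi   # assumed
  limits:
    cpu: 200m      # reported to trigger the panic within about 60 seconds
    memory: 128Mi  # assumed
```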
We are running it across many different applications, using the name and semver image update strategies against an internal Docker registry. The error seems to occur right after log lines for an application whose image list has 120+ tags, but we have also seen it after other image lists.
Expected behavior
Be able to set resource requests and limits without the pod panicking.
Additional context
We deploy the install.yaml manifest with patches applied to add image pull secrets for our registry and to change the interval to 1m0s in the container args; a sketch of the interval patch is shown below.
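A hedged sketch of what the interval patch could look like as a kustomize overlay; the deployment name, file layout, and exact flag syntax are assumptions rather than the patch actually used, and the registry pull-secret patch is omitted:

```yaml
# kustomization.yaml -- illustrative only; names and flag syntax are assumptions
resources:
  - install.yaml          # the upstream argocd-image-updater install manifest
patches:
  - target:
      kind: Deployment
      name: argocd-image-updater   # assumed deployment name
    patch: |-
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --interval=1m0s
```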
Version
0.9.4
Logs