Panic: runtime error. Unsure if resources request/limits are causing it #214

Closed
amalagaura opened this issue Jun 14, 2021 · 5 comments · Fixed by #215
Labels
bug Something isn't working

Comments

@amalagaura

Describe the bug
Getting an intermittent panic (runtime error).

To Reproduce
I played with resource requests and limits, and this may be what triggers it, but I'm unsure since the pod does not seem to be hitting any limits.

We also applied resources and limits as follows:

resources:
  requests:
    cpu: 50m
    memory: 75Mi
  limits:
    cpu: 500m
    memory: 512Mi

We are running it across many different applications. We use the name and semver image update strategies and an internal Docker registry. The error seems to happen right after log lines about a project whose image list has 120+ tags, but we have seen it after other image lists as well.

Expected behavior
Be able to use resource requests and limits without the image updater panicking.

Additional context
Using the install.yaml manifest with patches applied to add image secrets for our registry and to change the interval to 1m0s in the container args.
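
For reference, a minimal sketch of the interval patch as a kustomize overlay. The deployment name, the args path, the --interval flag, and the manifest URL are assumptions based on the stock install.yaml (kustomize v4-style patches field); the registry secret patch is omitted:

# kustomization.yaml -- illustrative sketch only; names and paths are assumptions
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - https://raw.githubusercontent.com/argoproj-labs/argocd-image-updater/v0.9.4/manifests/install.yaml
patches:
  - target:
      kind: Deployment
      name: argocd-image-updater
    patch: |-
      # append the interval flag to the container's args
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --interval=1m0s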

Version
0.9.4

Logs

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x17224ac]

goroutine 6951 [running]:
github.com/argoproj-labs/argocd-image-updater/pkg/tag.ImageTagList.Add(0xc000f67980, 0xc0009c4f40, 0x0)
        /src/argocd-image-updater/pkg/tag/tag.go:98 +0x5c
github.com/argoproj-labs/argocd-image-updater/pkg/registry.(*RegistryEndpoint).GetTags.func1(0xc000d1ba40, 0xc0011a5880, 0x20a0f80, 0xc00000e5b8, 0xc000f10580, 0xc00000e5c0, 0xc000584120, 0xc000f10570, 0xc0003d08c0, 0xc0005ac6f0, ...)
        /src/argocd-image-updater/pkg/registry/registry.go:157 +0x2a0
created by github.com/argoproj-labs/argocd-image-updater/pkg/registry.(*RegistryEndpoint).GetTags
        /src/argocd-image-updater/pkg/registry/registry.go:121 +0x95e
@amalagaura amalagaura added the bug Something isn't working label Jun 14, 2021
@jannfis
Contributor

jannfis commented Jun 15, 2021

Hey, thanks for this report @amalagaura

I think this is not caused by resource limits, but rather by a more or less complicated race condition in the code that fetches tag metadata concurrently. CPU throttling may change execution timing just enough for the problem to surface.
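
To illustrate the kind of failure meant here, a minimal sketch, not the actual argocd-image-updater code: type names and the fetch helper are simplified stand-ins, but it shows how a concurrently fetched, possibly-nil tag can reach a shared list's Add() and produce a nil pointer dereference like the one in the trace above.

// Illustrative sketch only -- not the actual argocd-image-updater code.
package main

import (
	"fmt"
	"sync"
)

type ImageTag struct {
	TagName string
}

type ImageTagList struct {
	mu   sync.Mutex
	tags map[string]*ImageTag
}

func NewImageTagList() *ImageTagList {
	return &ImageTagList{tags: make(map[string]*ImageTag)}
}

// Add stores a tag; a nil tag slipping in from a failed fetch panics here.
func (l *ImageTagList) Add(tag *ImageTag) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.tags[tag.TagName] = tag
}

// fetchTagMeta stands in for a per-tag registry metadata request that may fail,
// e.g. when the registry throttles or a manifest cannot be read.
func fetchTagMeta(name string) (*ImageTag, error) {
	if name == "broken" {
		return nil, fmt.Errorf("could not fetch metadata for %s", name)
	}
	return &ImageTag{TagName: name}, nil
}

func main() {
	list := NewImageTagList()
	var wg sync.WaitGroup
	for _, name := range []string{"v1.0.0", "v1.1.0", "broken"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			tag, err := fetchTagMeta(n)
			// Checking err (and nil) before Add avoids the panic; dropping
			// this check reproduces a crash very similar to the trace above.
			if err != nil || tag == nil {
				return
			}
			list.Add(tag)
		}(name)
	}
	wg.Wait()
	fmt.Printf("collected %d tags\n", len(list.tags))
}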

This is very hard to debug when it's not reproducible. Are you able to share some more of the logs from when this panic happens? Does it happen regularly?

Any more information would be greatly appreciated.

@jannfis
Contributor

jannfis commented Jun 15, 2021

We just released v0.9.5 with a possible fix for the issue you are seeing.

I'll keep this issue open for feedback for the next 14 days.

@amalagaura
Author

amalagaura commented Jun 15, 2021

Thanks @jannfis. I am able to reproduce this very reliably with low CPU limits: a limit of 200m results in a crash in under 60 seconds. Some of our image repositories have over 100 tags, so we are going to start some cleanup procedures. I don't remember why I set such a low limit, but it does trigger the issue. I listed 500m in the ticket because we did see one crash with that setting, but it has been stable since then.

So I will leave it at 200m, upgrade to 0.9.5, and let you know if there are still issues.
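
For anyone else trying to reproduce this, a sketch of the limit override we use: a strategic-merge patch pinning the CPU limit to 200m, with deployment and container names assumed to match the stock install.yaml.

# Illustrative patch only; deployment/container names are assumptions
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-image-updater
spec:
  template:
    spec:
      containers:
        - name: argocd-image-updater
          resources:
            requests:
              cpu: 50m
              memory: 75Mi
            limits:
              cpu: 200m        # low enough to trigger heavy throttling and the panic
              memory: 512Mi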

@jannfis
Contributor

jannfis commented Aug 2, 2021

@amalagaura Any update on this issue from your side?

@jannfis
Contributor

jannfis commented Aug 2, 2021

Will close for now - feel free to reopen if this still happens for you.

@jannfis jannfis closed this as completed Aug 2, 2021