Panic: runtime error. Unsure if resources request/limits are causing it #214

Closed
amalagaura opened this issue Jun 14, 2021 · 5 comments · Fixed by #215
Labels
bug Something isn't working

Comments

@amalagaura

Describe the bug
Getting an intermittent panic (runtime error).

To Reproduce
I played with resource requests and limits, and this may be what triggers it, but I'm unsure since the pod does not seem to be hitting any limits.

We also applied resources and limits as follows:

resources:
  requests:
    cpu: 50m
    memory: 75Mi
  limits:
    cpu: 500m
    memory: 512Mi

We are running it across many different applications. We use the name and semver image update strategies and an internal Docker registry. The error seems to happen right after log lines about a project whose image list has 120+ tags, but we have seen it after other image lists as well.

Expected behavior
Be able to use resource requests and limits without the image updater panicking.

Additional context
Using the install.yaml manifest with patches applied to add image secrets for our registry and to change the interval to 1m0s in the container args.
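
For reference, a minimal sketch of the interval patch as a kustomize overlay. The deployment name, the args path, the --interval flag, and the manifest URL are assumptions based on the stock install.yaml (kustomize v4-style patches field); the registry secret patch is omitted:

# kustomization.yaml -- illustrative sketch only; names and paths are assumptions
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - https://raw.githubusercontent.com/argoproj-labs/argocd-image-updater/v0.9.4/manifests/install.yaml
patches:
  - target:
      kind: Deployment
      name: argocd-image-updater
    patch: |-
      # append the interval flag to the container's args
      - op: add
        path: /spec/template/spec/containers/0/args/-
        value: --interval=1m0s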

Version
0.9.4

Logs

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x8 pc=0x17224ac]

goroutine 6951 [running]:
github.com/argoproj-labs/argocd-image-updater/pkg/tag.ImageTagList.Add(0xc000f67980, 0xc0009c4f40, 0x0)
        /src/argocd-image-updater/pkg/tag/tag.go:98 +0x5c
github.com/argoproj-labs/argocd-image-updater/pkg/registry.(*RegistryEndpoint).GetTags.func1(0xc000d1ba40, 0xc0011a5880, 0x20a0f80, 0xc00000e5b8, 0xc000f10580, 0xc00000e5c0, 0xc000584120, 0xc000f10570, 0xc0003d08c0, 0xc0005ac6f0, ...)
        /src/argocd-image-updater/pkg/registry/registry.go:157 +0x2a0
created by github.com/argoproj-labs/argocd-image-updater/pkg/registry.(*RegistryEndpoint).GetTags
        /src/argocd-image-updater/pkg/registry/registry.go:121 +0x95e
@amalagaura amalagaura added the bug Something isn't working label Jun 14, 2021
@jannfis
Contributor

jannfis commented Jun 15, 2021

Hey, thanks for this report @amalagaura

I think this is not caused by resource limits, but rather by a more or less complicated race condition in the code that fetches tag metadata concurrently. CPU throttling may change execution timing just enough for the problem to surface.
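
To illustrate the kind of failure meant here, a minimal sketch, not the actual argocd-image-updater code: type names and the fetch helper are simplified stand-ins, but it shows how a concurrently fetched, possibly-nil tag can reach a shared list's Add() and produce a nil pointer dereference like the one in the trace above.

// Illustrative sketch only -- not the actual argocd-image-updater code.
package main

import (
	"fmt"
	"sync"
)

type ImageTag struct {
	TagName string
}

type ImageTagList struct {
	mu   sync.Mutex
	tags map[string]*ImageTag
}

func NewImageTagList() *ImageTagList {
	return &ImageTagList{tags: make(map[string]*ImageTag)}
}

// Add stores a tag; a nil tag slipping in from a failed fetch panics here.
func (l *ImageTagList) Add(tag *ImageTag) {
	l.mu.Lock()
	defer l.mu.Unlock()
	l.tags[tag.TagName] = tag
}

// fetchTagMeta stands in for a per-tag registry metadata request that may fail,
// e.g. when the registry throttles or a manifest cannot be read.
func fetchTagMeta(name string) (*ImageTag, error) {
	if name == "broken" {
		return nil, fmt.Errorf("could not fetch metadata for %s", name)
	}
	return &ImageTag{TagName: name}, nil
}

func main() {
	list := NewImageTagList()
	var wg sync.WaitGroup
	for _, name := range []string{"v1.0.0", "v1.1.0", "broken"} {
		wg.Add(1)
		go func(n string) {
			defer wg.Done()
			tag, err := fetchTagMeta(n)
			// Checking err (and nil) before Add avoids the panic; dropping
			// this check reproduces a crash very similar to the trace above.
			if err != nil || tag == nil {
				return
			}
			list.Add(tag)
		}(name)
	}
	wg.Wait()
	fmt.Printf("collected %d tags\n", len(list.tags))
}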

This is very hard to debug when it's not reproducible. Are you able to share some more of the logs from when this panic happens? Does it happen regularly?

Any more information would be greatly appreciated.

@jannfis
Contributor

jannfis commented Jun 15, 2021

We just released v0.9.5 with a possible fix for the issue you are seeing.

I'll keep this issue open for feedback for the next 14 days.

@amalagaura
Author

amalagaura commented Jun 15, 2021

Thanks @jannfis. I am able to reproduce this very reliably with low CPU limits: a limit of 200m results in a crash in under 60 seconds. Some of our image repositories have over 100 tags, so we are going to start some cleanup procedures. I don't remember why I set such a low limit, but it does trigger the issue. I listed 500m in the ticket because we did see one crash with that setting, but it has been stable since then.

So I will leave it at 200m, upgrade to 0.9.5, and let you know if there are still issues.
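
For anyone else trying to reproduce this, a sketch of the limit override we use: a strategic-merge patch pinning the CPU limit to 200m, with deployment and container names assumed to match the stock install.yaml.

# Illustrative patch only; deployment/container names are assumptions
apiVersion: apps/v1
kind: Deployment
metadata:
  name: argocd-image-updater
spec:
  template:
    spec:
      containers:
        - name: argocd-image-updater
          resources:
            requests:
              cpu: 50m
              memory: 75Mi
            limits:
              cpu: 200m        # low enough to trigger heavy throttling and the panic
              memory: 512Mi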

@jannfis
Contributor

jannfis commented Aug 2, 2021

@amalagaura Any update on this issue from your side?

@jannfis
Contributor

jannfis commented Aug 2, 2021

Will close for now - feel free to reopen if this still happens for you.

@jannfis jannfis closed this as completed Aug 2, 2021