JIRA request timeout leads to out-of-memory death #369
This time around I noticed this in the console:
Thank you very much for reporting this, and in such a thorough manner. Much appreciated! A couple of questions:
I have a guess what's causing the spike in memory usage. The Material snack component is very resource-hungry for some reason (I should probably replace it with my own more easygoing implementation at some point). Currently all issues are polled for changes individually (also some room for optimization here :)), so 81 of those snacks are opened at the same time, which causes this insane spike. I'll try to provide a fix for this. I have no clue, however, what's making the requests fail in the first place, or whether that's an issue on SP's side of things.
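For illustration only, here is a minimal sketch (not the app's actual code) of one way to avoid a burst of simultaneous snacks: coalesce errors that arrive within a short window into a single summary snack. `openSnack`, the window length, and the message format are all hypothetical.

```ts
// Hypothetical sketch: coalesce a burst of per-issue error snacks into one
// summary snack instead of opening 81 of them at once. `openSnack` stands in
// for whatever snack/toast API the app actually uses.
type SnackMsg = string;

const BURST_WINDOW_MS = 2000;
let pending: SnackMsg[] = [];
let flushTimer: ReturnType<typeof setTimeout> | null = null;

function openSnack(msg: SnackMsg): void {
  console.log('[SNACK]', msg); // placeholder for the real snack component
}

export function queueErrorSnack(msg: SnackMsg): void {
  pending.push(msg);
  if (flushTimer === null) {
    flushTimer = setTimeout(() => {
      // One summary snack instead of N individual ones.
      openSnack(
        pending.length === 1
          ? pending[0]
          : `${pending.length} Jira requests failed (e.g. "${pending[0]}")`
      );
      pending = [];
      flushTimer = null;
    }, BURST_WINDOW_MS);
  }
}
```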
Phew, this was a hard nut to crack. My initial guess was wrong. What was causing this extraordinary amount of memory usage is stacktrace.js (stacktracejs/stacktrace.js#222). I think I fixed it by limiting the maximum number of times it can be called in short succession and by deactivating traces for errors that are expected to happen from time to time, like this one. The fix will be available with the next release.
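A minimal sketch of the rate-limiting idea described above, assuming a hypothetical error handler; all names here are illustrative, not the app's actual code.

```ts
// Hypothetical sketch of the mitigation: cap how often an expensive handler
// (standing in for stacktrace.js resolution) may run within a short window;
// excess calls fall back to cheap logging.
const MAX_CALLS = 5;
const WINDOW_MS = 10_000;
let callTimestamps: number[] = [];

export function limitedHandleError(err: Error): void {
  const now = Date.now();
  callTimestamps = callTimestamps.filter((t) => now - t < WINDOW_MS);
  if (callTimestamps.length >= MAX_CALLS) {
    // Too many errors in short succession: skip the expensive trace.
    console.error(err.message);
    return;
  }
  callTimestamps.push(now);
  // Here the real code would resolve the full trace (e.g. via stacktrace.js).
  console.error(err.stack);
}
```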
Missed your first comment @johannesjo, sorry about that. I'm not really sure why I get that initial request timeout failure that causes this condition. I can think of two possible causes:
If I restart SP immediately after the crash, the Jira sync works, so it really is just some strange temporary failure. You're correct that several Jira syncs can happen with no crashes. For instance, I've had SP open since starting work around 4 hours ago, and not a single crash has happened. But two nights ago, when I was doing some testing in prep for opening this issue, it would happen much faster (within ~30m) if I re-opened SP immediately after the crash. Anecdotally, the time until the next crash seems to decrease the longer I use SP, but that could be my imagination.

Also probably worth pointing out that this morning I threw some of my open tickets into the backlog (I don't know if this disables the sync for the ticket, but I figured I'd try it in case #1 really is the problem). I'll let you know if I make it through the workday with no crashes. If that happens, I guess that points to a "too many tickets" problem.

I haven't looked at the code, but I imagine that you're disabling Jira sync if you catch any request error when trying to sync. Perhaps this should only be triggered when Jira explicitly denies your request, e.g. by returning a 401. That way it could survive any internet blips. Anyway, probably a topic for a separate issue.

This is my Jira config, in case it's still useful:
lol, speak of the devil... just got the crash.
Quick update: the fix should be available in the edge channel now. Could you maybe try again with that? It also includes a little bit more helpful logging for the timeout error. About #1: Jira access is blocked on a 401 and when the timeout error happens. Thinking about it, the reason might be that the requests did indeed take longer than the 12 seconds defined. I remember building it like this because I personally encountered the case for a particular client where I got a never-resolving request instead of a 401 (security through obscurity...), and after a couple of those I got shut out completely. It might make sense to make this timeout configurable, though, or at least to set a bigger default value.
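As a hedged sketch of what a configurable timeout with an explicit 401 check might look like: the 12-second default and the block-on-401 rule come from the comment above, while the wrapper itself is illustrative and not the app's actual code.

```ts
// Minimal sketch: a request wrapper with a configurable timeout that
// distinguishes an explicit 401 (block access to avoid a lockout) from a
// plain timeout. An abort surfaces to the caller as an AbortError rejection,
// i.e. the timeout case, which could be treated as retryable.
const DEFAULT_TIMEOUT_MS = 12_000;

async function jiraRequest(
  url: string,
  timeoutMs: number = DEFAULT_TIMEOUT_MS
): Promise<Response> {
  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await fetch(url, { signal: controller.signal });
    if (res.status === 401) {
      // Explicit denial: block further Jira access.
      throw new Error('JIRA_UNAUTHORIZED');
    }
    return res;
  } finally {
    clearTimeout(timer);
  }
}
```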
Huh. Interesting!
Installed. I will report back. Thanks for such a quick turnaround!
Just setting a higher timeout value is not as easy as I thought, unfortunately, as the timeout value being lower than the delay for polling changes was what made it work in the first place. So I attempted to fix & improve the logic of the initial check request, but it's a little bit messy/complicated. I hope it doesn't break anything :D Normally I would just remove this weird workaround, but getting blocked from your company's Jira because of your to-do app is a case I want to avoid at all costs. The changes with the higher timeout value should be on the edge channel soon. Please let me know if this solves the issue with the failing requests.
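A small sketch of the constraint described here, with hypothetical names and values: whatever timeout is configured is clamped so it always stays below the polling delay.

```ts
// The request timeout has to stay below the polling interval, otherwise a
// slow request from one poll can still be in flight when the next poll fires.
function effectiveTimeout(configuredTimeoutMs: number, pollIntervalMs: number): number {
  // Clamp the user-configured timeout so it can never exceed the poll delay.
  const SAFETY_MARGIN_MS = 1000;
  return Math.min(configuredTimeoutMs, pollIntervalMs - SAFETY_MARGIN_MS);
}

// e.g. a 60 s configured timeout with a 30 s poll interval is clamped to 29 s:
console.log(effectiveTimeout(60_000, 30_000)); // 29000
```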
I can see I got the "request timed out" error and a bunch of "blocked" errors, and it didn't crash. So that's good. But it seems that each blocked access message results in a
Just updated to 622, I'll let you know if that fixes the request timeout.
That's exceedingly reasonable.
Excellent! Thank you very much for letting me know! :)
Your Environment
Expected Behavior
N/A
Current Behavior
On startup, everything seems fine:
Everything works for a while, but after a bit I'll get this:
This is then followed a few minutes later by a flood of "blocking access" messages:
At this point, PID 2624512 in the above `ps` output spikes in memory usage (> 4 GB). The UI becomes unresponsive: the "polling jira" message shows at the bottom of the screen, and its progress bar is moving, but nothing in the UI responds. If I have the console open, the debugger will pause execution with a message like "something something to avoid out-of-memory error." At this point it's around 4.4 GB RSS. If I just let it run without the debugger, I get a stack trace and core dump.
At this point, PID 2624512 in the above `ps` output crashes. All other processes remain alive. The window empties and shows nothing but blank white. I then have to manually kill the process, as trying to close the window does not work.

Steps to Reproduce (for bugs)
Logs
Truncated `journalctl` logs from a previous crash:

I have a 1 GB core dump in case it's useful.