Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

iOS | Crashes on multiple calls to javascript #129

Closed
umangkathuria opened this issue Sep 18, 2019 · 27 comments
Closed

iOS | Crashes on multiple calls to javascript #129

umangkathuria opened this issue Sep 18, 2019 · 27 comments
Assignees
Labels

Comments

@umangkathuria
Copy link

I am running a very basic app with where in I am calling the JavaScript layer thru a loop. Multiple instances of Service are being created and started.

The issue is at times the application crashes with the error message - "There is an unhandled exception!
2019-09-18 13:09:43.551092+0530 LCDemo_ParellelCalls[8720:4039837] Error: Error: ENOENT: no such file or directory, open '/Data/Application/6B1CA06D-2740-40D2-A9DC-790697E5801A/Library/org.liquidplayer.node/_file%3A%2F%2F%2FUsers%2Fumang.kathuria%2FLibrary%2FDeveloper%2FCoreSimulator%2FDevices%2FCBBAD61D-3741-433A-8766-AD0242D71892%2Fdata%2FContainers%2FBundle%2FApplication%2F303DCDAB-3235-4662-98A2-C9EE7BD6CED8%2FLCDemo_ParellelCalls.app/module/liquid.bundle.js'"

Where as sometime the app crashes with the error message- malloc: *** error for object 0x102d85cd0: pointer being freed was not allocated

I am not sure what could be wrong, could you please help ?

@umangkathuria
Copy link
Author

@ericwlange
Copy link
Member

@umangkathuria Will take a look

ericwlange added a commit that referenced this issue Sep 22, 2019
@ericwlange
Copy link
Member

@umangkathuria The latest commit should fix this issue. Please let me know.

@umangkathuria
Copy link
Author

Tried with the last commit- 9335c1e and this is how my pod file looks now-

pod 'LiquidCore', :git => 'https://github.com/LiquidPlayer/LiquidCore.git', :commit => '9335c1e'

I am still facing the crash. Same as mentioned above.

@ericwlange
Copy link
Member

ericwlange commented Sep 23, 2019 via email

@umangkathuria
Copy link
Author

I did do that. So on the first couple of runs, everything goes fine. Its the third or fourth time when we hit the button, we get this error.

I dont know if it is related, but on another sample app, where I am doing the same kind of thing, i.e. calling JS Function 10-15 times through a loop, I now notice that I am getting a different error-"Assertion failed: (map), function New, file /Path-to-project/Pods/LiquidCore/LiquidCoreiOS/LiquidCore/V82JSC/Value.cpp, line 118.
(lldb) "

Do you think it might have been due to this change?
Thanks!

@ericwlange
Copy link
Member

ericwlange commented Sep 23, 2019 via email

@umangkathuria
Copy link
Author

@ericwlange - Please let us know if you've been able to diagnose this issue a bit.
Thanks!!

@ericwlange
Copy link
Member

@umangkathuria It is working for me. The only way I can get it to crash is if I click on the button a bunch of times one after another without waiting for some of the threads to finish. But this is because it launches hundreds of node threads and then eventually runs out of memory. Otherwise it is stable.

@umangkathuria
Copy link
Author

@ericwlange I am attaching a video. The behaviour is just not consistent. I have also noted that the memory consumption is way too much for the application.
Here's what I did-

  • I ran the app, clicked on the "Call JS Layer" button, which runs calls JS service 30 times in a loop.
  • It runs successfully but the memory consumption is way too much. Along to lines on 600-700 MB.
  • Then I click the button again once all the results are back. On second time, the application crashes. Memory usage went up to 1.2GB.

Screen Recording 2019-10-01 at 3.13.07 PM.mov.zip

@ericwlange
Copy link
Member

@umangkathuria I am seeing the memory consumption issue. All of the memory that is reserved by V82JSC, the layer that translates calls from V8 to JavaScriptCore, is getting properly deallocated from what I can see. So the issue is either node or JavaScriptCore is holding onto the memory.

I make one big assumption upon shutdown that may not actually be correct. It is that JSContextGroupRelease() releases all memory held by a context group (Isolate in V8 terms), regardless if there are outstanding references to JavaScript values in C++ or not. When I went through the JSC tests, this seemed to be the case, so I do not release held references on shutdown in an effort to be speedy. I can add some code to walk the heap and release references before shutdown to see if this has an effect.

The second thing that I realized is that JSString references are not connected to a context group, which means that any strings that are referenced in the V82JSC heap at shutdown would not get released, even if my assumption above is true. This is clearly a bug that, depending on the size of the string (node bootstrap code, for instance), could hog up a lot of memory and not release it.

The other possibility is that node itself is allocating some huge amount of memory upon startup and not releasing it. Node is not designed to run in discrete instances. When you run it on a desktop, it has one instance only. Spawning a new process actually calls out to the operating system and launches a whole new node application. The processes then communicate through sockets. I have hacked it so that you can run multiple instances from the same application. If node is relying on the OS to clean up after it, this may not be captured in my hack. This can be verified by running the same application on Android. If the same issue exists, then this is likely the problem. If it does not happen on Android, then it is probably references being held in V82JSC that are blitzed at shutdown, and this assumption is not valid.

So I will chase these down and see what I can find.

As an aside, I now have node 10.15.3 working and have been developing on that branch. It displays the same behavior.

@umangkathuria
Copy link
Author

Thanks @ericwlange for the update. Will be looking forward to see if you are able to find the root cause. Please let me know if there is any input needed from my end.

@ericwlange
Copy link
Member

@umangkathuria I spent much of the weekend banging my head against the wall on this. I found some memory leaks in V82JSC and some other issues that were causing spurious crashes and fixed them. It is definitely more stable now, however the memory usage is still a problem. I can see in the inspector that it does appear that JavaScriptCore does not shut down and is hogging all the memory. I don't yet understand why. I added the code to release all held references, and the thread is exiting correctly. Yet, the JS environment does not go away. So still searching ...

@ericwlange
Copy link
Member

@umangkathuria This may possibly be relevant:
https://stackoverflow.com/questions/21147613/objectivec-and-javascriptcore-will-using-this-method-of-calling-callbacks-cause

I am definitely violating this. Also,
https://developer.apple.com/documentation/javascriptcore/jsmanagedvalue

In particular, the big yellow warning "Important. Do not store a non-managed JSValue object in a native object that is exported to JavaScript. Because a JSValue object references its enclosing JSContext object, this action creates a retain cycle, keeping the context from being deallocated." sounds pretty important.

I have a very small amount of Objective C code, so it will be easy to fix this. I am not sure if this is the culprit but I will try it out tonight.

@umangkathuria
Copy link
Author

@ericwlange were you able to find a fix ?

@ericwlange
Copy link
Member

@umangkathuria Nope. Made no difference. Chasing some other ideas ...

ericwlange added a commit that referenced this issue Oct 13, 2019
- Upgraded to node v10.15.3 from v8.9.3 (#88)
- Fixed massive memory leak on iOS (#129)
@ericwlange
Copy link
Member

@umangkathuria Ok, I found the issue that was causing the biggest memory leak.

There were a couple of retain cycles that existed because delegates were being referenced as strong, rather than weak. Once I finally found that, the memory comes almost fully back to normal. There are, however, still some leaks that I think are coming from allocating node Buffers. So I will continue to chase that. But please try the latest and verify.

Also, I have now fully updated to v10.15.3 of node, so there may be some issues with that. But at the moment, your test app seems to be pretty solid.

@umangkathuria
Copy link
Author

eric, the project doesn't build anymore. Every time on build, I keep getting this is issue that says a header file is missing.

Here's a screenshot.
Screenshot 2019-10-15 at 4 23 52 PM

@ericwlange
Copy link
Member

@umangkathuria Can you try the latest? I just made a bunch of edits to LiquidCore.podspec. It was not passing linting, but is now.

@ericwlange
Copy link
Member

I also had to fix one of the scripts.

To be clear what I did:

$ mkdir -p ~/tmp
$ cd ~/tmp
$ git clone https://github.com/LiquidPlayer/LiquidCore.git
$ (cd LiquidCore; pod lib lint)

(The last line seems to be necessary when using a local dev pod in order to get the prepare_cmd scripts to run, but not necessary for a published pod.)

My podfile:

platform :ios, '11.0'
use_frameworks!

target 'LCDemo_ParellelCalls' do
  pod 'LiquidCore', :path => '~/tmp/LiquidCore/LiquidCore.podspec'
end
$ pod install

I then opened the workspace, built and ran it.

@umangkathuria
Copy link
Author

@ericwlange - The last line in the first step to build the Library locally, seems to be failing for me.
This is the error I get when I run the pod lib lint command-
**_$ (cd LiquidCore; pod lib lint)

-> LiquidCore (0.7.0)
- ERROR | [iOS] unknown: Encountered an unknown error (Could not find a ios simulator (valid values: ). Ensure that Xcode -> Window -> Devices has at least one ios simulator listed or otherwise add one.) during validation.

[!] LiquidCore did not pass validation, due to 1 error.
You can use the --no-clean option to inspect any issue._**

I don't have a device connected, do I need one for this? I do have simulators running though. Is there another step of configuration that I am not seeing?

@ericwlange
Copy link
Member

I don't know what that means.

The only thing you need out of the pod lib lint process is for the scripts to run. Did it generate LiquidCore/LiquidCore/src/ios/gen/polyfill.c and LiquidCore/LiquidCore/src/ios/gen/node_javascript.cc? If so, then don't worry about it for now and just install the pod and see if it builds.

If it did not generate those files, you can do it manually by running the scripts:

$ cd ~/tmp/LiquidCore/LiquidCore/src/ios
$ bash generate_javascript_polyfills.sh
$ bash generate_node_javascript.sh

And then build.

@ericwlange
Copy link
Member

@ericwlange ericwlange self-assigned this Oct 17, 2019
@umangkathuria
Copy link
Author

Okay, yes, I checked for the scripts and found they were there.
Upon building, the build was now a success.

So I am now noticing that the memory consumption has definitely reduced, in fact memory is released after the process ends. For calls in loop from 1 to 30, the application reached memory consumption of ~650 MBs.

The application runs fine a few times, however, when the button is clicked while the last operation is already processing, the memory consumption goes upto 1.25 GBs.
The application still seems to crash at times. It can be triggered if we change the loop to 50 times, or at times on 4th or 5th attempt with lower loop values. The crash errors are also of two kinds-

For scenario where loop count is too high-
There is an unhandled exception!
2019-10-22 11:54:05.246738+0530 LCDemo_ParellelCalls[17497:370154] Error: Error: ENOENT: no such file or directory, open '/home/module/liquid.bundle.js'
2019-10-22 11:54:05.247362+0530 LCDemo_ParellelCalls[17497:370154] Stack: Error: ENOENT: no such file or directory, open '/home/module/liquid.bundle.js'
at [email protected]:438:25
at [email protected]:343:43
at
at global code
at [native code]
at global code@[eval]:1:26
at [native code]
at runInThisContext@[native code]:2:284
at [eval]-wrapper:6:38
at _compile@internal/modules/cjs/loader.js:701:34
at evalScript@internal/bootstrap/node.js:589:35
at startup@internal/bootstrap/node.js:265:19
at bootstrapNodeJSCore@internal/bootstrap/node.js:622:10
at anon@[native code]:2:16
2019-10-22 11:54:05.258454+0530 LCDemo_ParellelCalls[17497:370154] There is an unhandled exception!
2019-10-22 11:54:05.258659+0530 LCDemo_ParellelCalls[17497:370154] Error: null
2019-10-22 11:54:05.262806+0530 LCDemo_ParellelCalls[17497:370154] Stack: undefined

For second kind, where we get the crash on 4th or 5th time button is clicked-

SystemError@internal/errors.js:65:10
NodeError@internal/errors.js:149:12
[email protected]:84:34
createWritableStdioStream@internal/process/stdio.js:167:35
getStdout@internal/process/stdio.js:20:39
console.js:454:18
compile@internal/bootstrap/loaders.js:364:9

I am also attaching a video of the test.
Screen Recording 2019-10-22 at 11.21.43 AM.1.mp4.zip

@umangkathuria
Copy link
Author

Hi @ericwlange.. Did you get a chance to look at this ?

@ericwlange
Copy link
Member

Hi @umangkathuria

I have not had a chance to look at it in detail, but I have had a chance to think about it.

So basically, each node process takes ~20MB of memory which it holds until the process releases. On the surface, this doesn't sound too bad to me. To determine whether this is excessive or not, your experiment should be repeated on Android. Since Android is built directly from the node + v8 source code, it should run as expected by node. The added complexity in iOS is the translation from v8 -> JavaScriptCore. That's where bugs will likely be introduced by LiquidCore. If the memory usage is substantially lower in Android, then this suggests one of two things:

  1. JavaScriptCore is less efficient than v8
  2. V82JSC is holding references longer than it should and not releasing them until process shutdown

If, however, Android also holds 20MB (or, say, more than 10) until the process is finished, then this suggests that this is a normal amount of memory reserved by node.

As an even simpler test, I just ran node on my Mac and then opened a few more terminals and did the same. Each node instance is holding 6MB in idle state:

Screenshot 2019-10-29 at 10 06 38

Now I did not run your test to see how much more memory it utilises, but that would be the next step. In any case, this suggests the lower bound for memory usage should be at least 6MB per instance.

So whether it should be 6MB or 20MB, I am not sure, but in any case you will eventually run out of memory if you keep launching more and more instances before previous ones shut down. To deal with this, I think the best solution is to implement a thread pool. So I would limit the total number of active processes to somewhere between 10 - 20 (probably configurable). If you attempt to launch a 21st, for example, it will block until one of the previous 20 finishes. This will limit memory usage.

As a workaround, and this may sound silly, but you probably should not be launching so many processes at once. I am not quite sure what your use case is, but node is a full operating environment. It is resource intensive. I would suggest limiting the number of new processes you launch and instead reuse existing processes. Startup and shutdown of processes is expensive, and I can't see any real advantage in doing it from a performance perspective.

@ericwlange
Copy link
Member

This is fixed (mostly) in 0.7.2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants