Thread-related crashes when triggering HaxeObject callbacks from Java code in a Lime/OpenFL Extension #983

Closed
bazzisoft opened this issue May 17, 2017 · 15 comments · Fixed by #1534


@bazzisoft

bazzisoft commented May 17, 2017

While upgrading some OpenFL extension libraries to work with the latest OpenFL, we've run into an old issue again:

https://github.com/jgranick/openfl-native/issues/145#issuecomment-59254331

The Extension.callbackHandler.post() method appears to run on the Android UI thread rather than the Haxe thread. As a result, using it to trigger HaxeObject callbacks that access OpenFL objects such as the stage causes an immediate thread-related crash.

The old fix was to trigger the HaxeObject callbacks via GLSurfaceView.queueEvent(). However, looking through the latest source, it seems GLSurfaceView is no longer used, so that function is no longer available.

Is there a new way to have HaxeObject callbacks run on the Haxe (SDLMain?) thread instead of the Android UI thread?

Or alternatively, is there a safe way to cause the desired thread switch in Haxe code? (We tried having the HaxeObject callback start a Timer and triggering UI changes off that, but the crash remains...)

@bazzisoft bazzisoft changed the title Thread-related crashes when calling Haxe from an Android Java Extension Thread-related crashes when triggeing Haxe callbacks from Java code in a Lime/OpenFL Extension May 17, 2017
@bazzisoft bazzisoft changed the title Thread-related crashes when triggeing Haxe callbacks from Java code in a Lime/OpenFL Extension Thread-related crashes when triggering Haxe callbacks from Java code in a Lime/OpenFL Extension May 17, 2017
@bazzisoft bazzisoft changed the title Thread-related crashes when triggering Haxe callbacks from Java code in a Lime/OpenFL Extension Thread-related crashes when triggering HaxeObject callbacks from Java code in a Lime/OpenFL Extension May 17, 2017
@bazzisoft
Author

bazzisoft commented May 18, 2017

Update to the above: using a haxe.Timer doesn't prevent the crash, but going through the EventDispatcher system with openfl.Timer does seem to jump it onto the right thread. Doesn't seem quite right, but by using only local variables up to the point the EventDispatcher system is triggered we may be bypassing the issue.

Still would be nice to have a Handler to switch to the Haxe thread for Java > Haxe callbacks though I think?

@jgranick
Member

I believe the purpose of Extension.callbackHandler is for making calls to JNI from the Java thread, so if the callbacks are executing on the wrong thread, let's see how we can fix the handler to work properly.

Does it make a difference if it is static, or how do you think it needs to change?

@bazzisoft
Author

Looking through the Java source code templates in lime I think there are 2 cases that need to be covered:

  1. JNI calls made from Haxe into Java initially run on the Haxe thread, and usually need to switch to the Android main/UI thread for most native calls to work.

  2. JNI callbacks made from Java back into Haxe via HaxeObject would usually run on the Android main/UI thread (for example, if triggered from Activity.onActivityResult()). In these cases the callback needs to switch back to the Haxe thread to avoid crashes when manipulating OpenFL objects.

The Extension.callbackHandler appears to be created in GameActivity.onCreate() (see GameActivity.java:99). So I believe it would bind to the Android main/UI thread and provide a solution for (1) above. I think I've seen some examples around recommending the use of Extension.callbackHandler for this case, and changing it to another thread would probably break this.

So my current thinking is that we need another Handler available to extensions, but this time bound to the Haxe thread to solve (2). Looking at the code I think it would have to be created in SDLActivity.java:1031 inside the SDLMain runnable. And then assigned back to the SDLActivity so it can be made available to the Extension class.

I'm not sure how we could implement this "correctly" though given that SDLMain probably shouldn't be tweaking the internals of SDLActivity.
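
To make the two cases concrete, here is a minimal sketch in plain Java that needs no Android classes. The single-thread executors named `ui` and `haxe` are hypothetical stand-ins for the Android main/UI thread and the SDLMain thread, and `ThreadHopSketch` and its methods are illustrative names, not part of Lime. It only demonstrates that a task runs on the thread that owns the queue it was posted to, which is the same property a Handler gets from the Looper it is bound to.

```java
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/**
 * Models the two thread hops with plain executors. "ui" and "haxe" are
 * hypothetical stand-ins for the Android main/UI thread and the SDL/Haxe thread.
 */
final class ThreadHopSketch {
    /** Runs a task on the given executor and reports which thread executed it. */
    static String threadThatRuns(ExecutorService target) {
        try {
            return target.submit(() -> Thread.currentThread().getName()).get();
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        }
    }

    /** Posts from the "haxe" thread to "ui" (case 1) and back again (case 2). */
    static String[] roundTrip() {
        ExecutorService ui = Executors.newSingleThreadExecutor(r -> new Thread(r, "ui"));
        ExecutorService haxe = Executors.newSingleThreadExecutor(r -> new Thread(r, "haxe"));
        try {
            // Case 1: code on the Haxe thread hands a native call to the UI thread.
            String case1 = haxe.submit(() -> threadThatRuns(ui)).get();
            // Case 2: code on the UI thread hands the callback back to the Haxe thread.
            String case2 = ui.submit(() -> threadThatRuns(haxe)).get();
            return new String[] { case1, case2 };
        } catch (InterruptedException | ExecutionException e) {
            throw new IllegalStateException(e);
        } finally {
            ui.shutdown();
            haxe.shutdown();
        }
    }
}
```

`ThreadHopSketch.roundTrip()` returns the names of the threads that executed each hop: `ui` for case 1 and `haxe` for case 2.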

@hypergeome

Is there any solution for this yet? I am currently calling haxe.Timer.delay when handling a callback from an OpenFL extension to update the OpenFL GUI, but I'm not sure whether it works.

@player-03
Contributor

player-03 commented May 29, 2022

Having looked into this, a Handler isn't the solution. Handler relies on Looper, which is Android's version of an event loop. This is a problem because you can't reasonably have more than one event loop per thread, and we already have MainLoop.

The solution, therefore, is to use the loop we have. You can do this with Timer.delay(), and it does seem to be thread-safe, but it also allocates a bunch of extraneous objects. A more efficient solution is MainLoop.runInMainThread(), which does exactly what it says.

@player-03
Contributor

Or not. MainLoop doesn't seem to process those events while Lime is running. Maybe it's waiting for Lime to return?

This is exactly why you don't try to run two event loops on the same thread. Each tries to loop forever, while expecting the other to return at regular intervals.

So... MainLoop isn't our main loop. Ok, good to know. I'll go dig through Lime's code and see what we can use instead.
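
For illustration, the general idea of reusing an existing loop instead of adding a second one can be sketched in plain Java. Everything here (`FrameCallbackQueue`, `post`, `drain`) is a hypothetical name, not Lime's or Haxe's actual mechanism: other threads only enqueue work, and the thread that already owns the frame loop drains the queue once per iteration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * A callback queue drained by an existing frame loop. All names are
 * hypothetical; this is not Lime's or Haxe's implementation.
 */
final class FrameCallbackQueue {
    private final ConcurrentLinkedQueue<Runnable> pending = new ConcurrentLinkedQueue<>();

    /** Safe to call from any thread, for example the Android UI thread. */
    void post(Runnable callback) {
        pending.add(callback);
    }

    /** Called once per frame by the thread that owns the loop; returns how many callbacks ran. */
    int drain() {
        int ran = 0;
        Runnable next;
        while ((next = pending.poll()) != null) {
            next.run();
            ran++;
        }
        return ran;
    }

    /** Demo: a background thread posts, and the calling "loop" thread runs the callback. */
    static List<String> demo() {
        FrameCallbackQueue queue = new FrameCallbackQueue();
        List<String> ranOn = new ArrayList<>();
        Thread poster = new Thread(
            () -> queue.post(() -> ranOn.add(Thread.currentThread().getName())), "ui");
        poster.start();
        try {
            poster.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        // One "frame" of the existing loop: the callback executes here, not on "ui".
        queue.drain();
        return ranOn;
    }
}
```

Because `drain()` returns as soon as the queue is empty, it never competes with the surrounding loop for control of the thread, which is the failure mode described above for two loops that each expect to run forever.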

@hypergeome

Actually, I know Lime's haxe.Timer is not thread-safe, but I modified it to be thread-safe with a Mutex. I just needed to modify 2 files:

  1. haxe\lib\lime\7,9,0\src\haxe\Timer.hx
  2. haxe\lib\lime\7,9,0\src\lime_internal\backend\native\NativeApplication.

Also, for extensionkit, modify the CreateAndDispatchEvent method in ExtensionKit.hx to use haxe.Timer.

Still, this is just my solution for hxcpp. Just FYI.

@player-03
Contributor

Right, I was looking at the wrong version of Timer. The version we use on threaded targets is also the version that isn't thread-safe. Yay!

I'm currently sitting on a pull request to make MainLoop compatible with Lime; once I submit it, we can use MainLoop.runInMainThread().

@player-03
Contributor

Pull request submitted!

@hypergeome

hypergeome commented May 31, 2022

Sorry to ask this here, I don't know where else to ask.

So, I have an Android app using OpenFL. The app has a thread that simply does a null access.
I built it in debug mode with the following flags inside project.xml:

<define name="openfl-enable-handle-error" if="debug" />
<haxedef name="HXCPP_CHECK_POINTER" if="debug" /> <!--makes null references cause errors-->
<haxedef name="HXCPP_STACK_TRACE" if="debug" />
<haxedef name="HXCPP_STACK_LINE" if="debug" />
<haxedef name="HXCPP_DEBUG_LINK" if="debug" />

When it crashes, I get the crash log from the device and use ndk-stack to symbolize it. It did not trace to the exact null access line; instead, it traced to the thread wrapper call, as shown below:

********** Crash dump: **********
Build fingerprint: 'samsung/m11qnnxx/m11q:11/RP1A.200720.012/M115FXXU3BVD1:user/release-keys'
pid: 6534, tid: 6815, name: SDLThread >>> net.ent.contactmanager <<<
signal 6 (SIGABRT), code -1 (SI_QUEUE), fault addr --------
Stack frame 05-31 15:55:47.545 6820 6820 F DEBUG : #00 pc 00065668 /apex/com.android.runtime/lib/bionic/libc.so (abort+172) (BuildId: 13bc715234d0861084dc092396cf9938)
Stack frame 05-31 15:55:47.546 6820 6820 F DEBUG : #1 pc 0203df3b /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so (__gnu_cxx::__verbose_terminate_handler()+230): Routine __gnu_cxx::__verbose_terminate_handler() at /usr/local/google/buildbot/src/android/ndk-r15-release/toolchain/gcc/gcc-4.9/libstdc++-v3/libsupc++/vterminate.cc:95
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #2 pc 02012ee1 /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so (__cxxabiv1::__terminate(void ()())+4): Routine __cxxabiv1::__terminate(void ()()) at /usr/local/google/buildbot/src/android/ndk-r15-release/toolchain/gcc/gcc-4.9/libstdc++-v3/libsupc++/eh_terminate.cc:47
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #3 pc 02012fe9 /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so (std::terminate()+8): Routine std::terminate() at /usr/local/google/buildbot/src/android/ndk-r15-release/toolchain/gcc/gcc-4.9/libstdc++-v3/libsupc++/eh_terminate.cc:57 (discriminator 1)
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #4 pc 0201317b /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so (__cxa_throw+110): Routine __cxa_throw at /usr/local/google/buildbot/src/android/ndk-r15-release/toolchain/gcc/gcc-4.9/libstdc++-v3/libsupc++/eh_throw.cc:87
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #5 pc 01e162f8 /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so: Routine hx::Throw(Dynamic) at C:/HaxeToolkit/haxe/lib/hxcpp/4,2,1/src/hx/StdLibs.cpp:66 (discriminator 4)
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #6 pc 019b8e64 /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so: Routine _hx_run at D:\git\ent\haxe\ContactManager\bin\android\obj/./src/util/_ENTThread/HaxeThread.cpp:129 (discriminator 2)
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #7 pc 019b8f24 /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so: Routine __run at D:\git\ent\haxe\ContactManager\bin\android\obj/./src/util/_ENTThread/HaxeThread.cpp:139
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #8 pc 01df2880 /data/app/~~NjrXS18t4Ysa7VLsTF0gIg==/net.ent.contactmanager-Oial7NoNCEBS9suVVI9BIQ==/lib/arm/libApplicationMain.so: Routine hxThreadFunc(void*) at C:/HaxeToolkit/haxe/lib/hxcpp/4,2,1/src/hx/Thread.cpp:267 (discriminator 2)
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #9 pc 000b0567 /apex/com.android.runtime/lib/bionic/libc.so (__pthread_start(void*)+40) (BuildId: 13bc715234d0861084dc092396cf9938)
Stack frame 05-31 15:55:47.547 6820 6820 F DEBUG : #10 pc 00066b37 /apex/com.android.runtime/lib/bionic/libc.so (__start_thread+30) (BuildId: 13bc715234d0861084dc092396cf9938)

Don't pay attention to the ENTThread name. It is actually this:
https://github.com/HaxeFoundation/haxe/blob/development/std/cpp/_std/sys/thread/Thread.hx

I copied the content over because I use an old Haxe (4.0.5) but need the event loop feature of the latest Haxe Thread.

So, as you can see, it traces right to line 130 in the Thread.hx file, but not to my null access line.

So, I'm not sure what I need to do so that ndk-stack traces to the exact cpp code line that causes the crash. My question is similar to the following one:

https://community.haxe.org/t/cpp-target-crashes-without-any-error-messages/2785/4

@openfl openfl deleted a comment from hypergeome May 31, 2022
@player-03
Contributor

Check in /data/tombstones for more detailed crash dumps.

$ adb shell ls /data/tombstones
tombstone_01 tombstone_02 tombstone_03 tombstone_04
tombstone_05 tombstone_06
$ adb pull /data/tombstones/tombstone_06 tombstone_06
$ ndk-stack -i tombstone_06 -sym .

@hypergeome

Yes, I did use the tombstone, but there is still no info about the crash line in the thread_func, so I'm not sure what I am missing. @@

@player-03
Contributor

Then I think it's time to break out lldb. Recall that Export/android/bin is a fully-functional Android project that Android Studio can open, so in theory the instructions should work as written.

@hypergeome

Ok. Thank you.

@player-03
Contributor

Oh wait, I just went and re-read your use case. All this time I thought you were debugging a specific error that only happened on Android, rather than trying to figure out how to approach the problem for future reference.

For future reference, I suggest debugging the cpp or hl target if you can reproduce the crash. Compilation is faster, and you have more options for debuggers. VS Code even has extensions to attach a debugger to a running program, though I've never gotten those to work personally.
