Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

3.2.2 Dangling Variant Crash starting Jetpaca #39776

Closed
lawnjelly opened this issue Jun 23, 2020 · 14 comments
Closed

3.2.2 Dangling Variant Crash starting Jetpaca #39776

lawnjelly opened this issue Jun 23, 2020 · 14 comments

Comments

@lawnjelly
Copy link
Member

lawnjelly commented Jun 23, 2020

Godot version:

3.2.2 beta2 - RC3
I have bisected this crash was introduced with commit 5d5848d.

OS/device including version:

Linux Mint 18.3
Intel core i5 7500T
HD Graphics 630

Issue description:

Crash on starting Jetpaca. (My jetpaca zip file date is 2 March 2020)
This occurs on release and debug builds.

Call stack:

memdelete<Thread>
_Thread::wait_to_finish
MethodBind0R<_Thread, Variant>::call
Object::call
MessageQueue::_call_function
MessageQueue::flush
SceneTree::iteration
Main::iteration
OS_X11::run
main

Steps to reproduce:
Use 3.2.2 beta2 to RC3 to run Jetpaca project.
https://github.com/KOBUGE-Games/jetpaca

Akien reports it did not occur for him, so it may be a race condition.

@akien-mga
Copy link
Member

I can't reproduce it on Mageia 8 (also with HD Graphics 630), neither with 3.2.2 RC 2 nor a current debug build from 3.2 (7ec072f).

I tried cleaning up my ~/.local/share/godot/app_userdata/Jetpaca to be sure but it still works fine.

@lawnjelly
Copy link
Member Author

lawnjelly commented Jun 23, 2020

I've bisected it to this PR:
#38119
Commit 5d5848d
new_dangling_variant_fix

cc @RandomShaper

Maybe this is the race condition mentioned in the PR. It does sound like a race condition if it is occurring for me but not for Akien.

@lawnjelly lawnjelly changed the title 3.2.2 Crash starting Jetpaca 3.2.2 Dangling Variant Crash starting Jetpaca Jun 23, 2020
@akien-mga
Copy link
Member

Maybe this is the race condition mentioned in the PR. It does sound like a race condition if it is occurring for me but not for Akien.

For reference, here's how that race condition was described:

All that said, in the name of performance there's still one possible case of undefined behavior: in multithreaded scripts there would be a race condition between a thread freeing an Object and another one trying to operate on it. The latter may not realize the Object has been freed soon enough. But that's a case of bad scripting that was never supported anyway.

There's one use of threads in Jetpaca at https://github.com/KOBUGE-Games/jetpaca/blob/master/shared/music.gd#L24, that might be the issue. (You can check if disabling it fixes the crash.)

@lawnjelly
Copy link
Member Author

Yup, can confirm commenting out line 24 (and the one before creating the thread) fixes it.

@RandomShaper
Copy link
Member

Could you try inverting the order of these two lines (

	call_deferred("_change_music_to", loader.get_resource())
	options.thread.call_deferred("wait_to_finish")

) so they look like this (

	options.thread.call_deferred("wait_to_finish")
	call_deferred("_change_music_to", loader.get_resource())

) and see what happens?

@RandomShaper
Copy link
Member

To add a little piece of insight, in general it's not safe to access any object in Godot without proper synchronization. Some containers, etc. are thread-safe by themselves, but it's not the case for the universal Object base class.

That's true even before the dangling Variant fix. However, since it adds a little bit of additional bookkeeping, the chance of getting UB/crashes in the face of a lack of proper sync may be a bit bigger.

@lawnjelly
Copy link
Member Author

Still crashes after inverting order.

@RandomShaper
Copy link
Member

Thanks for testing. I'll give it a deeper look later today.

@lawnjelly
Copy link
Member Author

To add a little piece of insight, in general it's not safe to access any object in Godot without proper synchronization. Some containers, etc. are thread-safe by themselves, but it's not the case for the universal Object base class.

That's true even before the dangling Variant fix. However, since it adds a little bit of additional bookkeeping, the chance of getting UB/crashes in the face of a lack of proper sync may be a bit bigger.

Yup, as you say maybe the game is doing something unsafe anyway in the scripts and this is just making it more likely to occur.

@RandomShaper
Copy link
Member

I've not been able to reproduce the crash on Windows.

Tried both with and without debugger attached, debug and non-debug build.

I even tried being super unsafe in the usage of the loading thread.

@akien-mga
Copy link
Member

Still crashing in 3.2 a6ef6b1 on Linux on an Intel i7-8705G.

gdb backtrace:

instanced scene         [Control:48435] res://menu/stage_select.tscn
[New Thread 0x7fffe8058700 (LWP 259984)]
Adding levels...
Adding level    True    res://stages/world_1/intro.tscn
Adding level    True    res://stages/world_1/forest_fun.tscn
Adding level    True    res://stages/world_1/roller_coaster.tscn
Adding level    True    res://stages/gcon/demo_gcon_004.tscn
Adding level    True    res://stages/gcon/demo_gcon_006.tscn
added to tree
[Thread 0x7fffe8058700 (LWP 259984) exited]
[New Thread 0x7fffe8058700 (LWP 259986)]
load: res://stages/world_1/forest_fun.tscn
LOAD: FREEING
LOAD: START_PROGRESS
[Thread 0x7fffe8058700 (LWP 259986) exited]
--Type <RET> for more, q to quit, c to continue without paging--

Thread 1 "godot-3.2" received signal SIGSEGV, Segmentation fault.
0x000000000143db6c in memdelete<Thread> (p_class=0x1490) at ./core/os/memory.h:117
117                     p_class->~T();
(gdb) bt
#0  0x000000000143db6c in memdelete<Thread> (p_class=0x1490) at ./core/os/memory.h:117
#1  0x0000000003b3e059 in _Thread::wait_to_finish (this=0x787f460) at core/bind/core_bind.cpp:2814
#2  0x0000000001512377 in MethodBind0R<Variant>::call (this=0x5591210, p_object=0x787f460, p_args=0x0, p_arg_count=0, r_error=...) at ./core/method_bind.gen.inc:237
#3  0x00000000038e9190 in Object::call (this=0x787f460, p_method=..., p_args=0x0, p_argcount=0, r_error=...) at core/object.cpp:922
#4  0x00000000038dfde2 in MessageQueue::_call_function (this=0x55fca70, p_target=0x787f460, p_func=..., p_args=0x7ffff621c080, p_argcount=0, p_show_error=true) at core/message_queue.cpp:250
#5  0x00000000038e00a9 in MessageQueue::flush (this=0x55fca70) at core/message_queue.cpp:297
#6  0x0000000002b1f654 in SceneTree::idle (this=0x5953420, p_time=0.0166666675) at scene/main/scene_tree.cpp:537
#7  0x0000000001463a82 in Main::iteration () at main/main.cpp:2107
#8  0x0000000001432442 in OS_X11::run (this=0x7fffffffcf10) at platform/x11/os_x11.cpp:3233
#9  0x00000000014226bc in main (argc=1, argv=0x7fffffffd738) at platform/x11/godot_x11.cpp:56

Steps to reproduce:

  • Load Jetpaca's stage select
  • Select level 2
  • Go back (Escape)
  • Select level 2 again
  • Go back (Escape)
  • Repeat ad lib until it crashes

Sometimes I experienced it relatively easily by selecting Level 1 first, then exiting then Level 2 (crash on load), but other times I had to repeat many times on a single level to finally get it to crash.

@qarmin
Copy link
Contributor

qarmin commented Nov 22, 2020

Crash still happens with 3.2.4.beta.custom_build. 279f4fb
The easiest fix is to change:

options.thread.call_deferred("wait_to_finish")

to

options.thread.wait_to_finish()

Address sanitizer log

==157563==ERROR: AddressSanitizer: heap-use-after-free on address 0x613000066e78 at pc 0x000011d87c93 bp 0x7ffff256d4b0 sp 0x7ffff256d4a8
WRITE of size 8 at 0x613000066e78 thread T0
    #0 0x11d87c92 in Dictionary::_unref() const /mnt/Miecz/godot3.2/core/dictionary.cpp:189:5
    #1 0x11d8a2e4 in Dictionary::~Dictionary() /mnt/Miecz/godot3.2/core/dictionary.cpp:284:2
    #2 0x122c7282 in Variant::clear() /mnt/Miecz/godot3.2/core/variant.cpp:1142:48
    #3 0x122c464a in Variant::reference(Variant const&) /mnt/Miecz/godot3.2/core/variant.cpp:921:4
    #4 0x122de6fe in Variant::operator=(Variant const&) /mnt/Miecz/godot3.2/core/variant.cpp:2588:3
    #5 0x12e9c364 in _Thread::wait_to_finish() /mnt/Miecz/godot3.2/core/bind/core_bind.cpp:2833:11
    #6 0x13089a61 in MethodBind0R<_Thread, Variant>::call(Object*, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/./core/method_bind.gen.inc:325:17
    #7 0x11fe9bbb in Object::call(StringName const&, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/core/object.cpp:919:17
    #8 0x11fb9c99 in MessageQueue::_call_function(Object*, StringName const&, Variant const*, int, bool) /mnt/Miecz/godot3.2/core/message_queue.cpp:250:12
    #9 0x11fbaf43 in MessageQueue::flush() /mnt/Miecz/godot3.2/core/message_queue.cpp:297:6
    #10 0xba96514 in SceneTree::iteration(float) /mnt/Miecz/godot3.2/scene/main/scene_tree.cpp:483:33
    #11 0x1ef4736 in Main::iteration() /mnt/Miecz/godot3.2/main/main.cpp:2080:45
    #12 0x1db92a0 in OS_X11::run() /mnt/Miecz/godot3.2/platform/x11/os_x11.cpp:3410:7
    #13 0x1d29f94 in main /mnt/Miecz/godot3.2/platform/x11/godot_x11.cpp:56:6
    #14 0x7f7ab99d70b2 in __libc_start_main /build/glibc-ZN95T4/glibc-2.31/csu/../csu/libc-start.c:308:16
    #15 0x1c7f2ad in _start (/usr/bin/godotls+0x1c7f2ad)

0x613000066e78 is located 312 bytes inside of 360-byte region [0x613000066d40,0x613000066ea8)
freed by thread T0 here:
    #0 0x1cf776d in free (/usr/bin/godotls+0x1cf776d)
    #1 0x126d6c69 in Memory::free_static(void*, bool) /mnt/Miecz/godot3.2/core/os/memory.cpp:178:3
    #2 0x1decdfc in void memdelete<Reference>(Reference*) /mnt/Miecz/godot3.2/./core/os/memory.h:119:2
    #3 0x1dec72e in Ref<Reference>::unref() /mnt/Miecz/godot3.2/./core/reference.h:279:4
    #4 0x120fc05e in RefPtr::unref() /mnt/Miecz/godot3.2/core/ref_ptr.cpp:90:7
    #5 0x122c7220 in Variant::clear() /mnt/Miecz/godot3.2/core/variant.cpp:1129:20
    #6 0x1d2b988 in Variant::~Variant() /mnt/Miecz/godot3.2/./core/variant.h:444:29
    #7 0x11d8d76d in Pair<Variant const*, Variant>::~Pair() /mnt/Miecz/godot3.2/./core/pair.h:35:8
    #8 0x11d8d704 in List<Pair<Variant const*, Variant>, DefaultAllocator>::Element::~Element() /mnt/Miecz/godot3.2/./core/list.h:51:8
    #9 0x11d8d694 in void memdelete_allocator<List<Pair<Variant const*, Variant>, DefaultAllocator>::Element, DefaultAllocator>(List<Pair<Variant const*, Variant>, DefaultAllocator>::Element*) /mnt/Miecz/godot3.2/./core/os/memory.h:128:13
    #10 0x11d8d501 in List<Pair<Variant const*, Variant>, DefaultAllocator>::_Data::erase(List<Pair<Variant const*, Variant>, DefaultAllocator>::Element const*) /mnt/Miecz/godot3.2/./core/list.h:174:4
    #11 0x11d8c989 in List<Pair<Variant const*, Variant>, DefaultAllocator>::erase(List<Pair<Variant const*, Variant>, DefaultAllocator>::Element const*) /mnt/Miecz/godot3.2/./core/list.h:369:22
    #12 0x11d8c6e8 in List<Pair<Variant const*, Variant>, DefaultAllocator>::clear() /mnt/Miecz/godot3.2/./core/list.h:405:4
    #13 0x11d8c4b4 in List<Pair<Variant const*, Variant>, DefaultAllocator>::~List() /mnt/Miecz/godot3.2/./core/list.h:726:3
    #14 0x11d94c76 in OrderedHashMap<Variant, Variant, VariantHasher, VariantComparator, (unsigned char)3, (unsigned char)8>::~OrderedHashMap() /mnt/Miecz/godot3.2/./core/ordered_hash_map.h:47:7
    #15 0x11d94bfd in DictionaryPrivate::~DictionaryPrivate() /mnt/Miecz/godot3.2/core/dictionary.cpp:37:8
    #16 0x11d8c094 in void memdelete<DictionaryPrivate>(DictionaryPrivate*) /mnt/Miecz/godot3.2/./core/os/memory.h:117:13
    #17 0x11d87c74 in Dictionary::_unref() const /mnt/Miecz/godot3.2/core/dictionary.cpp:187:3
    #18 0x11d8a2e4 in Dictionary::~Dictionary() /mnt/Miecz/godot3.2/core/dictionary.cpp:284:2
    #19 0x122c7282 in Variant::clear() /mnt/Miecz/godot3.2/core/variant.cpp:1142:48
    #20 0x122c464a in Variant::reference(Variant const&) /mnt/Miecz/godot3.2/core/variant.cpp:921:4
    #21 0x122de6fe in Variant::operator=(Variant const&) /mnt/Miecz/godot3.2/core/variant.cpp:2588:3
    #22 0x12e9c364 in _Thread::wait_to_finish() /mnt/Miecz/godot3.2/core/bind/core_bind.cpp:2833:11
    #23 0x13089a61 in MethodBind0R<_Thread, Variant>::call(Object*, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/./core/method_bind.gen.inc:325:17
    #24 0x11fe9bbb in Object::call(StringName const&, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/core/object.cpp:919:17
    #25 0x11fb9c99 in MessageQueue::_call_function(Object*, StringName const&, Variant const*, int, bool) /mnt/Miecz/godot3.2/core/message_queue.cpp:250:12
    #26 0x11fbaf43 in MessageQueue::flush() /mnt/Miecz/godot3.2/core/message_queue.cpp:297:6
    #27 0xba96514 in SceneTree::iteration(float) /mnt/Miecz/godot3.2/scene/main/scene_tree.cpp:483:33
    #28 0x1ef4736 in Main::iteration() /mnt/Miecz/godot3.2/main/main.cpp:2080:45
    #29 0x1db92a0 in OS_X11::run() /mnt/Miecz/godot3.2/platform/x11/os_x11.cpp:3410:7

previously allocated by thread T0 here:
    #0 0x1cf79ed in malloc (/usr/bin/godotls+0x1cf79ed)
    #1 0x126d516a in Memory::alloc_static(unsigned long, bool) /mnt/Miecz/godot3.2/core/os/memory.cpp:82:14
    #2 0x126d4f64 in operator new(unsigned long, char const*) /mnt/Miecz/godot3.2/core/os/memory.cpp:42:9
    #3 0x1213b4c0 in Object* ClassDB::creator<_Thread>() /mnt/Miecz/godot3.2/./core/class_db.h:143:10
    #4 0x11cf74e7 in ClassDB::instance(StringName const&) /mnt/Miecz/godot3.2/core/class_db.cpp:559:9
    #5 0x21f2a74 in GDScriptNativeClass::instance() /mnt/Miecz/godot3.2/modules/gdscript/gdscript.cpp:82:9
    #6 0x21f24db in GDScriptNativeClass::_new() /mnt/Miecz/godot3.2/modules/gdscript/gdscript.cpp:69:14
    #7 0x2286031 in MethodBind0R<GDScriptNativeClass, Variant>::call(Object*, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/./core/method_bind.gen.inc:325:17
    #8 0x11fe9bbb in Object::call(StringName const&, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/core/object.cpp:919:17
    #9 0x1234c7e8 in Variant::call_ptr(StringName const&, Variant const**, int, Variant*, Variant::CallError&) /mnt/Miecz/godot3.2/core/variant_call.cpp:1129:14
    #10 0x2470c91 in GDScriptFunction::call(GDScriptInstance*, Variant const**, int, Variant::CallError&, GDScriptFunction::CallState*) /mnt/Miecz/godot3.2/modules/gdscript/gdscript_function.cpp:1088:12
    #11 0x2225ee1 in GDScriptInstance::call(StringName const&, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/modules/gdscript/gdscript.cpp:1238:21
    #12 0x11fe95cf in Object::call(StringName const&, Variant const**, int, Variant::CallError&) /mnt/Miecz/godot3.2/core/object.cpp:898:26
    #13 0x1234c7e8 in Variant::call_ptr(StringName const&, Variant const**, int, Variant*, Variant::CallError&) /mnt/Miecz/godot3.2/core/variant_call.cpp:1129:14
    #14 0x2470d59 in GDScriptFunction::call(GDScriptInstance*, Variant const**, int, Variant::CallError&, GDScriptFunction::CallState*) /mnt/Miecz/godot3.2/modules/gdscript/gdscript_function.cpp:1091:12
    #15 0x22270e6 in GDScriptInstance::_ml_call_reversed(GDScript*, StringName const&, Variant const**, int) /mnt/Miecz/godot3.2/modules/gdscript/gdscript.cpp:1269:13
    #16 0x222744b in GDScriptInstance::call_multilevel_reversed(StringName const&, Variant const**, int) /mnt/Miecz/godot3.2/modules/gdscript/gdscript.cpp:1276:3
    #17 0xb936b96 in Node::_notification(int) /mnt/Miecz/godot3.2/scene/main/node.cpp:149:28
    #18 0x20e234e in Node::_notificationv(int, bool) /mnt/Miecz/godot3.2/./scene/main/node.h:46:2
    #19 0x11fd8bd6 in Object::notification(int, bool) /mnt/Miecz/godot3.2/core/object.cpp:929:2
    #20 0xb940b9d in Node::_propagate_ready() /mnt/Miecz/godot3.2/scene/main/node.cpp:196:3
    #21 0xb94057a in Node::_propagate_ready() /mnt/Miecz/godot3.2/scene/main/node.cpp:188:21
    #22 0xb96264e in Node::_set_tree(SceneTree*) /mnt/Miecz/godot3.2/scene/main/node.cpp:2571:4
    #23 0xba956eb in SceneTree::init() /mnt/Miecz/godot3.2/scene/main/scene_tree.cpp:463:8
    #24 0x1db8ff8 in OS_X11::run() /mnt/Miecz/godot3.2/platform/x11/os_x11.cpp:3397:13
    #25 0x1d29f94 in main /mnt/Miecz/godot3.2/platform/x11/godot_x11.cpp:56:6
    #26 0x7f7ab99d70b2 in __libc_start_main /build/glibc-ZN95T4/glibc-2.31/csu/../csu/libc-start.c:308:16

SUMMARY: AddressSanitizer: heap-use-after-free /mnt/Miecz/godot3.2/core/dictionary.cpp:189:5 in Dictionary::_unref() const

@akien-mga
Copy link
Member

Still occasionally crashing in a18df71. I'm wary of working it around in Jetpaca's code as this might bite other users who wouldn't expect the engine to crash on them.

If the problematic code in Jetpaca is not supported, we need to make it very clear in the documentation that such misuse can lead to crashes.

[1] /lib64/libc.so.6(+0x3b580) [0x7f2dc9d80580] (??:0)
[2] void memdelete<Thread>(Thread*) (/home/akien/Projects/godot/godot-3.2/./core/os/memory.h:117)
[3] _Thread::wait_to_finish() (/home/akien/Projects/godot/godot-3.2/core/bind/core_bind.cpp:2833)
[4] MethodBind0R<Variant>::call(Object*, Variant const**, int, Variant::CallError&) (/home/akien/Projects/godot/godot-3.2/./core/method_bind.gen.inc:238 (discriminator 4))
[5] Object::call(StringName const&, Variant const**, int, Variant::CallError&) (/home/akien/Projects/godot/godot-3.2/core/object.cpp:919 (discriminator 1))
[6] MessageQueue::_call_function(Object*, StringName const&, Variant const*, int, bool) (/home/akien/Projects/godot/godot-3.2/core/message_queue.cpp:250)
[7] MessageQueue::flush() (/home/akien/Projects/godot/godot-3.2/core/message_queue.cpp:299)
[8] SceneTree::idle(float) (/home/akien/Projects/godot/godot-3.2/scene/main/scene_tree.cpp:538)
[9] Main::iteration() (/home/akien/Projects/godot/godot-3.2/main/main.cpp:2104)
[10] OS_X11::run() (/home/akien/Projects/godot/godot-3.2/platform/x11/os_x11.cpp:3608)
[11] godot-3.2(main+0xfa) [0x138c54c] (/home/akien/Projects/godot/godot-3.2/platform/x11/godot_x11.cpp:57)
[12] /lib64/libc.so.6(__libc_start_main+0xea) [0x7f2dc9d6ce4a] (??:0)
[13] godot-3.2(_start+0x2a) [0x138c3aa] (??:?)

@akien-mga
Copy link
Member

The crash seems fixed by #45618 🎉

Now it throws a non-crashing error instead, rightly pointing out that the Jetpaca code is doing something fishy (as we found out previouly since changing Jetpaca code could fix the crash):

ERROR: ~_Thread: Reference to a Thread object was lost while the thread is still running...
   At: core/bind/core_bind.cpp:2847.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants