Poor multithreading performance when calling functions within a thread #58279

Open
qdeanc opened this issue Feb 18, 2022 · 12 comments · May be fixed by #98469

@qdeanc

qdeanc commented Feb 18, 2022

Godot version

3.4.2-stable

System information

Windows 10, Intel Core i7-8700K (6 cores, 12 threads)

Issue description

I've created a scene that tests the performance of Godot's multithreading:

[Image: ezgif-5-dcb88393d0]
Blue squares show the processing speeds of each thread.
The red square shows the combined processing speed of all threads.
Keys 1-6 start/stop threads. Use the mouse wheel to adjust the scale.

Speed is measured by running some logic in a loop and counting the number of iterations processed per microsecond.
(Exact speed values are not shown.)
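
As a rough illustration of that measurement, here is a minimal Godot 3.x sketch. It is not the code from the reproduction project: `BATCH`, `_running`, and `_speed` are hypothetical names, and the batch size is an arbitrary assumption.

```gdscript
# Godot 3.x sketch (hypothetical names; not the reproduction project's code).
extends Node

const BATCH := 100000  # iterations timed per measurement (assumed value)

var _thread := Thread.new()
var _running := true
var _speed := 0.0  # iterations per microsecond, written by the worker thread

func _ready() -> void:
    # Godot 3.x signature: start(instance, method, userdata)
    _thread.start(self, "_thread_function", 0)

func _thread_function(_thread_index: int) -> void:
    var some_val := 0
    while _running:
        var start := OS.get_ticks_usec()
        for _i in range(BATCH):
            some_val += 1  # the logic under test goes here
        var elapsed := OS.get_ticks_usec() - start
        if elapsed > 0:
            _speed = float(BATCH) / elapsed

func _exit_tree() -> void:
    # Simplification: _running and _speed are shared without a Mutex.
    _running = false
    _thread.wait_to_finish()
```

Timing a whole batch rather than a single iteration keeps the cost of `OS.get_ticks_usec()` itself small relative to the measured work. In Godot 4.x the equivalents would be `Time.get_ticks_usec()` and `Thread.start()` with a `Callable`.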

Standard GDScript Operations Work Fine

If the iteration logic of each thread uses only standard GDScript operations:

while(true):
    some_val += 1

Then the performance seems fine.
Adding a thread does not slow down other threads, and the total processing speed increases.
(Shown above)

Calling Functions Works Poorly

If the iteration logic of each thread calls a function:

while(true):
    _increment(some_val)

Then the performance suffers:

[Image: ezgif-5-9d2cfac093]

Adding a thread slows down other threads, and the total processing speed does not increase.
This happens for:

  • GDScript functions
  • Standard Module C++ functions (such as OpenSimplexNoise.get_noise_2d())
  • Custom Module C++ functions (such as Summator.add() from the tutorial Module)

Calling a Unique Function Per Thread

If the iteration logic of each thread calls a unique function:

while(true):
    call("_increment" + str(thread_index), some_val)

Then the performance does not suffer as much:

[Image: ezgif-5-7697ed75af]

Adding a thread slows down other threads, but the total processing speed increases.
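
For readers who don't want to open the project, the "unique function per thread" setup can be sketched roughly as follows (Godot 3.x). The function bodies are an assumption; only the `_increment` + index naming and the `call()` line come from the snippet above.

```gdscript
# Godot 3.x sketch; the _incrementN bodies are assumed, not taken from the project.
extends Node

# One identical function per thread index, so no two threads call the same function.
func _increment0(val: int) -> int:
    return val + 1

func _increment1(val: int) -> int:
    return val + 1

# ...one more _incrementN per additional thread...

func _thread_function(thread_index: int) -> void:
    var some_val = 0  # untyped, because call() returns a Variant
    while true:
        # Resolves to _increment0, _increment1, ... depending on the thread.
        some_val = call("_increment" + str(thread_index), some_val)
```

Note that this variant also builds the method name and dispatches through `call()` on every iteration, so its per-iteration cost is not directly comparable to the direct `_increment(some_val)` call in the previous test.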

Steps to reproduce

Here are the game builds I used for testing:
(Use keys 1-6 and mouse scroll wheel)

Threading_Test_No_Function.zip
Threading_Test_GDScript_Function.zip
Threading_Test_Module_Function.zip
Threading_Test_Custom_Module_Function.zip
Threading_Test_Unique_GDScript_Functions.zip

Download the minimal reproduction project and comment/uncomment lines in the _thread_function() function.
Make sure to export the game as an .exe before testing.

If you want to test the Summator.add() function, you'll need to build the engine from the 3.4.2-stable source code with the Summator module included:
summator.zip

Minimal reproduction project

Multithreading_Test.zip

@Chaosus
Member

Chaosus commented Feb 18, 2022

I think it's a duplicate of (or at least related to) #7832.

@CrezyDud
Contributor

Wouldn't it be normal, if 2 threads do the same thing, that each one has to do less?

@Calinou
Member

Calinou commented Feb 18, 2022

Can you test this on Godot 3.2.3, which is the last version before #45618 was merged?

In general, threading on Windows will always be less efficient than it is on Linux, but the multithreading modernization generally made creating threads more expensive than it was before.

@qdeanc
Author

qdeanc commented Feb 18, 2022

@Calinou
Just tested everything but the Summator Module function in 3.2.3. Getting the same results.

My concern is with how the use of functions can make the same logic run significantly slower.

@Calinou
Member

Calinou commented Feb 18, 2022

@qdeanc You could look into using a C++ profiler to see where the bottleneck is located in the Godot source code. Note that this requires compiling or downloading a debug build as it needs debug symbols.

@qdeanc
Author

qdeanc commented Feb 18, 2022

@Calinou Ok, I'll try that

@qdeanc
Author

qdeanc commented Feb 18, 2022

@Calinou Ok, I think I did it correctly.
For each test, I had 6 threads running the entire time:
[Image: Threading_Profiling]

I'm not sure how to read this yet; hopefully someone more familiar with this can take a look

@Calinou changed the title from "Poor Multithreading Performance" to "Poor multithreading performance when calling GDScript functions within a thread" on Mar 31, 2022
@Calinou changed the title from "Poor multithreading performance when calling GDScript functions within a thread" to "Poor multithreading performance when calling functions within a thread" on Mar 31, 2022

@pseidemann

Hi @qdeanc,
Did you gather any insights on the performance of the threads compared to the main thread? It seems like the main thread is "running" much faster, but I'm still debugging.

@pseidemann

It seems like MeshDataTool is much slower when used in a separate thread.
Might #56524 be related to all this?

@qdeanc
Author

qdeanc commented Nov 28, 2022

@pseidemann I haven't had any issues where one thread ran noticeably faster than another.

My understanding is that concurrent function calls in GDScript don't work well at all.
Calling a function simultaneously from multiple threads appears to slow everything down, as if the function can only be called from one thread at a time.

I'm now using C++ in GDNative where this isn't an issue. Everything I've written this way runs super fast and multithreading works great even for native Godot functions that I've tried.

@jchrom

jchrom commented May 15, 2023

I imported the original project @qdeanc created to Godot 4.0.2 (but ignored the modules) and reran the tests.

This yielded identical results - adding threads did not increase processing speed if concurrent function calling was used, and even calling a function unique to a thread resulted in weaker, rapidly diminishing speed gains.

In contrast, incrementing with the + operator scales at close to 100% efficiency, i.e. running 6 threads results in an almost 6x speed gain.

Machine: 14× 12th Gen Intel® Core™ i7-12700H, 15.4 GiB memory

I won't be able to debug it myself as C++ is not my forte, but wanted to provide updated test results.

@Amegatron

Amegatron commented Sep 6, 2023

Discovered this issue separately while doing some things in Godot v4.1.1.stable.official [bd6af8e].

Here is how to reproduce the problem:

  1. In an empty project, create a new 2D-scene.
  2. Add the following script to the root Node2D: https://gist.github.com/Amegatron/00800d618cd5c9b5ef36b3995086a2e8
  3. Launch the game and wait a bit so it fully loads.
  4. Press keys 1-9 to set the corresponding number of threads. Holding Shift or Ctrl adds 10 to that number.
  5. Press Enter, and you'll see debug messages in the console, showing the time taken.

Note that no inter-process communication is happening there. All work is fully internal to a thread. Also note that there are no function calls at all except for the timing calls, which are not part of the load, yet the time taken still increases significantly with more threads. In my case I'm doing float multiplications by a fixed number.
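
A minimal Godot 4.x sketch of this kind of test might look like the following. It is not the script from the linked gist: `ITERATIONS`, `_work`, and `run_threads` are hypothetical names, and the workload size and multiplier are arbitrary assumptions.

```gdscript
# Godot 4.x sketch (hypothetical; not the script from the linked gist).
extends Node2D

const ITERATIONS := 5_000_000  # assumed workload size

func _work(index: int) -> void:
    var start := Time.get_ticks_usec()
    var x := 1.0
    for _i in ITERATIONS:
        x *= 1.0000001  # thread-local float multiplication by a fixed number
    var elapsed_ms := (Time.get_ticks_usec() - start) / 1000.0
    print("thread %d: %.1f ms" % [index, elapsed_ms])

func run_threads(count: int) -> void:
    var threads: Array[Thread] = []
    for i in count:
        var t := Thread.new()
        t.start(_work.bind(i))
        threads.append(t)
    # Blocks the calling thread until every worker has finished.
    for t in threads:
        t.wait_to_finish()
```

Calling `run_threads(1)` and then `run_threads(6)` and comparing the per-thread times printed to the console is one way to check whether the same amount of thread-local work takes longer as more threads run.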
