-
-
Notifications
You must be signed in to change notification settings - Fork 21.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add multithreaded Godot Engine Bullet processing. #40073
Conversation
04ffebc
to
137de67
Compare
Here is the proposal |
722b422
to
f6c7b50
Compare
modules/bullet/space_bullet.cpp
Outdated
} | ||
|
||
dispatcher = bulletnew(GodotCollisionDispatcher(collisionConfiguration)); | ||
broadphase = bulletnew(btDbvtBroadphase); | ||
solver = bulletnew(btSequentialImpulseConstraintSolver); | ||
int32_t process_count = OS::get_singleton()->get_processor_count() - 2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you explain your reasoning for hardcoding the number of used threads? I am not sure how exactly multithreading is implemented in bullet and how well it scales, but I am guessing that after some point (e.g. machines with >=16/32) It would make more sense to keep some of them free for stuff like graphics or audio.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm assuming that 2 is the number of hyperthreads that consist of one core. I then subtract it from the list of logical threads the os sees. The main thing is to not step on the gameplay thread (Main).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing this would work fine as long as there are not too many other threads out there. However, this might fail terribly if a dev writes a heavily parallelized game. In the end, it all comes down to how the threads are distributed.
I personally would probably set a max limit of physics threads, but tbh it probably won't ever really matter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we end up integrating this feature, the best would be to have the max number of threads in project settings. As you mentioned, the right value really depends on how threads are used for the whole project.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The project configuration will default to 0 and any integer number above 0 will be the limit of core threads.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, some sort of config option would be nice. However, I am not sure if we want to use integers since the dev has no way of knowing what hardware players have. It's not a perfect solution, but what about using percentages?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure a percent where it is a float number from 0 to 1? Have to see how other settings handle this.
Testing by @pouleyKetchoupp showed that my gut feel of single threaded physics is faster than multicore. Not sure where the bottleneck is. |
Could you link us the produced data and information on how exactly this test works? Really hope that's the problem, because all other reasons that come to mind would be an absolute nightmare to debug. |
Here's the demo scene I was using for tests: Godot target: release_debug So far I've only checked the integrated profiler in Godot to compare the spike in physics, and I got ~1s in single-threaded mode and ~1.2s in multi-threaded mode. It might be because this stress test is not pushing the physics in the areas that are parallelized (there's only 100 bodies at once in this specific test, all added in the same spot to generate lots of contacts). It could be also that Bullet doesn't do such a good job at parallelizing, so it needs more investigation to see if there are some parameters to change that could give better results. So I'm planning to check with more detailed profiling and with more different scenarios. I'll share more updates here. |
100 bodies sounds like a very small problem, which would favor the sequential algorithm. |
The scenario I've tested is a stress test for the solver, and it's important since that part of Bullet supports parallelization. The number of bodies is limited on purpose, so you can specifically identify performance problems in that part of the physics engine. I've started with this scenario because I had a test scene ready for it. I agree though about the fact a scenario like the one you describe is needed too. It will help to stress out other parts of the physics engine that scale up with the number of bodies in the simulation. |
} | ||
|
||
dispatcher = bulletnew(GodotCollisionDispatcher(collisionConfiguration)); | ||
broadphase = bulletnew(btDbvtBroadphase); | ||
solver = bulletnew(btSequentialImpulseConstraintSolver); | ||
int32_t process_count = OS::get_singleton()->get_processor_count() - 2; | ||
solver_pool = bulletnew(btConstraintSolverPoolMt(process_count)); | ||
solver_mutithread = bulletnew(btSequentialImpulseConstraintSolverMt); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
classbtSequentialImpulseConstraintSolverMt is the single island solver. To gain performance may be helpful to tune this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I spent the last half hour reading through the forums on pybullet.org. I only found a handful of mentions of other solvers, and each and every time users were warned against using them since they were badly optimized.
As far as I can tell from what little information I found btSequentialImpulseConstraintSolverMt is the way to go.
Would appreciate it if someone who knows more about bullet could confirm this.
I'm stalled. Not sure where to go with this. |
I'm still up for testing more things when I find time and see where to go from there. Maybe you can convert it to draft for now since it's still in evaluation phase? |
Ok, so I spent some more time looking into Bullet Physics. (I think) I am making steady progress. @fire , could it be that you forgot to update the Godot collision dispatcher? Right now it extends btCollisionDispatcher, which is fine for single-threaded physics, but we will probably need btCollisionDispatcherMt for multithreading. If we want to support both we will probably need a second godot_collision_dispatcher. |
@Ansraer I update[d] the pr. However, I can't test this in 4.0 at all. The 3.2 branch is testable but different. |
I just built this in 4.0 and created a little test scene for it. What I did was to spawn a new cube every 100 ticks and let it drop onto a plane. |
3fdadd4
to
dd27de0
Compare
Action items:
Edited: Done. |
Anyone willing to create a 4.0 sample test? |
185cae4
to
a85ab9d
Compare
a85ab9d
to
dfeb2d0
Compare
@pouleyKetchoupp can use this as a reference but will be closing the pr since I won't be updating it and I didn't see a performance increase. |
Test case godotengine/godot-demo-projects#483.
See #40072.
There's a crash.
#40064 needs to have a proposal written.
Run test case, on Win10 when in 4-5th test it crashes.
Soft body is disabled when using multicore.