Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimized physics object spawn time #39726

Merged
merged 1 commit into from
Jul 7, 2020

Conversation

AndreaCatania
Copy link
Contributor

  • Optimized physics object spawn time
  • Optimized shape usage when the shape is not scaled

fixes: #30027

The old bullet integration was 2 times slower compared to Godot physics when adding new bodies.
With this PR the timings got improved drastically (in the order of 1000 times).

(All measurements were done with release debug)

Before optimization:

Bullet Physics Add 4000 Body time: 66 seconds
Godot Physics Add 4000 Body time: 28 seconds

After optimization:

Bullet Physics Add 4000 Body time: ~0.07 seconds
Godot Physics Add 4000 Body time: 28 seconds

Bullet Physics Add 100000 Body time: ~6.5 seconds
Godot Physics Add 100000 Body time: Crash after ~4500 bodies

The tests got executed using:

extends Node

var shapes_per_frame := 4000
var shape := BoxShape3D.new()
var t1

func _ready():
	t1 = OS.get_unix_time()
	_do_test()


func _physics_process(delta):
	call_deferred("quit")


func quit():
	# Quit next frame.
	var t2 = OS.get_unix_time()
	print("Done in: ", t2 - t1)
	get_tree().quit()


func _do_test():
	print("Test start")
	for i in range(0, shapes_per_frame):
		var rb = RigidBody3D.new()
		var c = CollisionShape3D.new()
		c.set_shape(shape)
		rb.add_child(c)
		add_child(rb)
		if i % 50 == 0:
			print("Added: ", i)

@realkotob
Copy link
Contributor

realkotob commented Jun 21, 2020

I took a look at the changes and none of it seems to depend on 4.0 changes, so I believe this can apply to 3.2.x as well.

After this is merged I can make a PR for 3.2 if needed, ping me if it's needed.

@aaronfranke
Copy link
Member

This is a massive improvement, thank you for making this.

For testing, I rebased this on the 3.2 branch and opened the Voxel demo project (but with VSync turned off). Each frame the demo generates 1 chunk. As the world generates, the FPS gradually becomes lower and lower.

  • 3.2 branch: At render distance 6, min FPS is around 34. At render distance 10, min FPS is around 7.

  • With this commit on 3.2: At render distance 6, min FPS is around 127. At render distance 10, min FPS is around 14.

So the performance in my case is significantly increased, and the gains are exponentially higher when there are fewer chunks in total (and therefore the slowness mostly comes from generating new ones rather than processing existing ones).

Also, the above is comparing a debug build with a debug build. With 3.2 release builds, at render distance 6, min FPS is around 120, and at render distance 10, min FPS is around 25. So my guess is that there will be similar % improvements with release builds, which would result in really high FPS (maybe ~400 min FPS at render distance 6 and ~50-60 at render distance 10).

Copy link
Contributor

@madmiraal madmiraal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've not actually tested the code, because it seems like many others have already, but aside from the relatively minor changes I've suggested, I think it looks good. 👍


ShapeBullet::~ShapeBullet() {}
ShapeBullet::~ShapeBullet() {
bulletdelete(default_shape);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should check that default_shape is not a nullptr before deleting it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bulletdelete uses C++ delete so there is no point in checking for nullptr before calling it

ShapeBullet::~ShapeBullet() {}
ShapeBullet::~ShapeBullet() {
bulletdelete(default_shape);
default_shape = nullptr;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is no reason for setting values in the destructor.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason I added it is for code style coherency. It doesn't cause any issue, so I prefer have it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bulletdelete sets the pointer to nullptr after deleting it so this would be redundant. I looked for other uses of bulletdelete and I didn't see anyone setting the pointer to nullptr afterwards (modules/bullet/space_bullet.cpp for example)

void ShapeBullet::destroy_bt_shape(btCollisionShape *p_shape) const {
if (p_shape != default_shape && p_shape != old_default_shape) {
bulletdelete(p_shape);
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, we should ensure p_shape is not a nullptr before deleting it.


void ShapeBullet::destroy_bt_shape(btCollisionShape *p_shape) const {
if (p_shape != default_shape && p_shape != old_default_shape) {
bulletdelete(p_shape);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far as I can tell, old_default_shape is always a nullptr.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

During the shape refresh, done when a shape is changed, the old_default_shape contains the pointer of the old shape. This function is called in that phase to destroy the old shapes.

// At this point now one has the old default shape; just delete it.
bulletdelete(old_default_shape);
old_default_shape = nullptr;
}
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It appears as if old_default_shape is only used (and cleared) here; so I would just make it a local variable to this function.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variable old_default_shape is used to avoid destroy the old default shape by the callbacks triggered by owner->shape_changed(owner->find_shape(this));

virtual void reload_shapes();
void reload_shapes();
bool need_reload_shapes() const { return need_shape_reload; }
virtual void do_reload_shapes();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As far a I can tell, the need_reload_shapes() function is never used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's the getter of a variable, it's there for code completeness.

bool need_reload_body() const {
return need_body_reload;
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, need_reload_body() doesn't appear to be used.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but it's the getter of a variable, it's there for code completeness.

@@ -79,7 +79,7 @@ btTransform CollisionObjectBullet::ShapeWrapper::get_adjusted_transform() const
}

void CollisionObjectBullet::ShapeWrapper::claim_bt_shape(const btVector3 &body_scale) {
if (!bt_shape) {
if (bt_shape == nullptr) {
if (active) {
bt_shape = shape->create_bt_shape(scale * body_scale);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A matter of taste, but I prefer the original if (!bt_shape), or better yet, the opposite: if (bt_shape) return; to avoid needing to indent the code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The result is the same, but in this way it's much more clear that it's a pointer. The indentation just make the code more clear, in this case, for this reason I did it so.

@@ -88,6 +88,13 @@ void CollisionObjectBullet::ShapeWrapper::claim_bt_shape(const btVector3 &body_s
}
}

void CollisionObjectBullet::ShapeWrapper::release_bt_shape() {
if (bt_shape != nullptr) {
shape->destroy_bt_shape(bt_shape);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, I prefer if (!bt_shape) return;

shpWrapper->claim_bt_shape(body_scale);
mainShape = shpWrapper->bt_shape;
shapes_ptr[0].claim_bt_shape(body_scale);
mainShape = shapes_ptr[0].bt_shape;
main_shape_changed();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of shapes_ptr[0]., why not use shapes_ptr->?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the pointer of an array, and just above is used as so. I'm not using the dereferencing syntax to keep the code coherent and more clear.

@akien-mga akien-mga merged commit bd3a468 into godotengine:master Jul 7, 2020
@akien-mga
Copy link
Member

Thanks!

I assume this might be worth backporting to the 3.2 branch, would you make a cherry-pick PR?

@AndreaCatania AndreaCatania deleted the add_body_impr_physics branch July 7, 2020 15:07
@AndreaCatania
Copy link
Contributor Author

Yes, it would be a good backport. @asheraryam are you still up doing the back port?

need_shape_reload = true;
}

void RigidCollisionObjectBullet::do_reload_shapes() {
if (mainShape && mainShape->isCompound()) {
// Destroy compound
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Postponing the reloading of shapes until later causes crashes if those shapes are needed before do_reload_shapes() is run. #40311 and #40840 are just two simple examples. Even with additional triggers to call do_reload_shapes() (e.g. #40886 (or the suggested alternative) and #40252) the crashes still happen.

reload_shapes() is called for numerous reasons:

  • Creating a RigidBody3D
  • Creating an Area3D
  • Adding a Shape
  • Setting a Shape
  • Removing a Shape
  • Changing a Shape
  • Changing a RigidBody's size

Since each of these changes could be made before needing the shapes in a single user _physics_process() loop, there is no way to add sufficient calls to do_reload_shape() without effectively calling do_reload_shape() when calling reload_shapes(), which is exactly what this PR was attempting to avoid.

Furthermore, the test used to demonstrate the effectiveness of this PR is misleading, because it doesn't include the time for subsequently calling do_reload_shapes() on all the RigidBodies added. When including this time, the performance gains are eliminated.

Test project: 39726.zip

Using this test project with 1,000 RigidBodies and Shapes on my laptop I get these total times (i.e. time since _ready() was called):

Godot Physics:
Frame 1: 3.470007
Frame 2: 36.684109
Frame 3: 36.751202
Frame 4: 36.815403
Frame 5: 36.876218
Frame 6: 36.935695
Frame 7: 36.99797
Frame 8: 37.061041
Frame 9: 37.289435
Frame 10: 37.353653

Bullet Physics:
Frame 1: 0.096981
Frame 2: 17.908696
Frame 3: 30.998907
Frame 4: 35.428883
Frame 5: 38.835653
Frame 6: 40.655049
Frame 7: 41.806311
Frame 8: 42.736726
Frame 9: 43.267167
Frame 10: 43.71381

In other words, not only does this PR cause numerous crashes that may never be resolvable, it doesn't achieve the performance gain it set out to achieve; the slow Bullet Frame Rate is still there.

Therefore, I think this PR should be reverted.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the code used for the test:

extends Node

var count = 0

var shapes_per_frame := 1000
var shape := BoxShape3D.new()
var t1

func _ready():
	t1 = OS.get_unix_time()
	_do_test()

func _physics_process(delta):
	if count < 10:
		var t2 = OS.get_unix_time()
		count += 1
		print("Frame ", count, ": ", t2 - t1)

func _do_test():
	print("Test start")
	for i in range(0, shapes_per_frame):
		var rb = RigidBody3D.new()
		var c = CollisionShape3D.new()
		c.set_shape(shape)
		rb.add_child(c)
		add_child(rb)
		if i % 50 == 0:
			print("Added: ", i)

To be fair, I think that your test doesn't prove what you claim.

It adds 1000 RigibBody into the scene, into the _ready function. So to know how much time it taken to spawn those RigidBodies - you just need to check the first ever frame, that according to your measurements is slower in Godot Physics: Frame 1: 3.470007 vs Frame 1: 0.096981. All other frame measurements are useless to prove spawn time and so the effectiveness of this PR is proven.

However, is important to say that this PR want to optimize the performance relative to Bullet itself and not relative to GodotPhysics which was running already well.

I don't think that the introduced bugs are a good reason to revert the optimization done instead of fixing the issues. It's like say that the Vulkan implementation is bad because is half broken, or the GDScript v2 is bad because it's not yet fully stable: and so both should be reverted.

The mentioned bug, especially #40840, have a solution: #40886 (comment) please let me know if you want to do that change, if no I can provide a new PR to fix that.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not claiming anything. I'm refuting the claims made to support this PR; specifically that it solves #30027. I'm also highlighting why it has introduced numerous crashing issues, and suggesting that a PR that causes more problems than it solves should be reverted.

My test, which is a modified version of the original test provided to support this PR, shows that only looking at the time passed at the start of the first call to _physics_process() is misleading. It is misleading, because, at the time of the first call to _physics_process(), the calls to do_reload_shapes() have not occurred. They are called when the physics server's step() function is called, which happens after _physics_process().

However, more importantly, this PR does not solve issue #30027. The PR reduces the number of times that the function that reloads the shapes is called, but these calls are not the source of the performance problem. The primary source of the performance problem is in the step() function; more specifically, calls to btDiscreteDynamicsWorld::stepSimulation() i.e. within Bullet physics.

While the actual gains of this PR are minimal, the issues caused by this change are significant. The more I investigate the crashing issues caused, the more I'm convinced they will not be solved with multiple band-aid fixes like #40886 (comment) without ultimately reverting this change, or worse: actually increasing the number of calls to do_reload_shapes().

In summary, I believe this PR should be reverted, because it causes multiple crashing issues while doing little to improve actual performance.

Copy link
Contributor Author

@AndreaCatania AndreaCatania Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check this: #41082 (comment)
This PR fixes the remaining issues: #41096

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[Bullet] Adding a high amount of physics bodies is significantly slower with Bullet compared to GodotPhysics
6 participants