Optimize animation performance#110474
Conversation
6fb05e0 to
e6e5484
Compare
|
What kind of hardware did you use for those performance tests? Removing pretty much exactly 1 ms is very nice! I wonder how it scales on lower or higher end hardware. |
I tested it on my Surface Laptop Studio 2 with an i7-13700H |
e6e5484 to
677d862
Compare
| @@ -338,6 +338,23 @@ class AHashMap { | |||
| return nullptr; | |||
| } | |||
|
|
|||
| TValue &get_value_ref_or_add_default(const TKey &p_key, bool &r_was_added) { | |||
There was a problem hiding this comment.
I think the function you're looking for already exists (operator[]). Unless you really do need r_was_added? But I don't see you use this function at all. Did you add it by mistake?
There was a problem hiding this comment.
I need r_was_added here
There was a problem hiding this comment.
Please don't resolve comments prematurely.
Thanks for the links, for some reason GitHub didn't show me those on search. Still, i don't like duplicating methods. Figuring this out can wait until after this comes out of draft again though.
There was a problem hiding this comment.
You can use find + insert_new instead
Here you can:
AHashMap<StringName, Pair<Variant, bool>> &property_map = process_state->tree->property_map;
auto iter = property_cache.find(p_name);
if(!iter) {
iter = property_cache.insert_new(p_name, 0);
} else {
return property_map.get_by_index(iter->value).value.first;
}
There was a problem hiding this comment.
You can use
find + insert_newinsteadFor example
Here you can:
AHashMap<StringName, Pair<Variant, bool>> &property_map = process_state->tree->property_map; auto iter = property_cache.find(p_name); if(!iter) { iter = property_cache.insert_new(p_name, 0); } else { return property_map.get_by_index(iter->value).value.first; }
Problem is using find then insert_new hashes the key twice, while get_value_ref_or_add_default hashes it once.
|
Probably there's a lot of overlap with #101285. So I think there is worth for you discuss with @Nazarwadim. |
This PR will then have to wait until #101285 is merged in |
1bb25d9 to
d2d96fb
Compare
d2d96fb to
834f8fb
Compare
There was a problem hiding this comment.
Changing String->StringName and modifying the importer should generally be fine.
However, this PR's changes to AnimationTree completely break the behavior for nested AnimationNodeBlendTree instances. When I looked at the code, I noticed that the way AnimationNode instances are accessed has changed and since that concerns me, I tested it.
Before:
nested_atree.mp4
After this PR:
nested_atree_after.mp4
Therefore, I recommend removing the AnimationTree's change to how AnimationNode instances are accessed from this PR for now. And if you really want to implement it, it would be better to create a separate PR for that.
|
I'm going to split this into many smaller PRs to make it more manageable. |
|
I changed my mind |
Before this patch (2dd26a0).
FPS: high 192, low 185, median 188.5
MSPF: high 5.40, low 5.20, median 5.30
After this patch.(Benchmarks no longer accurate)FPS: high 234, low 229, median 231MSPF: high 4.36, low 4.27, median 4.32Here is the test project I used, I ran it from the command line with
--headless --print-fps, and the engine was compiled withscons compiledb=yes debug_symbols=yes production=yesMrp.zip
Changes:
List<>usages withLocalVector<>.&, to avoid copies.const &(to avoid copies and ref counting overhead)has_node()+get_node()changed toget_node_or_null()), same for HashMaps.SNAMEwhen needed.AnimationNodeBlendTreeEditornow queues (and merges)update_graphcalls at the end of the frame.AnimationMixer::PlaybackInfoare now passed byconst &instead of copied.AnimationNode::_blend_nodenow uses preallocatedStringNames.AnimationTreeis now safe to use withNode::process_thread_group