Optimize `Node.find_children` by 24% #112896

Joy-less · 2025-11-17T23:48:52Z

I optimized the performance of recursive find_children by 24%. Currently, it creates a new array for each descendant, only to append each descendant to the original array. This pull request makes it use the same array, using a lambda function.
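To illustrate the difference, here is a minimal sketch on a toy tree. `ToyNode`, `collect_with_temporaries`, and `collect_into` are hypothetical names, not engine code, and the sketch uses a plain helper function rather than the lambda used in this pull request.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a scene node; this is not Godot's Node class.
struct ToyNode {
	std::string name;
	std::vector<ToyNode *> children;
};

// Pattern resembling the old behavior: every recursive call builds and
// returns its own array, which the caller then copies into its result.
std::vector<ToyNode *> collect_with_temporaries(const ToyNode *p_node) {
	assert(p_node != nullptr);
	std::vector<ToyNode *> ret;
	for (ToyNode *child : p_node->children) {
		ret.push_back(child);
		std::vector<ToyNode *> sub = collect_with_temporaries(child);
		ret.insert(ret.end(), sub.begin(), sub.end());
	}
	return ret;
}

// Pattern resembling the new behavior: a single output array is passed
// down by reference, so no per-descendant temporaries are created.
void collect_into(const ToyNode *p_node, std::vector<ToyNode *> &r_out) {
	assert(p_node != nullptr);
	for (ToyNode *child : p_node->children) {
		r_out.push_back(child);
		collect_into(child, r_out);
	}
}
```

Both functions visit descendants in the same pre-order; only the number of intermediate arrays differs.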

Benchmark:

extends Node3D

func _ready()->void:
	while true:
		bench_old()
		bench_new()
		await get_tree().process_frame

func bench_old()->void:
	var start:float = Time.get_unix_time_from_system()
	var results:Array[Node]
	for i:int in 100_000:
		results = find_children_old("*")
	var end:float = Time.get_unix_time_from_system()
	print("old: ", (end - start) * 1000, "ms")
	results.clear()

func bench_new()->void:
	var start:float = Time.get_unix_time_from_system()
	var results:Array[Node]
	for i:int in 100_000:
		results = find_children("*")
	var end:float = Time.get_unix_time_from_system()
	print("new: ", (end - start) * 1000, "ms")
	results.clear()

Result:

old: 839.999914169312ms
new: 636.000156402588ms
old: 832.000017166138ms
new: 635.999917984009ms
old: 833.000183105469ms
new: 638.000011444092ms

This means each benchmark call used to take 0.0084ms but now takes 0.00636ms.
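For reference, a small sketch of the arithmetic behind these figures, using rounded totals of 840 ms and 636 ms per 100,000 calls. The helper names are hypothetical.

```cpp
#include <cassert>

// Average cost of a single call, given the total time for a batch of calls.
double per_call_ms(double p_total_ms, int p_calls) {
	assert(p_calls > 0);
	return p_total_ms / p_calls;
}

// Percentage of time saved when going from the old total to the new total.
double percent_saved(double p_old_ms, double p_new_ms) {
	assert(p_old_ms > 0.0);
	return (p_old_ms - p_new_ms) / p_old_ms * 100.0;
}
```

With these inputs, `per_call_ms(840.0, 100000)` is about 0.0084 and `percent_saved(840.0, 636.0)` is roughly 24.3, which is where the 24% in the title comes from.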

The benchmarks were run in-editor with 41 descendants that I created and scattered randomly to emulate a real scene tree.

Ivorforce

Please note that using lambdas in Godot code is discouraged, as per our guidelines.

It should be possible to make this function completely iterative (for example, by using a single LocalVector<Node *> todo). I'm not sure how this would compare performance wise to the lambda, but I expect it to be faster than the original implementation. A comparison would be nice.
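A rough sketch of what such a single-stack traversal could look like, assuming a toy node type. `ToyNode` and `collect_iterative` are hypothetical, and `std::vector` stands in for `LocalVector` here; name/type matching and the owner check are left out.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a scene node; this is not Godot's Node class.
struct ToyNode {
	int id = 0;
	std::vector<ToyNode *> children;
};

// Collects all descendants of p_root in pre-order without recursion,
// using one explicit stack of nodes that still need to be visited.
std::vector<ToyNode *> collect_iterative(ToyNode *p_root) {
	assert(p_root != nullptr);
	std::vector<ToyNode *> ret;
	std::vector<ToyNode *> todo;
	todo.push_back(p_root);

	while (!todo.empty()) {
		ToyNode *current = todo.back();
		todo.pop_back();

		// The root itself is not a descendant, so it is not reported.
		if (current != p_root) {
			ret.push_back(current);
		}

		// Push children in reverse so the first child is popped first,
		// which keeps the same order as the recursive version.
		for (auto it = current->children.rbegin(); it != current->children.rend(); ++it) {
			todo.push_back(*it);
		}
	}
	return ret;
}
```

The stack grows with the breadth of the tree rather than the call depth, so whether it beats the lambda version would still need to be measured.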

Joy-less · 2025-11-18T02:47:31Z

@Ivorforce I attempted to replace it with an iterative solution, but that actually turned out to be slightly slower than my recursive lambda solution (65ms -> 69ms). I'm not an expert in C++ unfortunately, so I'm not sure why it would be slower.

Iterative solution:

TypedArray<Node> Node::find_children(const String &p_pattern, const String &p_type, bool p_recursive, bool p_owned) const {
	ERR_THREAD_GUARD_V(TypedArray<Node>());

	TypedArray<Node> ret;
	ERR_FAIL_COND_V(p_pattern.is_empty() && p_type.is_empty(), ret);

	LocalVector<Pair<const Node *, uint32_t>> to_visit;
	to_visit.push_back(Pair<const Node *, uint32_t>(this, 0u));

	while (!to_visit.is_empty()) {
		Pair<const Node *, uint32_t> &check = to_visit[to_visit.size() - 1];
		const Node *current_node = check.first;
		uint32_t &child_index = check.second;

		if (child_index == 0) {
			current_node->_update_children_cache();
		}

		Node *const *child_ptr = current_node->data.children_cache.ptr();
		uint32_t child_count = current_node->data.children_cache.size();

		bool pushed_child = false;

		while (child_index < child_count) {
			Node *child = child_ptr[child_index];
			child_index++;

			if (p_owned && !child->data.owner) {
				continue;
			}

			if (p_pattern.is_empty() || child->data.name.operator String().match(p_pattern)) {
				if (p_type.is_empty() || child->is_class(p_type)) {
					ret.append(child);
				} else if (child->get_script_instance()) {
					Ref<Script> scr = child->get_script_instance()->get_script();
					while (scr.is_valid()) {
						if ((ScriptServer::is_global_class(p_type) && ScriptServer::get_global_class_path(p_type) == scr->get_path()) || p_type == scr->get_path()) {
							ret.append(child);
							break;
						}

						scr = scr->get_base_script();
					}
				}
			}

			if (p_recursive) {
				to_visit.push_back(Pair<const Node *, uint32_t>(child, 0u));
				pushed_child = true;
				break;
			}
		}

		if (!pushed_child) {
			to_visit.resize(to_visit.size() - 1);
		}
	}

	return ret;
}

Ivorforce · 2025-11-18T09:55:00Z

Ok, the regression is probably overhead from to_visit.

I've got another idea, one that doesn't involve to_visit. Instead, you could use the 'walk' pattern, by using the node.get_index() function.

Quick explainer:
Essentially, you set current_node to this. On each iteration step, you add the current node to ret (if it matches), and then set current_node to its first child. When there are no children, find the next sibling, like parent.children_cache[current_node.get_index() + 1]. If there is no next sibling (the next index == parent.children_cache.size()), walk to the next sibling of the parent instead (and repeat the test). If current_node reaches back to this, return ret.

I can't guarantee this would be noticeably faster than your lambda solution (needs another test), but it's worth a try. Let me know if the proposed solution makes sense to you.
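A minimal sketch of that walk on a toy tree, to make the steps concrete. `ToyNode`, `add_child`, and `collect_by_walking` are hypothetical names; the stored `index` field plays the role of get_index(), and the matching and owner checks are omitted.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a scene node; this is not Godot's Node class.
struct ToyNode {
	int id = 0;
	ToyNode *parent = nullptr;
	std::size_t index = 0; // Position inside parent->children.
	std::vector<ToyNode *> children;
};

// Attaches p_child to p_parent and records its index, like a scene tree would.
void add_child(ToyNode &p_parent, ToyNode &p_child) {
	p_child.parent = &p_parent;
	p_child.index = p_parent.children.size();
	p_parent.children.push_back(&p_child);
}

// Collects all descendants of p_root in pre-order with no stack and no
// recursion, using only parent pointers and sibling indices.
std::vector<ToyNode *> collect_by_walking(ToyNode *p_root) {
	assert(p_root != nullptr);
	std::vector<ToyNode *> ret;
	ToyNode *current = p_root;

	while (true) {
		if (!current->children.empty()) {
			// Go down to the first child.
			current = current->children[0];
		} else {
			// Walk up while the current node is the last sibling.
			while (current != p_root && current->index + 1 == current->parent->children.size()) {
				current = current->parent;
			}
			if (current == p_root) {
				break; // Back at the start: the whole subtree was visited.
			}
			// Go to the next sibling.
			current = current->parent->children[current->index + 1];
		}
		ret.push_back(current);
	}
	return ret;
}
```

The check against `p_root` comes before any access to `parent`, so the walk never leaves the subtree even when `p_root` has a parent of its own.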

Joy-less · 2025-11-18T14:57:22Z

Thank you, I was able to get identical (if not slightly better) performance using the new iterative approach.

first: 164.000034332275ms // original approach
second: 64.000129699707ms // lambda approach
third: 62.999963760376ms // new iterative approach

Ivorforce · 2025-11-19T21:14:25Z

scene/main/node.cpp

+					if (current_node->data.index + 1 < (int)siblings.size()) {
+						// Go to next sibling
+						current_node = siblings[current_node->data.index + 1];
+						break;


Suggested change: replace `break;` with `continue;`.

I don't understand this suggestion.

Co-authored-by: Lukas Tenbrink <[email protected]>

Joy-less · 2025-11-19T21:55:08Z

@Ivorforce Sorry, your suggestions became hard to follow with the commits so I marked them as resolved. Please re-add the suggestions if there are more issues.

Ivorforce · 2025-11-19T22:06:56Z

@Joy-less Please re-open the comments, I don't want to review the same code again with the same comments.
Viewing them on the "Conversation" page shows the old context of when the comments were written, which is useful especially for comments on "Outdated" code. You can also look at the revision the code was at when I submitted my review (63dd5af) for additional context.

Joy-less · 2025-11-19T23:57:29Z

@Ivorforce I've now tested both non-recursive and recursive and they work fine with the new commits.

Joy-less requested a review from a team as a code owner November 17, 2025 23:48

Joy-less force-pushed the optimize-find_children branch from c4fe3d9 to 61d470b November 17, 2025 23:53

Optimize Node.find_children

b6d30a4

Joy-less force-pushed the optimize-find_children branch from 61d470b to b6d30a4 November 18, 2025 00:11

Ivorforce reviewed Nov 18, 2025

Fix update children cache not called

c3d3188

Mickeon added enhancement topic:core performance labels Nov 18, 2025

Mickeon added this to the 4.6 milestone Nov 18, 2025

Mickeon requested a review from Ivorforce November 18, 2025 09:39

Use iterative approach

351ea12

Make static checks happy

1d28573

Joy-less force-pushed the optimize-find_children branch from 35cfb9a to 1d28573 November 18, 2025 15:01

Fix comparison error

63dd5af

Repiteo modified the milestones: 4.6, 4.x Nov 19, 2025

Ivorforce reviewed Nov 19, 2025

Joy-less and others added 6 commits November 19, 2025 21:44

Use early return

49a22ac

Co-authored-by: Lukas Tenbrink <[email protected]>

Fix early return issues

9fe3c97

Use more early returns

3743f4d

Add comment explaining tree walk

f684ab3

Update comment

120ccea

Remove unnecessary else block

dbafafc

Fix non-recursive loop

75b24d0

Joy-less added 3 commits November 19, 2025 22:56

Do parent check at up-walked position

a521eee

Move condition to body

75f644a

Fix owner check

921ceb8
