Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize / refactor CowData, combining resize and fork to avoid unnecessary reallocations. #100619

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

Ivorforce
Copy link
Contributor

@Ivorforce Ivorforce commented Dec 19, 2024

This PR optimizes CowData resize operations. This affects all non-trivial resizes for String and Vector (e.g. Packed arrays), including concatenation.

This PR includes #100694, because without it, leaks are occurring due to cyclic references that the previous implementation luckily avoided through unnecessary forking.
This PR also supersedes #100672.

Reasoning

The current implementation of resize first forks if there are multiple owners, and then resizes the freshly forked array.
In the case that the needed array is larger than the previous array, time is wasted because a new location may have to be found for the larger buffer on the subsequent realloc, and the data has to be copied over again.
In the case that the needed array is smaller than the previous array, time is wasted looking for a larger buffer than necessary, and constructing (and subsequently destructing) more elements than necessary.

Benchmark

I've benchmarked plain String concatenation:

	String s1 = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";
	String s2 = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat. Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.";

	auto t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 10000000; i ++) {
		// Test
		String s3 = s1 + s2;
	}
	auto t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::milliseconds>(t1 - t0).count() << "ms\n";

This yields:

As such, it can be concluded that String concatenation through + can be 1.6x as fast as before. The PR is likely to affect and optimize many more functions involving String or Vector.

@Ivorforce Ivorforce requested a review from a team as a code owner December 19, 2024 22:06
@Ivorforce Ivorforce force-pushed the cowdata-resize-direct branch 3 times, most recently from 749d0a5 to 3a4c1f9 Compare December 19, 2024 23:21
@Ivorforce
Copy link
Contributor Author

Ivorforce commented Dec 20, 2024

Hrm. I'm kinda stumped as to what's leaking. I think it's foolproof, in any case foolproof if the CowData itself is not accessed from other threads (though access of refcount should be safe cross-thread). And besides, the previous implementation was also not safe from forking a 1-ref'd CowData from another thread, I think (through realloc). Either I'm missing some odd edge case or race condition, or something fishy is going on.

@Chaosus Chaosus added this to the 4.x milestone Dec 20, 2024
@Ivorforce Ivorforce force-pushed the cowdata-resize-direct branch 2 times, most recently from c45c762 to d69cb3b Compare December 20, 2024 20:42
@Ivorforce Ivorforce requested review from a team as code owners December 20, 2024 20:42
@Ivorforce Ivorforce force-pushed the cowdata-resize-direct branch from d69cb3b to 84b402c Compare December 20, 2024 20:51
@Ivorforce
Copy link
Contributor Author

Oh my god... This leak is costing me and arm and a leg to debug. Here is a failing log for reference: https://github.com/godotengine/godot/actions/runs/12435300159/job/34720869143

I have finally tracked it down to the GDScript tests only. Normal operation does not appear to cause leaks. At this point, I am reasonably sure it's not my implementation that's causing it, but some kind of circular reference. Though it's hard to be completely certain.
If it is a circular reference, the current implementation of CowData appears to 'avoid' it somehow, but it's really hard to track down and I have, so far, not been able to re-create the bug / fix in my refactor by attempting to recreate CowData behavior in that class.

Running the unit tests locally without out_of_order_external.gd avoids the leak. Online, I had to disable all GDScript unit tests, because more than just one were causing the leak. But disabling them does avoid any leaks from appearing: https://github.com/Ivorforce/godot/actions/runs/12438440843

@hpvb
Copy link
Member

hpvb commented Dec 20, 2024

image

I'll look at this PR over the weekend. I have a suspicion but it'll take some single stepping, and Friday night is not the time for single stepping through CowData.

@Ivorforce
Copy link
Contributor Author

I have an update. Turns out one of the 'preconditions' I was imagining to be true was false all along - refcount can be 0. Who knew.

This is true on master as well, as this fails unit tests: master...Ivorforce:godot:cowdata-refcount-0

After meticulously re-creating my PR step by step, I'm pretty sure that's what's causing the issues. I don't know how it's possible that refcount is 0 at any time. At least in this PR, it's initialized with 1 and decremented and incremented in very controlled ways, and I recently checked nobody else is interfering. This is a really strange one...

@Ivorforce
Copy link
Contributor Author

Ivorforce commented Dec 21, 2024

I think I finally fixed it.

So, GDScript does have cyclic references, and it calls a Vector destructor, and then halfway through the destruction, the array is referenced again through some pointer. We should probably figure out that cyclic reference, because it really smells. But it's out of scope for me to address it (for posterity, involved functions are logged here - one is constants in GDScriptFunction).

Instead, I managed to handle it more graciously through #100694. With the change, CowData invalidates itself before destructing the array, and graciously handles being referenced, reallocated, or re-deallocated halfway through its own destruction. We'll see if the remote leak sanitizers agree, but this fixed the leak on my end.

@Ivorforce Ivorforce force-pushed the cowdata-resize-direct branch from 69c3f06 to 2617af3 Compare December 25, 2024 10:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants