Optimize string single char contains calls. #100024

Ivorforce · 2024-12-04T21:02:55Z

Here's why it's important to optimize single-char calls:

contains is easier to compute for single-char strings, because no nested loops are needed. It may even be SIMD vectorized with a good and / or compiler.
There are a lot of callers to these functions. Every one of these benefits a bit from this optimization, lubricating the gears.

For reviewers

The PR is more simple than it seems.
Almost all changes are search and replace with regex and should be correct, like:

find contains\("(\\?.)"", replace with contains_char('$1')

The exception is ustring.cpp (and ustring.h), where the optimization is being dispatched.

Old Version

Parts of this PR were duplicate / alternative solutions to #92475. I have since consolidated this one to only feature the optimized contains calls. single-char replace calls will be discussed and finalized in #92475.

Old Description

Here's why it's important to optimize single-char calls:

both replace and contains are much easier to compute for single-char strings, because no nested loops are needed. It may even be SIMD vectorized with a good and / or compiler.
There are a lot of callers to these functions. Every one of these benefits a bit from this optimization, lubricating the gears.
replace with string arguments copies the data even if there are no other CowData co-owners. It also iterates the array twice. This can be exploited by ustring.cpp (although not from elsewhere, because string.replace makes a copy anyway).

This PR depends on (and includes) #100015.

I don't have a benchmark yet, but I believe the single-char replace and contains functions should be several times faster than the full string ones.

For reviewers

The PR is more simple than it seems.
Almost all changes are search and replace with regex and should be correct, like:

find replace\("(\\?.)", "(\\?.)", replace with replace('$1', '$2')
same for contains, similar for find

The exceptions are:

cowdata.h, where replace is implemented.
ustring.cpp where some algorithms are optimized to make better use of the function.

Ivorforce · 2024-12-04T21:26:56Z

I have made a simple benchmark (with auto lane-switching to the single-char string function disabled temporarily):

	String s = "Who is Frederic?Who is Frederic?Who is Frederic?";
	auto t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 1000; i ++) {
		String s1 = s.replace("e", "b");
	}
	auto t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count() << "us\n";
	t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 1000; i ++) {
		String s1 = s.replace('e', 'b');
	}
	t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count() << "us\n";
	t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 1000; i ++) {
		String s1 = s.replace("#", "b");
	}
	t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count() << "us\n";
	t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 1000; i ++) {
		String s1 = s.replace('#', 'b');
	}
	t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count() << "us\n";

This prints

253us
84us
24us
18us

So, just very roughly, the single-char implementation can be:

About 3x as fast when some characters are replaced (i.e. a copy is made).
About 1.3x as fast when no characters are replaced (i.e. no copy is made).

contains

A similar test for contains:

	String s = "Who is Frederic?Who is Frederic?Who is Frederic?";
	auto t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 100000; i ++) {
		int x = s.contains("e");
	}
	auto t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count() << "us\n";
	t0 = std::chrono::high_resolution_clock::now();
	for (int i = 0; i < 100000; i ++) {
		int x = s.contains('e');
	}
	t1 = std::chrono::high_resolution_clock::now();
	std::cout << std::chrono::duration_cast<std::chrono::microseconds>(t1 - t0).count() << "us\n";

prints

622us
501us

So, it also benefits from single-char optimization, though not as much (1.24x). But at the same time, its optimization was already implemented, so I'm only contributing by relaying a few calls to it instead of the char * based variant.

AThousandShips

Looks good to me

core/string/ustring.cpp

hpvb

I'd much rather see a String::contains(const char_t), or equivalent overload. So future developers can't make the mistake. It also will make the change local to ustring.cpp/h, or a (const char[2]) overload, I guess. perhaps both? :)

Ivorforce · 2024-12-07T21:16:04Z

I'd much rather see a String::contains(const char_t), or equivalent overload. So future developers can't make the mistake.

I had it as an String::contains overload first, but I was requested above to rename it to contains_char because similar single-char functions are appended with _char. I believe the main reason is that it is easier to bind to scripting languages this way?

It also will make the change local to ustring.cpp/h

How would that be possible? Callers that currently call with contains(char *) will have to be adjusted to use the single-char function with either the contains name and the contains_char name. The only way to make it consistently select the single-char implementation at compile time would be to expose the strlen(x) == 1 check in the header file, which I'm not sure if it is a good idea to do for every single function. Or idk, maybe it is - it would bloat the header file somewhat, requiring inlining by String callers, but it's a fairly simple check.

, or a (const char[2]) overload, I guess. perhaps both?

I experimented with this earlier, but this logic would fail for contains({ 'a', 'b' }) which is not appended with NULL. Yeah, it's a bit annoying.

hpvb · 2024-12-07T21:28:45Z

I'd much rather see a String::contains(const char_t), or equivalent overload. So future developers can't make the mistake.

I had it as an String::contains overload first, but I was requested above to rename it to contains_char because similar single-char functions are appended with _char. I believe the main reason is that it is easier to bind to scripting languages this way?

The speed improvements of contains_char would have to be massive for it to matter in gdscript, but perhaps I'm being too pessimistic :P

It also will make the change local to ustring.cpp/h

How would that be possible? Callers that currently call with contains(char *) will have to be adjusted to use the single-char function with either the contains name and the contains_char name. The only way to make it consistently select the single-char implementation at compile time would be to expose the strlen(x) == 1 check in the header file, which I'm not sure if it is a good idea to do for every single function.

If we do a const char[2] overload, we can just assume size=1, right?

, or a (const char[2]) overload, I guess. perhaps both?

I experimented with this earlier, but this logic would fail for contains({ 'a', 'b' }) which is not appended with NULL. Yeah, it's a bit annoying.

Wouldn't that fail now also? If there's no null it'd just crash now, no? as in, that is already an error?

hpvb

If we can't get it merged with an overload lets just keep it like this. Maybe we can do a pass over all of the _char functions and just remove them in one fell swoop eventually.

Ivorforce · 2024-12-07T21:48:50Z

The speed improvements of contains_char would have to be massive for it to matter in gdscript, but perhaps I'm being too pessimistic :P

GDScript may not benefit, but C# and GDExtension also use the scripting interface (though I also don't know what that has to do with the c++ exposed name).

If we do a const char[2] overload, we can just assume size=1, right?

, or a (const char[2]) overload, I guess. perhaps both?

I experimented with this earlier, but this logic would fail for contains({ 'a', 'b' }) which is not appended with NULL. Yeah, it's a bit annoying.

Wouldn't that fail now also? If there's no null it'd just crash now, no? as in, that is already an error?

This turns out to be surprisingly difficult to answer, lol.

Here's a test I just ran to answer this question:

static const char s[] = { 'a', 'b' };

template <size_t l>
size_t test(const char (&s)[l]) {
	return l;
}

__attribute__((noinline)) size_t test2(const char *s) {
	return strlen(s);
}

int main()
{
	std::cout << strlen(s) << std::endl;
	std::cout << test(s) << std::endl;
	std::cout << test2(s) << std::endl;
}

It prints

2
2
24

That's right, clang cheats with its strlen implementation, assuming a NULL terminator at compile time when it should probably complain instead.

In any case, calling String.contains({ 'a', 'b' }) would yield unexpected behavior. If evaluated at compile time, it would return String.contains(a). If evaluated at runtime, it might even SIGSEV. I guess it would be fair to document the behavior of this function. This is one of the reasons I used strlen instead of len in #99806 even for the arrays known at compile time - it failed unit tests otherwise.

Ivorforce · 2024-12-07T21:51:27Z

If we can't get it merged with an overload lets just keep it like this. Maybe we can do a pass over all of the _char functions and just remove them in one fell swoop eventually.

That's fair; for the record, I also think a contains overload without a _char suffix is more appropriate. But it's something we could address later, it's not a dealbreaker.

hpvb · 2024-12-07T21:59:24Z

@Ivorforce surely calling strlen() on a char array without a null terminator is already UB. I think you're just digging into how surprising the UB is. :P strlen({'a', 'b'}); might kill your cat and you don't get to complain about it.

kiroxas

Straightforward one. Nice !

Repiteo

I'm not at all convinced that the _char append is actually necessary, but I'm not interested in bikeshedding this PR when there's already obvious performance gains. I'll just echo hpvb's sentiments on eventually doing a pass which removes them entirely & call it a day

Repiteo · 2024-12-09T20:36:54Z

Thanks!

AThousandShips · 2024-12-17T23:05:10Z

modules/text_server_adv/thorvg_svg_in_ot.cpp

@@ -97,7 +97,7 @@ FT_Error tvg_svg_in_ot_preset_slot(FT_GlyphSlot p_slot, FT_Bool p_cache, FT_Poin
 			if (parser->has_attribute("id")) {
 				const String &gl_name = parser->get_named_attribute_value("id");
 				if (gl_name.begins_with("glyph")) {
-					int dot_pos = gl_name.find(".");
+					int dot_pos = gl_name.find_char('.');


Missed that you made these changes, will make a PR fixing this, these are not valid because this is an unexposed method, see:

Use (r)find_char instead of (r)find for single characters #99328

Oops, sorry and thanks!

Should have noticed as I did my own sweep with these before so it's on me!

Ivorforce requested review from a team as code owners December 4, 2024 21:02

Ivorforce force-pushed the optimize-string-single-char branch from 6a3f79f to 5f5c93c Compare December 4, 2024 21:06

Ivorforce mentioned this pull request Dec 4, 2024

Use (r)find_char instead of (r)find for single characters #99328

Merged

Repiteo added the enhancement label Dec 4, 2024

Repiteo requested review from a team and removed request for a team December 6, 2024 17:33

AThousandShips approved these changes Dec 6, 2024

View reviewed changes

core/string/ustring.cpp Outdated Show resolved Hide resolved

core/string/ustring.cpp Outdated Show resolved Hide resolved

Add contains_char() for single-character 'contains' calls.

b5c31eb

Ivorforce force-pushed the optimize-string-single-char branch from c3ddfd4 to b5c31eb Compare December 6, 2024 19:23

adamscott modified the milestones: 4.x, 4.4 Dec 6, 2024

hpvb requested changes Dec 7, 2024

View reviewed changes

hpvb approved these changes Dec 7, 2024

View reviewed changes

kiroxas approved these changes Dec 8, 2024

View reviewed changes

Repiteo approved these changes Dec 9, 2024

View reviewed changes

Repiteo merged commit a607bca into godotengine:master Dec 9, 2024
20 checks passed

Ivorforce deleted the optimize-string-single-char branch December 9, 2024 20:38

AThousandShips reviewed Dec 17, 2024

View reviewed changes

AThousandShips mentioned this pull request Dec 18, 2024

[TextServer] Fix use of find_char in text servers #100562

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Optimize string single char contains calls. #100024

Optimize string single char contains calls. #100024

Ivorforce commented Dec 4, 2024 •

edited

Loading

Ivorforce commented Dec 4, 2024 •

edited

Loading

AThousandShips left a comment

hpvb left a comment •

edited

Loading

Ivorforce commented Dec 7, 2024 •

edited

Loading

hpvb commented Dec 7, 2024

hpvb left a comment

Ivorforce commented Dec 7, 2024 •

edited

Loading

Ivorforce commented Dec 7, 2024

hpvb commented Dec 7, 2024

kiroxas left a comment

Repiteo left a comment

Repiteo commented Dec 9, 2024

AThousandShips Dec 17, 2024

Ivorforce Dec 17, 2024

AThousandShips Dec 17, 2024

Optimize string single char contains calls. #100024

Optimize string single char contains calls. #100024

Conversation

Ivorforce commented Dec 4, 2024 • edited Loading

For reviewers

Old Version

For reviewers

Ivorforce commented Dec 4, 2024 • edited Loading

contains

AThousandShips left a comment

Choose a reason for hiding this comment

hpvb left a comment • edited Loading

Choose a reason for hiding this comment

Ivorforce commented Dec 7, 2024 • edited Loading

hpvb commented Dec 7, 2024

hpvb left a comment

Choose a reason for hiding this comment

Ivorforce commented Dec 7, 2024 • edited Loading

Ivorforce commented Dec 7, 2024

hpvb commented Dec 7, 2024

kiroxas left a comment

Choose a reason for hiding this comment

Repiteo left a comment

Choose a reason for hiding this comment

Repiteo commented Dec 9, 2024

AThousandShips Dec 17, 2024

Choose a reason for hiding this comment

Ivorforce Dec 17, 2024

Choose a reason for hiding this comment

AThousandShips Dec 17, 2024

Choose a reason for hiding this comment

Ivorforce commented Dec 4, 2024 •

edited

Loading

Ivorforce commented Dec 4, 2024 •

edited

Loading

hpvb left a comment •

edited

Loading

Ivorforce commented Dec 7, 2024 •

edited

Loading

Ivorforce commented Dec 7, 2024 •

edited

Loading