SIMDify ToLowerInvariant/ToUpperInvariant #78262
I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.
src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs
src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf16Utility.cs
43b97a4 to be363c0
What's the overhead when non-ASCII is encountered?
…I chars as we can before we switch to ICU/NLS
The worst case for this algorithm is a short (>7 chars) string that is entirely non-ASCII, but the cost of the SIMD "is ASCII" check is not too big.

public IEnumerable<string> TestData()
{
    // worst case: short, entirely non-ASCII
    yield return "Привет Мир";
    // ASCII prefix followed by a few non-ASCII chars
    yield return "ASCII-string with non-ASCII chars: ыц!";
}

// Shared destination buffer, large enough for every test string
private static readonly char[] OutputBuffer = new char[1024];

[Benchmark]
[ArgumentsSource(nameof(TestData))]
public void ToLowerInvariant(string str) => str.AsSpan().ToLowerInvariant(OutputBuffer.AsSpan());
I've pushed a change that falls back to the scalar path when a vector contains non-ASCII, because we still want to process as many ASCII chars as we can before we switch to the extremely slow NLS/ICU fallback.
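For readers following along, here is a minimal sketch of the idea described above, not the code from this PR. The class `AsciiLowerSketch` and the helper `LowerAsciiPrefix` are hypothetical names. It assumes .NET 7 or later for the cross-platform `Vector128` APIs. It lowercases whole vectors while they are pure ASCII, finishes char by char in a scalar loop, and returns the length of the converted ASCII prefix so a caller could pass the remainder to the culture-aware fallback.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

public static class AsciiLowerSketch
{
    // Hypothetical helper (not the PR's implementation): lowercases the leading
    // ASCII run of `source` into `destination` and returns how many chars were
    // converted. The caller would hand the remainder to the ICU/NLS fallback.
    public static int LowerAsciiPrefix(ReadOnlySpan<char> source, Span<char> destination)
    {
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too short.", nameof(destination));

        ref ushort src = ref MemoryMarshal.GetReference(MemoryMarshal.Cast<char, ushort>(source));
        ref ushort dst = ref MemoryMarshal.GetReference(MemoryMarshal.Cast<char, ushort>(destination));
        int i = 0;

        if (Vector128.IsHardwareAccelerated)
        {
            Vector128<ushort> nonAsciiMask = Vector128.Create((ushort)0xFF80);
            Vector128<ushort> upperA = Vector128.Create((ushort)'A');
            Vector128<ushort> upperZ = Vector128.Create((ushort)'Z');
            Vector128<ushort> caseBit = Vector128.Create((ushort)0x20);

            // Process 8 chars at a time while a full vector is available.
            while (i <= source.Length - Vector128<ushort>.Count)
            {
                Vector128<ushort> vec = Vector128.LoadUnsafe(ref src, (nuint)i);

                // Any bit above 0x7F means this vector holds a non-ASCII char:
                // leave the vector loop and let the scalar loop find the exact position.
                if ((vec & nonAsciiMask) != Vector128<ushort>.Zero)
                    break;

                // Set bit 0x20 only in lanes that hold 'A'..'Z'.
                Vector128<ushort> isUpper =
                    Vector128.GreaterThanOrEqual(vec, upperA) & Vector128.LessThanOrEqual(vec, upperZ);
                (vec | (isUpper & caseBit)).StoreUnsafe(ref dst, (nuint)i);

                i += Vector128<ushort>.Count;
            }
        }

        // Scalar path: handles the tail and the ASCII chars that precede the
        // first non-ASCII char inside a rejected vector.
        for (; i < source.Length; i++)
        {
            char c = source[i];
            if (c > 0x7F)
                break;
            destination[i] = (uint)(c - 'A') <= 'Z' - 'A' ? (char)(c | 0x20) : c;
        }

        return i;
    }
}
```

With the two benchmark inputs above, the entirely non-ASCII string bails out on the first vector check and the scalar loop stops at its first char, while the mixed string has its ASCII prefix converted before any fallback would be needed.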
This PR does:
ToUpperInvariant() shows the same numbers.