SIMDify ToLowerInvariant/ToUpperInvariant #78262

EgorBo · 2022-11-12T13:20:07Z

This PR:

- Adds more tests
- SIMDifies ToLowerInvariant/ToUpperInvariant for large inputs (for spans)

public IEnumerable<string> TestData()
{
    yield return "Ab"; // 2 chars: no SIMD path
    yield return "Abcd"; // 4 chars: no SIMD path
    yield return "Abcd123"; // 7 chars: no SIMD path
    yield return "Abcd1234"; // 8 chars: 1xV128
    yield return "Abcd1234Ab"; // 10 chars: 1xV128 + 2 trailing chars
    yield return "Abcd1234Abcd123"; // 15 chars: 1xV128 + 7 trailing chars
    yield return "Licensed to the NET Foundation"; // 30 chars
    yield return "We always welcome bug reports, API proposals and overall feedback. Here are a few tips on how you can make reporting your issue as effective as possible.";
}

private static readonly char[] OutputBuffer = new char[1024];

[Benchmark]
[ArgumentsSource(nameof(TestData))]
public void ToLowerInvariant(string str) => str.AsSpan().ToLowerInvariant(OutputBuffer.AsSpan());
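The "1xV128 + N trailing chars" comments in TestData follow from simple arithmetic: a 128-bit vector holds eight 16-bit chars. A minimal sketch of that arithmetic is below; `ChunkMath` and `Split` are hypothetical names used only for illustration, and the real implementation may handle the trailing chars differently than this simple split suggests.

```csharp
using System;
using System.Runtime.Intrinsics;

static class ChunkMath
{
    // Splits an input length into whole Vector128<ushort> chunks
    // plus the number of chars left over after the last full chunk.
    public static (int Vectors, int Trailing) Split(int length)
    {
        int charsPerVector = Vector128<ushort>.Count; // 128 bits / 16 bits per char = 8
        return (length / charsPerVector, length % charsPerVector);
    }
}
```

For example, `ChunkMath.Split(10)` gives one full vector and 2 trailing chars, matching the comment on "Abcd1234Ab", and `ChunkMath.Split(7)` gives zero full vectors, which is why 7-char inputs take no SIMD path.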

Method	Toolchain	str	Mean	Ratio
ToLowerInvariant	\Core_Root\corerun.exe	Ab	6.924 ns	1.03
ToLowerInvariant	\Core_Root_base\corerun.exe	Ab	6.778 ns	1.00

ToLowerInvariant	\Core_Root\corerun.exe	Abcd	7.623 ns	1.03
ToLowerInvariant	\Core_Root_base\corerun.exe	Abcd	7.379 ns	1.00

ToLowerInvariant	\Core_Root\corerun.exe	Abcd123	8.461 ns	0.97
ToLowerInvariant	\Core_Root_base\corerun.exe	Abcd123	8.683 ns	1.00

ToLowerInvariant	\Core_Root\corerun.exe	Abcd1234	6.275 ns	0.73
ToLowerInvariant	\Core_Root_base\corerun.exe	Abcd1234	8.580 ns	1.00

ToLowerInvariant	\Core_Root\corerun.exe	Abcd1234Ab	6.858 ns	0.75
ToLowerInvariant	\Core_Root_base\corerun.exe	Abcd1234Ab	9.165 ns	1.00

ToLowerInvariant	\Core_Root\corerun.exe	Abcd1234Abcd123	7.001 ns	0.64
ToLowerInvariant	\Core_Root_base\corerun.exe	Abcd1234Abcd123	10.871 ns	1.00

ToLowerInvariant	\Core_Root\corerun.exe	Licen(...)ation [30]	9.250 ns	0.61
ToLowerInvariant	\Core_Root_base\corerun.exe	Licen(...)ation [30]	15.027 ns	1.00

ToLowerInvariant	\Core_Root\corerun.exe	We a(...)ble. [153]	20.832 ns	0.42
ToLowerInvariant	\Core_Root_base\corerun.exe	We a(...)ble. [153]	49.970 ns	1.00

ToUpperInvariant() shows the same numbers.

dotnet-issue-labeler · 2022-11-12T13:20:12Z

I couldn't figure out the best area label to add to this PR. If you have write-permissions please help me learn by adding exactly one area label.

src/libraries/Common/tests/Tests/System/StringTests.cs

src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs

src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf16Utility.cs


stephentoub · 2022-11-12T22:55:30Z


What's the overhead when non-ASCII is encountered?


EgorBo · 2022-11-13T19:13:21Z

> What's the overhead when non-ASCII is encountered?

The worst case for this algorithm is a short (more than 7 chars) string that is entirely non-ASCII, but the cost of the SIMD "is ASCII" check is small.
Mixed ASCII/non-ASCII input can be faster with this PR, depending on how many ASCII characters we manage to process with SIMD before we encounter the first non-ASCII character.

public IEnumerable<string> TestData()
{
    // worst case: short full non-ASCII
    yield return "Привет Мир";
    yield return "ASCII-string with non-ASCII chars: ыц!";
}

private static readonly char[] OutputBuffer = new char[1024];

[Benchmark]
[ArgumentsSource(nameof(TestData))]
public void ToLowerInvariant(string str) => str.AsSpan().ToLowerInvariant(OutputBuffer.AsSpan());

Method	Toolchain	str	Mean
ToLowerInvariant	\Core_Root\corerun.exe	Привет Мир	52.19 ns
ToLowerInvariant	\Core_Root_base\corerun.exe	Привет Мир	48.52 ns

ToLowerInvariant	\Core_Root\corerun.exe	ASCII(...): ыц! [38]	32.36 ns
ToLowerInvariant	\Core_Root_base\corerun.exe	ASCII(...): ыц! [38]	34.92 ns

I've pushed a change that switches to the scalar path when a vector contains a non-ASCII char, because we still want to process as many ASCII chars as we can before falling back to the extremely slow NLS/ICU path.
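The strategy described above can be sketched roughly as follows. This is an illustrative sketch only, not the code merged in this PR: `AsciiLowerSketch` and `ToLowerAsciiPrefix` are hypothetical names, and it assumes .NET 7 or later for the cross-platform `Vector128` APIs. It lowercases whole vectors while they are pure ASCII, drops to a scalar loop as soon as a vector contains a non-ASCII char, and returns how many chars it handled so that a caller could pass the remainder to the culture-aware fallback.

```csharp
using System;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

static class AsciiLowerSketch
{
    // Hypothetical helper: lowercases the leading ASCII run of src into dst
    // and returns the number of chars written. The caller would hand
    // src.Slice(result) to the slow ICU/NLS path.
    public static int ToLowerAsciiPrefix(ReadOnlySpan<char> src, Span<char> dst)
    {
        if (dst.Length < src.Length)
            throw new ArgumentException("Destination is too short.", nameof(dst));

        int i = 0;
        if (Vector128.IsHardwareAccelerated)
        {
            ReadOnlySpan<ushort> s = MemoryMarshal.Cast<char, ushort>(src);
            Span<ushort> d = MemoryMarshal.Cast<char, ushort>(dst);
            Vector128<ushort> nonAsciiBits = Vector128.Create((ushort)0xFF80);
            Vector128<ushort> upperA = Vector128.Create((ushort)'A');
            Vector128<ushort> upperZ = Vector128.Create((ushort)'Z');
            Vector128<ushort> caseBit = Vector128.Create((ushort)0x20);

            for (; i + Vector128<ushort>.Count <= s.Length; i += Vector128<ushort>.Count)
            {
                Vector128<ushort> v = Vector128.Create(s.Slice(i, Vector128<ushort>.Count));

                // Any bit above 0x7F means at least one non-ASCII char:
                // leave this vector to the scalar loop below.
                if ((v & nonAsciiBits) != Vector128<ushort>.Zero)
                    break;

                // Set bit 0x20 only on lanes holding 'A'..'Z'.
                Vector128<ushort> isUpper =
                    Vector128.GreaterThanOrEqual(v, upperA) &
                    Vector128.LessThanOrEqual(v, upperZ);
                (v | (isUpper & caseBit)).CopyTo(d.Slice(i));
            }
        }

        // Scalar path: handles trailing chars and the ASCII prefix of a
        // vector that contained non-ASCII; stops at the first non-ASCII char.
        for (; i < src.Length && src[i] < 0x80; i++)
        {
            char c = src[i];
            dst[i] = (uint)(c - 'A') <= 'Z' - 'A' ? (char)(c | 0x20) : c;
        }
        return i;
    }
}
```

With this sketch, a fully non-ASCII input such as "Привет Мир" pays for one vector load and one mask test before returning 0, which corresponds to the worst case measured above, while a mostly ASCII input has its ASCII prefix lowercased before any fallback is needed.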

EgorBo · 2022-11-29T17:07:45Z

Improvements on x64:

[Perf] Alpine/x64: 5 Improvements on 11/24/2022 9:42:44 PM perf-autofiling-issues#10118
[Perf] Windows/x64: 4 Improvements on 11/24/2022 9:42:44 PM perf-autofiling-issues#10132
[Perf] Windows/x64: 5 Improvements on 11/24/2022 9:42:44 PM perf-autofiling-issues#10226

ghost assigned EgorBo Nov 12, 2022

am11 reviewed Nov 12, 2022

src/libraries/Common/tests/Tests/System/StringTests.cs (outdated, resolved)

am11 reviewed Nov 12, 2022

src/libraries/Common/tests/Tests/System/StringTests.cs (outdated, resolved)

gfoidl reviewed Nov 12, 2022

src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs (resolved)

src/libraries/System.Private.CoreLib/src/System/Text/Unicode/Utf16Utility.cs (resolved)

jkotas reviewed Nov 12, 2022

src/libraries/System.Private.CoreLib/src/System/Globalization/TextInfo.cs (outdated, resolved)

Span-only impl

be363c0

EgorBo force-pushed the simd-lowercaser branch from 43b97a4 to be363c0 Compare November 12, 2022 16:27

fix test failures

18811ca

EgorBo added 2 commits November 13, 2022 18:55

Merge branch 'main' of github.com:dotnet/runtime into simd-lowercaser

567ccf0

Use scalar path in case of non-ASCII: we want to process as many ASCII chars as we can before we switch to ICU/NLS

f6484ac

EgorBo added 2 commits November 13, 2022 20:17

Apply am11's suggestion

6d0764c

Merge branch 'main' of github.com:dotnet/runtime into simd-lowercaser

a98a186

build-analysis bot mentioned this pull request Nov 17, 2022

System.IO.Pipes.Tests.NamedPipeTest_Specific.ClientConnectAsync_Cancel_With_InfiniteTimeout failing on Libraries Test Run release coreclr windows x86 Release #69101

Open

stephentoub approved these changes Nov 24, 2022


EgorBo merged commit 4b6380d into dotnet:main Nov 24, 2022

EgorBo deleted the simd-lowercaser branch November 24, 2022 21:42

EgorBo mentioned this pull request Nov 29, 2022

[Perf] Windows/x64: 1 Regression on 11/24/2022 9:42:44 PM dotnet/perf-autofiling-issues#10123

Closed

dakersnar mentioned this pull request Dec 1, 2022

[Perf] Windows/x86: 4 Regressions on 11/24/2022 9:42:44 PM dotnet/perf-autofiling-issues#10178

Closed

ghost locked as resolved and limited conversation to collaborators Dec 29, 2022
