-
Notifications
You must be signed in to change notification settings - Fork 5.2k
Rewrite GetIndexOfFirstNonAsciiByte #104503
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
- Deduplicate SIMD paths - there is no more need for that - Remove at least one branch from short length path - Use up to 128x4/256x2 SIMD unrolling
This comment was marked as outdated.
This comment was marked as outdated.
Benchmark results on Arm64
|
Benchmark results on Amd
|
…512b loop, mitigate Dynamic PGO impact
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
1 similar comment
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
This comment was marked as outdated.
Benchmark results on Arm64
|
Benchmark results on Amd
|
This comment was marked as outdated.
This comment was marked as outdated.
Benchmark results on Intel
|
src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs
Outdated
Show resolved
Hide resolved
src/libraries/System.Private.CoreLib/src/System/Text/Ascii.Utility.cs
Outdated
Show resolved
Hide resolved
|
@EgorBot -amd -arm64 using System.Text.Unicode;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Running;
// Let's try again
BenchmarkRunner.Run<AsciiBench>(args: args);
public class AsciiBench
{
[ParamsSource(nameof(Lengths))]
public int Length;
public IEnumerable<int> Lengths => [..Enumerable.Range(1, 8), 12, 16, 25, 50, 70, 200, 512, 16384];
byte[] ascii = [];
[GlobalSetup]
public void Setup()
{
ascii = new byte[Length];
ascii.AsSpan().Fill((byte)'f');
}
[Benchmark]
public bool ValidateAscii()
{
return Utf8.IsValid(ascii);
}
} |
Benchmark results on Intel
|
|
CC @jeffhandley As an area owner, PTAL when you have time. Thanks! |
|
@tannergooding will you continue reviewing this change? |
|
I chatted with @tannergooding about this. Right now, we're not certain if we want to take this because some folks are looking into different strategies for the use of unsafe code across the libraries. We expect that to be decided in 10.0 and that should help us determine if we move forward with this PR or need to it change. |
Any update if it is already decided if we'll go with different strategies for the use of unsafe code? Or do you know why can advise about that? |
|
@tarekgh it's been quite a while and, looking back, I think a lot of this is unnecessary complexity, I'm ok with just closing this PR. A better change would be simply replacing this entire path with What do you think? |
Description
The purpose of this change is to simplify
GetIndexOfFirstNonAsciiByteby servicing all targets with a single path, which is now possible, and achieving optimal throughput on longer lengths.Per "Validating gigabytes of Unicode strings per second… in C#?", it appears that we leave a lot of performance on the table for ARM64 target.
Also closes #89924
Analysis
After addressing feedback, this PR now improves maximum non-ASCII byte search throughput by 3x for AVX2 (Zen 3) and AdvSimd (Neoverse N1), and 2x for AVX512 (Ice Lake). Short lengths improve by up to 25%, and the worst case regression is on AMD only for <5% for lengths ~65-200 (under 1ns impact).
Benchmark
Results
Ice Lake, Zen 3 and N1: #104503 (comment)
M1 Pro