Switch from direct intrinsics usage to Vector/Vector64/Vector128/Vector256 #64451
Comments
Tagging subscribers to this area: @dotnet/area-system-runtime-intrinsics
The
I listed everything directly accessing intrinsics. As noted, not everything here may need changes ("or on a case-by-case basis deciding they can't or shouldn't be updated").
I know, just told it to strike them as false positives.
Thanks for logging this @stephentoub! I've checked off a few entries above. Most notably, I've checked off the following as they represent the public xplat API (and as @teo-tsirpanis indicated are generally the scalar version, not the SIMD):
Since these are the "public xplat API" for those, we could potentially recognize them as JIT helper intrinsics to ensure "optimal codegen" and not have to rely on inlining, etc.; but that's probably something worth looking at independently.
Are you looking at
@stephentoub
That was addressed by #64899. |
Still more work to be done here for .NET 9+, so moving to future. |
We have a bunch of implementations in the core libraries that make direct use of intrinsics. .NET 7 includes cross-platform support via Vector/Vector64/Vector128/Vector256, and we prefer to use those whenever possible over using the intrinsics directly. This issue tracks converting all of our implementations over, or on a case-by-case basis deciding they can't or shouldn't be updated:
ASCIIUtility.ContainsNonAsciiByte(Vector128<byte>)
ASCIIUtility.GetIndexOfFirstNonAsciiByte(byte*, nuint)
ASCIIUtility.GetIndexOfFirstNonAsciiChar(char*, nuint)
ASCIIUtility.NarrowUtf16ToAscii(char*, byte*, nuint) (try to port ASCIIUtility.NarrowUtf16ToAscii to x-plat intrinsics #73064)
ASCIIUtility.WidenAsciiToUtf16(byte*, char*, nuint) (try to port ASCIIUtility.WidenAsciiToUtf16 to x-plat intrinsics #73055)
Base64.DecodeFromUtf8(ReadOnlySpan<byte>, Span<byte>, out int, out int, bool)
Base64.EncodeToUtf8(ReadOnlySpan<byte>, Span<byte>, out int, out int, bool)
BitArray(bool[]) (Begin using the xplat hardware intrinsics in BitArray #63722)
BitArray.And(BitArray) (Begin using the xplat hardware intrinsics in BitArray #63722)
BitArray.CopyTo(Array, int)
BitArray.Not(BitArray) (Begin using the xplat hardware intrinsics in BitArray #63722)
BitArray.Or(BitArray) (Begin using the xplat hardware intrinsics in BitArray #63722)
BitArray.Xor(BitArray) (Begin using the xplat hardware intrinsics in BitArray #63722)
BitConverter.DoubleToInt64Bits(double)
BitConverter.Int32BitsToSingle(int)
BitConverter.Int64BitsToDouble(long)
BitConverter.SingleToInt32Bits(float)
BitOperations.LeadingZeroCount(uint)
BitOperations.LeadingZeroCount(ulong)
BitOperations.Log2(uint)
BitOperations.Log2(ulong)
BitOperations.PopCount(uint)
BitOperations.PopCount(ulong)
BitOperations.RoundUpToPowerOf2(uint)
BitOperations.RoundUpToPowerOf2(ulong)
BitOperations.TrailingZeroCount(ulong)
Decimal.VarDecFromR4(float, out DecCalc)
Decimal.VarDecFromR8(double, out DecCalc)
HexConverter.EncodeToUtf16(ReadOnlySpan<byte>, Span<char>, Casing)
Latin1Utility.GetIndexOfFirstNonLatin1Char(char*, nuint)
Latin1Utility.NarrowUtf16ToLatin1(char*, byte*, nuint)
Latin1Utility.WidenLatin1ToUtf16(byte*, char*, nuint)
Math.BigMul(ulong, ulong, out ulong)
Math.CopySign(double, double)
Math.ReciprocalEstimate(double)
Math.ReciprocalSqrtEstimate(double)
MathF.CopySign(float, float)
MathF.ReciprocalEstimate(float)
MathF.ReciprocalSqrtEstimate(float)
Matrix4x4.Invert(Matrix4x4, out Matrix4x4)
Matrix4x4.Lerp(Matrix4x4, Matrix4x4, float)
Matrix4x4.operator !=(Matrix4x4, Matrix4x4)
Matrix4x4.operator -(Matrix4x4)
Matrix4x4.operator -(Matrix4x4, Matrix4x4)
Matrix4x4.operator *(Matrix4x4, float)
Matrix4x4.operator *(Matrix4x4, Matrix4x4)
Matrix4x4.operator +(Matrix4x4, Matrix4x4)
Matrix4x4.operator ==(Matrix4x4, Matrix4x4)
Matrix4x4.Permute(Vector128<float>, byte)
Matrix4x4.Transpose(Matrix4x4)
OptimizedInboxTextEncoder.GetIndexOfFirstByteToEncode(ReadOnlySpan<byte> data)
OptimizedInboxTextEncoder.GetIndexOfFirstCharToEncode(ReadOnlySpan<char> data)
SpanHelpers.IndexOf(ref byte, byte, int) (port SpanHelpers.IndexOf(ref byte, byte, int) to Vector128/256 #73364)
SpanHelpers.IndexOf(ref char, char, int) (port SpanHelpers.IndexOf(ref char, char, int) to Vector128/256 #73368)
SpanHelpers.IndexOf(ref char, int, ref char, int)
SpanHelpers.IndexOfAny(ref byte, byte, byte, byte, int) (Vectorize {Last}IndexOf{Any} and {Last}IndexOfAnyExcept without code duplication #73768)
SpanHelpers.IndexOfAny(ref byte, byte, byte, int) (port SpanHelpers.IndexOfAny(ref byte, byte, byte, int) to Vector128/256 #73384)
SpanHelpers.IndexOfAny(ref char, char, char, char, char, char, int) (Port IndexOfAny(ref char, char[1-5], int) to Vector128/256 #73469)
SpanHelpers.IndexOfAny(ref char, char, char, char, char, int) (Port IndexOfAny(ref char, char[1-5], int) to Vector128/256 #73469)
SpanHelpers.IndexOfAny(ref char, char, char, char, int) (Port IndexOfAny(ref char, char[1-5], int) to Vector128/256 #73469)
SpanHelpers.IndexOfAny(ref char, char, char, int) (Port IndexOfAny(ref char, char[1-5], int) to Vector128/256 #73469)
SpanHelpers.{Last}IndexOf{Any}{Except} (Vectorize {Last}IndexOf{Any} and {Last}IndexOfAnyExcept without code duplication #73768)
SpanHelpers.SequenceCompareTo(ref byte, int, ref byte, int) (Port SpanHelpers.SequenceCompareTo(ref byte, int, ref byte, int) to Vector128/256 #73475)
SpanHelpers.SequenceEquals(ref byte, ref byte, nuint) (Port SequenceEqual to crossplat Vectors, optimize vector compare on x64 #67202)
SpanHelpers.Reverse(ref byte, nuint) (Add Span.Reverse() intrinsic for Arm64 #72780)
SpanHelpers.Reverse(ref char, nuint) (Add Span.Reverse() intrinsic for Arm64 #72780)
SpanHelpers.Reverse(ref int, nuint) (Add Span.Reverse() intrinsic for Arm64 #72780)
String.MakeSeparatorList(ReadOnlySpan<char>, ref ValueListBuilder<int>)
Utf8Utility.GetPointerToFirstInvalidByte(byte*, int, out int, out int)
Utf8Utility.TranscodeToUtf8(char*, int, byte*, int, out char*, out byte*)
VectorMath.ConditionalSelectBitwise(Vector128<double>, Vector128<double>, Vector128<double>)
VectorMath.ConditionalSelectBitwise(Vector128<float>, Vector128<float>, Vector128<float>)
VectorMath.Equal(Vector128<float>, Vector128<float>)
VectorMath.Lerp(Vector128<float>, Vector128<float>, Vector128<float>)
VectorMath.NotEqual(Vector128<float>, Vector128<float>)
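For readers unfamiliar with the pattern, here is a minimal sketch of what one of these conversions can look like, assuming .NET 7 or later. The type `IndexOfByteSketch` and its method are hypothetical names for illustration only; this is not the code in SpanHelpers or in any of the linked PRs. A direct-intrinsics version would typically need one branch using `Sse2.CompareEqual` and `Sse2.MoveMask` and a separate Arm64 branch, while the cross-platform version needs a single `Vector128` path.

```csharp
using System;
using System.Numerics;
using System.Runtime.InteropServices;
using System.Runtime.Intrinsics;

// Hypothetical illustration type; not part of the core libraries.
static class IndexOfByteSketch
{
    // Returns the index of the first occurrence of value in span, or -1 if absent.
    public static int IndexOf(ReadOnlySpan<byte> span, byte value)
    {
        int i = 0;

        // One cross-platform path instead of separate x86 and Arm64 branches.
        if (Vector128.IsHardwareAccelerated && span.Length >= Vector128<byte>.Count)
        {
            ref byte start = ref MemoryMarshal.GetReference(span);
            Vector128<byte> target = Vector128.Create(value);
            int lastVectorStart = span.Length - Vector128<byte>.Count;

            for (; i <= lastVectorStart; i += Vector128<byte>.Count)
            {
                Vector128<byte> block = Vector128.LoadUnsafe(ref start, (nuint)i);

                // Each matching lane becomes all-ones; collect one bit per lane.
                uint mask = Vector128.Equals(block, target).ExtractMostSignificantBits();
                if (mask != 0)
                {
                    // The lowest set bit corresponds to the first matching element.
                    return i + BitOperations.TrailingZeroCount(mask);
                }
            }
        }

        // Scalar loop for the remaining elements (or for short inputs).
        for (; i < span.Length; i++)
        {
            if (span[i] == value)
            {
                return i;
            }
        }

        return -1;
    }
}
```

The sketch handles the tail with a scalar loop for simplicity; a tuned implementation might instead reprocess an overlapping final vector, and might add a `Vector256` path where that is accelerated.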