Open
Labels
api-approved (API was approved in API review, it can be implemented), area-System.Runtime.Intrinsics, avx512 (Related to the AVX-512 architecture)
Description
Background and motivation
AVX-512 IFMA is supported by Intel in Cannon Lake and newer architectures, and by AMD in Zen 4.
These instructions are known to be useful for cryptography and big-integer arithmetic. They can also serve as a faster alternative to the VPMULLQ instruction, at the cost of operating on 52-bit rather than 64-bit operands: VPMULLQ finishes 5x slower on Intel CPUs than on AMD Zen 4, whereas VPMADD52LUQ finishes in only 4 clock cycles.
API Proposal
namespace System.Runtime.Intrinsics.X86
{
    public abstract class Avx512Ifma : Avx512F
    {
        public static bool IsSupported { get; }

        public static Vector512<ulong> MultiplyAdd52Low(Vector512<ulong> a, Vector512<ulong> b, Vector512<ulong> c);
        public static Vector512<ulong> MultiplyAdd52High(Vector512<ulong> a, Vector512<ulong> b, Vector512<ulong> c);

        public abstract class VL : Avx512F.VL
        {
            public static new bool IsSupported { get; }

            public static Vector256<ulong> MultiplyAdd52Low(Vector256<ulong> a, Vector256<ulong> b, Vector256<ulong> c);
            public static Vector256<ulong> MultiplyAdd52High(Vector256<ulong> a, Vector256<ulong> b, Vector256<ulong> c);
            public static Vector128<ulong> MultiplyAdd52Low(Vector128<ulong> a, Vector128<ulong> b, Vector128<ulong> c);
            public static Vector128<ulong> MultiplyAdd52High(Vector128<ulong> a, Vector128<ulong> b, Vector128<ulong> c);
        }
    }
}
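For readers unfamiliar with the instructions, the per-lane behavior that these methods are expected to expose can be modeled in scalar code. The sketch below is only an illustration: the class name `Ifma52Reference` is hypothetical, and it assumes that `a` is the accumulator and `b`, `c` are the multiplicands, as in the equivalent C++ intrinsics and as the usage below suggests. It does not use the proposed API, so it runs on any hardware.

```csharp
using System;

// Hypothetical scalar reference model of one 64-bit lane.
// Assumption: a is the accumulator; b and c are the multiplicands.
static class Ifma52Reference
{
    const ulong Mask52 = (1UL << 52) - 1;

    // Models VPMADD52LUQ: a + low 52 bits of (b[51:0] * c[51:0]).
    public static ulong MultiplyAdd52Low(ulong a, ulong b, ulong c)
    {
        // Only the low 52 bits of each multiplicand take part in the product.
        Math.BigMul(b & Mask52, c & Mask52, out ulong lo);
        return unchecked(a + (lo & Mask52));
    }

    // Models VPMADD52HUQ: a + bits 52..103 of (b[51:0] * c[51:0]).
    public static ulong MultiplyAdd52High(ulong a, ulong b, ulong c)
    {
        ulong hi = Math.BigMul(b & Mask52, c & Mask52, out ulong lo);
        // The 104-bit product is split across hi:lo; take bits 52..103.
        ulong high52 = (hi << 12) | (lo >> 52);
        return unchecked(a + high52);
    }
}
```

The addition wraps at 64 bits, so the 12 bits above each 52-bit limb act as headroom for deferred carries when several products are accumulated.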
API Usage
zmm0 = Avx512Ifma.MultiplyAdd52Low(zmm0, zmm2, zmm3);
zmm1 = Avx512Ifma.MultiplyAdd52High(zmm1, zmm2, zmm3);
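The two calls above can be read as one multiply-accumulate step in radix 2^52: `zmm0` collects the low halves of the products and `zmm1` the high halves. A minimal scalar sketch of that pattern follows, with plain `ulong[8]` arrays standing in for the 512-bit registers. The class `Ifma52UsageModel` and its helpers are hypothetical and emulate the assumed semantics rather than calling the proposed API.

```csharp
using System;
using System.Numerics;

// Hypothetical emulation of the usage pattern; arrays stand in for zmm registers.
static class Ifma52UsageModel
{
    const int Lanes = 8;
    const ulong Mask52 = (1UL << 52) - 1;

    // acc[i] += low 52 bits of (x[i] * y[i]), per lane.
    public static void MultiplyAdd52Low(ulong[] acc, ulong[] x, ulong[] y)
    {
        for (int i = 0; i < Lanes; i++)
        {
            Math.BigMul(x[i] & Mask52, y[i] & Mask52, out ulong lo);
            acc[i] = unchecked(acc[i] + (lo & Mask52));
        }
    }

    // acc[i] += bits 52..103 of (x[i] * y[i]), per lane.
    public static void MultiplyAdd52High(ulong[] acc, ulong[] x, ulong[] y)
    {
        for (int i = 0; i < Lanes; i++)
        {
            ulong hi = Math.BigMul(x[i] & Mask52, y[i] & Mask52, out ulong lo);
            acc[i] = unchecked(acc[i] + ((hi << 12) | (lo >> 52)));
        }
    }

    // Recombines a low/high limb pair into lo + hi * 2^52.
    public static BigInteger Recombine(ulong lo, ulong hi) => (new BigInteger(hi) << 52) + lo;

    // Runs one step with zeroed accumulators and checks it against BigInteger.
    public static bool Check()
    {
        var zmm0 = new ulong[Lanes]; // low-limb accumulators
        var zmm1 = new ulong[Lanes]; // high-limb accumulators
        var zmm2 = new ulong[Lanes];
        var zmm3 = new ulong[Lanes];
        for (int i = 0; i < Lanes; i++)
        {
            zmm2[i] = Mask52 - (ulong)i;                     // 52-bit operands
            zmm3[i] = (1UL << 51) + 12345UL * (ulong)(i + 1);
        }

        MultiplyAdd52Low(zmm0, zmm2, zmm3);
        MultiplyAdd52High(zmm1, zmm2, zmm3);

        for (int i = 0; i < Lanes; i++)
        {
            if (Recombine(zmm0[i], zmm1[i]) != new BigInteger(zmm2[i]) * zmm3[i])
                return false;
        }
        return true;
    }
}
```

With zeroed accumulators, each low/high pair recombines to the full 104-bit product of the lane's 52-bit operands, which is the property that multi-limb multiplication routines rely on.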
An example of vectorized Montgomery reduction implementations using the equivalent C++ intrinsics:
Alternative Designs
Risks
None