Closed
Labels
area-CodeGen-coreclr (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI), avx512 (Related to the AVX-512 architecture), optimization
Description
When compiling for AVX-512, RyuJIT uses vmovdqu32 to zero 64 bytes at a time efficiently. However, it does not use 32-byte stores for the remainder as it does on AVX2; instead, it falls back straight to 16-byte stores for everything that is left.
sub rsp, 168
vxorps xmm4, xmm4, xmm4
vmovdqu32 zmmword ptr [rsp+0x20], zmm4
vmovdqa xmmword ptr [rsp+0x60], xmm4 ; this should be vmovdqu ymmword ptr [rsp+0x60], ymm4
vmovdqa xmmword ptr [rsp+0x70], xmm4 ; this could be omitted then
vmovdqa xmmword ptr [rsp+0x80], xmm4
...
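The requested behavior can be illustrated with a minimal sketch of greedy store-width selection. This is not RyuJIT's actual implementation: the function name `plan_zeroing_stores`, the `widths` parameter, and the 112-byte region size are hypothetical and chosen only to mirror the offsets visible in the listing above.

```python
def plan_zeroing_stores(offset, size, widths=(64, 32, 16)):
    """Greedily cover `size` bytes starting at `offset` with the widest stores.

    Hypothetical helper for illustration only. Returns a tuple of
    (list of (offset, width) stores, number of bytes left uncovered).
    """
    stores = []
    for width in widths:  # widths are tried from widest to narrowest
        while size >= width:
            stores.append((offset, width))
            offset += width
            size -= width
    return stores, size


# Assumed example: 112 bytes starting at rsp+0x20.
# Current behavior (64-byte stores, then straight to 16-byte stores):
current, _ = plan_zeroing_stores(0x20, 112, widths=(64, 16))
# Requested behavior (64-byte, then 32-byte, then 16-byte stores):
requested, _ = plan_zeroing_stores(0x20, 112)
```

Under these assumptions, `current` contains one 64-byte store followed by three 16-byte stores at 0x60, 0x70, and 0x80, while `requested` replaces the first two 16-byte stores with a single 32-byte store at 0x60, saving one instruction.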
MineCake147E