Switch to ymm after zmm in genZeroInitFrameUsingBlockInit #115981

EgorBo · 2025-05-25T23:08:55Z

 vxorps   xmm4, xmm4, xmm4
 vmovdqu32 zmmword ptr [rsp+0x20], zmm4
-vmovdqa  xmmword ptr [rsp+0x60], xmm4
-vmovdqa  xmmword ptr [rsp+0x70], xmm4
+vmovdqu  ymmword ptr [rsp+0x60], ymm4
 vmovdqu  xmmword ptr [rsp+0x80], xmm4

dotnet-policy-service · 2025-05-25T23:09:39Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

EgorBo · 2025-05-26T11:03:58Z

PTAL @dotnet/jit-contrib: a small change with a couple of diffs (it only reproduces on AVX-512).

Copilot

Pull Request Overview

This PR refactors the zero-initialization of stack frames to use YMM after ZMM, consolidates loops into a single while-based approach, and replaces aligned vmovdqa with unaligned vmovdqu for YMM registers.

Replaces two for-loops with a while-loop driven by lenRemaining
Computes regSize dynamically via roundDownSIMDSize and chooses aligned vs. unaligned moves
Introduces ALIGN_UP(blkSize, 16) to drive the loop and switches mov instructions
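
The loop described in this overview can be modelled outside the JIT. The following is a minimal standalone sketch, not the code from codegenxarch.cpp: `alignUp16`, `pickRegSize`, `needsUnalignedMove`, and `printZeroInitPlan` are hypothetical names, `pickRegSize` only stands in for `roundDownSIMDSize`, and the widest available register is assumed to be 16, 32, or 64 bytes.

```cpp
#include <cstdio>

// Hypothetical stand-ins for the JIT helpers named in the review; not the real code.
constexpr unsigned XMM_BYTES = 16;

// Models ALIGN_UP(size, 16): round the block size up to whole xmm-sized slots.
constexpr unsigned alignUp16(unsigned size)
{
    return (size + 15u) & ~15u;
}

// Widest register size (halving from maxSimdBytes down to 16) that still fits
// into lenRemaining. Assumes maxSimdBytes is 16, 32 or 64 and lenRemaining >= 16.
constexpr unsigned pickRegSize(unsigned lenRemaining, unsigned maxSimdBytes)
{
    unsigned regSize = maxSimdBytes;
    while (regSize > XMM_BYTES && regSize > lenRemaining)
    {
        regSize /= 2;
    }
    return regSize;
}

// Only 16-byte alignment of the frame is assumed, so anything wider than xmm
// gets an unaligned move in this model.
constexpr bool needsUnalignedMove(unsigned regSize)
{
    return regSize > XMM_BYTES;
}

// Prints the stores that would zero blkSize bytes starting at [rsp+baseOffset].
void printZeroInitPlan(unsigned baseOffset, unsigned blkSize, unsigned maxSimdBytes)
{
    unsigned lenRemaining = alignUp16(blkSize);
    unsigned offset       = baseOffset;
    while (lenRemaining > 0)
    {
        const unsigned regSize = pickRegSize(lenRemaining, maxSimdBytes);
        std::printf("[rsp+0x%02X] %2u bytes, %s move\n", offset, regSize,
                    needsUnalignedMove(regSize) ? "unaligned" : "aligned");
        offset += regSize;
        lenRemaining -= regSize;
    }
}
```

Under these assumptions, `printZeroInitPlan(0x20, 112, 64)` lists a 64-byte store at `[rsp+0x20]`, a 32-byte store at `[rsp+0x60]`, and a 16-byte store at `[rsp+0x80]`, which are the same offsets and widths as the new sequence in the PR description.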

Comments suppressed due to low confidence (1)

src/coreclr/jit/codegenxarch.cpp:11261

Add tests for block sizes not divisible by SIMD widths (e.g., sizes between 1–15, 17–31 bytes) to verify that remainders are handled correctly by this loop.

while (lenRemaining > 0)

tannergooding · 2025-05-27T00:27:16Z

src/coreclr/jit/codegenxarch.cpp


-            assert(i == blkSize);
+                // frameReg is definitely not known to be 32B/64B aligned -> switch to unaligned movs
+                instruction ins    = regSize > XMM_REGSIZE_BYTES ? simdUnalignedMovIns() : simdMov;


Does simdUnalignedMovIns() get hoisted out of the loop, or will it be a lookup each time?

Is there a reason to not just always use the unaligned instruction since they're the same perf for accesses that are actually aligned?

I presume it acts as a validation of the assumption that the frame pointer is 16-byte aligned, but I'm not sure; I just copied it from the previous logic.
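
To illustrate the code comment in the diff above: an address that is 16-byte aligned is not necessarily 32-byte or 64-byte aligned, and the aligned move forms (for example `vmovdqa` with a ymm operand) fault when the address is not a multiple of the operand size. A small sketch, in which `isAligned` and the `frameBase` value are made up for illustration:

```cpp
#include <cstdint>

// True when addr is a multiple of alignment (alignment must be a power of two).
constexpr bool isAligned(std::uint64_t addr, std::uint64_t alignment)
{
    return (addr & (alignment - 1)) == 0;
}

// Hypothetical frame base: 16-byte aligned, but neither 32- nor 64-byte aligned.
constexpr std::uint64_t frameBase = 0x7FFC1010;

// The xmm-sized slot at +0x80 is fine for an aligned 16-byte store.
static_assert(isAligned(frameBase + 0x80, 16), "xmm slot is 16-byte aligned");

// The ymm slot at +0x60 and the zmm slot at +0x20 are not aligned to their own
// width, so an aligned 32- or 64-byte store to them would fault.
static_assert(!isAligned(frameBase + 0x60, 32), "ymm slot is not 32-byte aligned");
static_assert(!isAligned(frameBase + 0x20, 64), "zmm slot is not 64-byte aligned");
```

This matches the choice in the diff: keep the aligned move only for xmm-sized stores, where the 16-byte assumption applies, and use the unaligned form for anything wider.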

EgorBo · 2025-05-27T19:06:25Z

/ba-g "windows-x86 Debug Libraries_CheckedCoreCLR is stuck"

switch to ymm after zmm in genZeroInitFrameUsingBlockInit

73e4ba7

github-actions bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) May 25, 2025

dotnet-policy-service bot assigned EgorBo May 25, 2025

EgorBo added 2 commits May 26, 2025 01:23

use aligned movs

a80f572

fix

031a190

MihuBot mentioned this pull request May 26, 2025

[JitDiff X64] [EgorBo] Switch to ymm after zmm in genZeroInitFrameUsingBlockInit MihuBot/runtime-utils#1088

Open

build-analysis bot mentioned this pull request May 26, 2025

The Operation will be canceled. The next steps may not contain expected logs. dotnet/dnceng#3008

Open

3 tasks

EgorBo marked this pull request as ready for review May 26, 2025 11:04

Copilot AI review requested due to automatic review settings May 26, 2025 11:04

Copilot AI reviewed May 26, 2025

jakobbotsch reviewed May 26, 2025

tannergooding reviewed May 26, 2025

EgorBo added 2 commits May 26, 2025 19:24

Address feedback

73b051a

formatting

20a4581

tannergooding reviewed May 27, 2025

tannergooding approved these changes May 27, 2025

EgorBo added 2 commits May 27, 2025 04:38

Address feedback

8e2d620

formatting

1f4387e

EgorBo enabled auto-merge (squash) May 27, 2025 17:17

EgorBo merged commit 97d1fc2 into dotnet:main May 27, 2025
106 of 108 checks passed

LoopedBard3 mentioned this pull request Jun 3, 2025

[Perf] Linux/x64: 21 Regressions on 5/27/2025 9:50:26 PM +00:00 #116272

Closed

github-actions bot locked and limited conversation to collaborators Jun 27, 2025
