Ensure that we lookup the correct instruction for embedded masking/broadcast scenarios #117828

tannergooding · 2025-07-18T17:44:30Z

This resolves #117794

When doing the embedded masking/broadcast checks, we sometimes allow changing the base type on the presumption that we'd pick a different instruction. However, even when we didn't get a new instruction, we'd still end up changing the base type, which caused a codegen bug.

To resolve this, there is now an assert to validate we got a different instruction for the alternative base type. This is achieved by ensuring the existing lookup support returns the instruction optimistically.

We then peephole this back to the original instruction in codegen if we don't end up using any of the embedded features that would require it, since that allows us to have the smaller emitter output.
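
As a rough illustration of the flow described above, here is a minimal, self-contained sketch. The names `ElemType`, `Ins`, `lookupInsOptimistic`, `retypeForEmbeddedFeature`, and `finalizeIns` are hypothetical stand-ins and do not mirror the JIT's actual enums, tables, or signatures; the sketch only shows the shape of "look up optimistically, assert the retype changed the instruction, then peephole back if no embedded feature was used".

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the JIT's real enums or tables.
enum class ElemType { Int32, Int64 };
enum class Ins { And, And32, And64 };

// Optimistic lookup: always return the element-size-specific form, on the
// assumption that only that form can encode embedded masking or broadcast.
Ins lookupInsOptimistic(ElemType type)
{
    return (type == ElemType::Int32) ? Ins::And32 : Ins::And64;
}

// Changing the base type is only meaningful if it selects a different
// instruction, so validate that before accepting the alternative type.
ElemType retypeForEmbeddedFeature(ElemType current, ElemType alternative)
{
    assert(lookupInsOptimistic(alternative) != lookupInsOptimistic(current));
    return alternative;
}

// Codegen-time peephole: fall back to the size-agnostic form when no embedded
// feature ended up being used, so the smaller encoding can be emitted.
Ins finalizeIns(Ins ins, bool usesEmbeddedFeature)
{
    if (usesEmbeddedFeature)
    {
        return ins;
    }

    switch (ins)
    {
        case Ins::And32:
        case Ins::And64:
            return Ins::And;
        default:
            return ins;
    }
}
```

In this toy model, a lookup that returned `Ins::And` for both element types would trip the assert in `retypeForEmbeddedFeature`, which is the kind of silent base type change the assert is meant to catch.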

Copilot

Pull Request Overview

This PR resolves a codegen bug (#117794) related to embedded masking/broadcast scenarios in hardware intrinsics. The issue occurred when the embedded masking/broadcast checks would change the base type to pick a different instruction, but sometimes the same instruction would be returned, leading to incorrect base type usage in code generation.

- Introduces a new `lookupIns` method in CodeGen that handles instruction selection with embedded features and peephole optimization
- Adds assertions to validate that instruction changes occur when expected for embedded features
- Refactors instruction lookup calls throughout the codebase to use the new method

Reviewed Changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| Runtime_117794.cs | Regression test case for the bug fix |
| Runtime_117794.csproj | Project file for the regression test |
| hwintrinsiccodegenxarch.cpp | Main implementation of new `lookupIns` method and refactored codegen |
| hwintrinsic.cpp | Updated instruction lookup to support optimistic EVEX instruction selection |
| hwintrinsic.h | Modified `lookupIns` signature to remove compiler parameter |
| instr.cpp | Removed embedded broadcast instruction mapping logic |
| gentree.cpp | Updated calls to use new instruction lookup method |
| codegenxarch.cpp | Updated instruction lookup calls |
| codegen.h | Updated method signatures |

Comments suppressed due to low confidence (2)

src/coreclr/jit/hwintrinsiccodegenxarch.cpp:433

The instruction name INS_movdqa32 is inconsistent with the EVEX naming pattern. It should be INS_vmovdqa32 to match the pattern used for other EVEX instructions in this switch statement.

                ins = INS_movdqa32;

src/coreclr/jit/hwintrinsiccodegenxarch.cpp:439

The instruction name INS_movdqu32 is inconsistent with the EVEX naming pattern. It should be INS_vmovdqu32 to match the pattern used for other EVEX instructions in this switch statement.

                ins = INS_movdqu32;

dotnet-policy-service · 2025-07-18T17:45:25Z

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

tannergooding · 2025-07-18T18:15:37Z

CC. @dotnet/jit-contrib, @jakobbotsch

tannergooding · 2025-07-19T13:23:40Z

(Trying to minimize the throughput impact, the overall code isn't really changing)

tannergooding · 2025-07-19T15:14:39Z

Happy with the changes now. TP impact is minimized (+0.01% in the worst case), no regressions for any disasm output (only improvements), and the tests are passing as expected.

tannergooding · 2025-07-19T15:15:25Z

/azp run Fuzzlyn

azure-pipelines · 2025-07-19T15:15:38Z

Azure Pipelines successfully started running 1 pipeline(s).

saucecontrol

Nice change! I'll take another look at optimizing for broadcast size after this lands.

tannergooding · 2025-07-22T03:54:06Z

/ba-g unrelated arm64 timeouts

github-actions bot added the area-CodeGen-coreclr label (CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI) Jul 18, 2025

dotnet-policy-service bot assigned tannergooding Jul 18, 2025