Skip to content

Conversation

jakobbotsch
Copy link
Member

Graph-based loop inversion works by inverting the loop such that we do not jump into the loop to immediately test a condition. However, we are not detecting the case where the loop is already inverted because its latch is an exit that tests a different condition. If the loop has multiple exits that means we may end up creating a suboptimal latch condition.

Detect this case based on the pattern-based IV analysis that we already have.

This fixes #119277 and likely fixes a bunch of the regressions tracked by #116754.

This PR is based on #119537. We may seriously consider whether a dumbed down version of this can be backported to .NET 10. For example, if the latch is an exit that looks like a simple IV increment, then skip loop inversion.

- Follow unique preds when locating the init statement for the iterator
  variable
- Generalize zero-trip detection by similarly following unique preds
Graph-based loop inversion works by inverting the loop such that we do
not jump into the loop to immediately test a condition. However, we are
not detecting the case where the loop is already inverted because its
latch is an exit that tests a different condition. If the loop has
multiple exits that means we may end up creating a suboptimal latch
condition.

Detect this case based on the pattern-based IV analysis that we already
have.
@github-actions github-actions bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Sep 10, 2025
Copy link
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jakobbotsch
Copy link
Member Author

@EgorBot -intel --filter "SeekUnroll"

@jakobbotsch
Copy link
Member Author

@EgorBot -intel --filter "QuickSortSpan"

@jakobbotsch
Copy link
Member Author

@EgorBot -intel --filter "QuickSortSpan" --disasm

@jakobbotsch
Copy link
Member Author

jakobbotsch commented Sep 12, 2025

On benchmarks.run_pgo we go from 7970 -> 7584 loops inverted, 1445 -> 1442 loops cloned, 342 -> 411 loops unrolled. Need to look at some of these loop cloning differences to spot check whether they make sense or not.

@jakobbotsch jakobbotsch marked this pull request as ready for review September 15, 2025 11:13
@Copilot Copilot AI review requested due to automatic review settings September 15, 2025 11:13
Copy link
Contributor

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR improves JIT loop inversion optimization by avoiding the inversion of loops that are already optimally inverted with IV (induction variable) tests. The change prevents the creation of suboptimal latch conditions when a loop has multiple exits and one exit is already serving as an effective latch with an IV test.

Key changes:

  • Add detection logic to identify already-inverted loops with IV tests before attempting loop inversion
  • Remove unused parameter from optExtractTestIncr method to simplify the interface
  • Update all call sites to use the simplified method signature

Reviewed Changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

File Description
src/coreclr/jit/optimizer.cpp Add IV test detection logic to skip loop inversion and remove unused header parameter from optExtractTestIncr
src/coreclr/jit/flowgraph.cpp Update call to optExtractTestIncr with simplified signature
src/coreclr/jit/compiler.h Update method declaration to remove unused header parameter

@jakobbotsch
Copy link
Member Author

The difference in loop cloning is just from a context where the base replays cleanly while the diff (after differences in inversion) no longer replays cleanly. In reality the partial replay of the diff does show 3 loop clonings, but those do not get recorded once it misses.

cc @dotnet/jit-contrib PTAL @AndyAyersMS

Diffs. As mentioned above we go from 7970 -> 7584 loops inverted (because we now detect they are already inverted with an "IV looking" condition) and 342 -> 411 loops unrolled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[Perf] Windows/x64: 1 Regression on 8/26/2025 4:33:14 PM +00:00

2 participants