Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

JIT: maintain pred lists during loop unrolling #80625

Merged
merged 1 commit into from
Jan 18, 2023

Conversation

AndyAyersMS
Copy link
Member

@AndyAyersMS AndyAyersMS commented Jan 13, 2023

We now update pred lists during loop unrolling, rather than recomputing them from scratch.

There are several parts to the fix: first, optRedirectBlock now has a new ability to add pred references for the flow from a newly cloned block, be it either to a remapped successor or a non-remapped successor. Along with this we no longer copy over the block ref count in CloneBlockState. These changes allow us to create the right pred links and ref counts in the interior of a cloned subgraph.

Second, we now scrub block references from the original loop body blocks instead of just setting their ref counts to zero.

Finally, we fix up references for exterior flow into and out of the unroll complex.

Addresses one of the cases mentioned in #49030.

We now update pred lists during loop unrolling, rather than recomputing
them from scratch.

There are several parts to the fix: first, `optRedirectBlock' now has
a new ability to add pred references for the flow from a newly cloned
block, be it either to a remapped successor or a non-remapped successor.
Along with this we no longer copy over the block ref count in `CloneBlockState`.
These changes allow us to create the right pred links and ref counts in the
interior of a cloned subgraph.

Second, we now scrub block references from the original loop body blocks
instead of just setting their ref counts to zero.

Finally, we fix up references for exterior flow into and out of the unroll
complex.

Addresses one of the cases mentioned in dotnet#49030.
@ghost ghost assigned AndyAyersMS Jan 13, 2023
@dotnet-issue-labeler dotnet-issue-labeler bot added the area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI label Jan 13, 2023
@ghost
Copy link

ghost commented Jan 13, 2023

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch, @kunalspathak
See info in area-owners.md if you want to be subscribed.

Issue Details

We now update pred lists during loop unrolling, rather than recomputing them from scratch.

There are several parts to the fix: first, optRedirectBlock' now has a new ability to add pred references for the flow from a newly cloned block, be it either to a remapped successor or a non-remapped successor. Along with this we no longer copy over the block ref count in CloneBlockState`. These changes allow us to create the right pred links and ref counts in the interior of a cloned subgraph.

Second, we now scrub block references from the original loop body blocks instead of just setting their ref counts to zero.

Finally, we fix up references for exterior flow into and out of the unroll complex.

Addresses one of the cases mentioned in #49030.

Author: AndyAyersMS
Assignees: AndyAyersMS
Labels:

area-CodeGen-coreclr

Milestone: -

@AndyAyersMS
Copy link
Member Author

@BruceForstall PTAL
cc @dotnet/jit-contrib

Also contains an unrelated jit dump fix for SSA.

This was annoyingly hard to get right; the main culprits being

  • the proper way to "gut" the existing loop body -- ended up needing two passes
  • handling the case where we're actually removing the loop

Also will note some ideas for future cleanup to the unroller. These may deserve a new issue or notes on existing issues:

  • We are seemingly too aggressive unmarking alignment; if we unroll a loop with nested aligned loops it seems like those cloned nested loops should continue to be aligned
  • As noted in the code the profile update doesn't make sense. Will try and fix this subsequently
  • Seems like the "zero" case should be handled elsewhere/separately?
  • Seems like the "one" case we shouldn't need to clone? And in general perhaps we should be making N-1 copies of the IR instead of making N and then trashing the original. I get that we may not get constant prop right away but it may not matter.

@AndyAyersMS
Copy link
Member Author

AndyAyersMS commented Jan 13, 2023

Also I will work on fixing the other places we trash pred lists, so have reassigned #49030.

@AndyAyersMS
Copy link
Member Author

Hmm, I am seeing some SPMI diffs locally, which is a bit surprising -- let me verify I have a good baseline jit before I start digging in deeper.

@AndyAyersMS
Copy link
Member Author

Hmm, I am seeing some SPMI diffs locally, which is a bit surprising -- let me verify I have a good baseline jit before I start digging in deeper.

Looks like I didn't rebuild my diff branch after rebasing onto main. No diffs (as expected) once it was rebuilt.

@AndyAyersMS
Copy link
Member Author

Diffs—as expected, no code diffs.

Looks like TP is a wash, extra cost maintaining preds evidently balances savings from not rebuilding preds. I thought it might be a bit cheaper to maintain.

@AndyAyersMS
Copy link
Member Author

@BruceForstall FYI I have the other phases that drop preds list fixed (building on top of these changes), but will keep them as a follow-up PR.

Copy link
Member

@BruceForstall BruceForstall left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

}

// We fall into to this unroll iteration from the bottom block (first iteration)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit

Suggested change
// We fall into to this unroll iteration from the bottom block (first iteration)
// We fall into this unroll iteration from the bottom block (first iteration)

?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll fix this in the next PR...

@AndyAyersMS AndyAyersMS merged commit e2d192a into dotnet:main Jan 18, 2023
@AndyAyersMS AndyAyersMS mentioned this pull request Jan 20, 2023
44 tasks
@ghost ghost locked as resolved and limited conversation to collaborators Feb 17, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants