Find sccs #120534
Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
The TP cost of running this (when optimizing) only when there are improper headers is small, except on arm32. I wonder if the "failed to canonicalize" loops are getting mixed in there more often? Of course, if we were to do this for real we might need to run it in minopts too, and presumably the cost there would be somewhat higher, as we don't build loops there yet either.
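
To illustrate the kind of check that could gate the transformation, here is a minimal Python sketch, not the JIT's code. It assumes "improper header" means the target of a DFS retreating edge that does not dominate the edge's source. The dict-of-lists graph and all function names are hypothetical.

```python
def reverse_postorder(succs, entry):
    """Blocks reachable from entry, in reverse postorder (iterative DFS)."""
    seen, post = {entry}, []
    stack = [(entry, iter(succs.get(entry, ())))]
    while stack:
        node, it = stack[-1]
        for nxt in it:
            if nxt not in seen:
                seen.add(nxt)
                stack.append((nxt, iter(succs.get(nxt, ()))))
                break
        else:
            # All successors explored: node is finished.
            post.append(node)
            stack.pop()
    return post[::-1]


def dominators(succs, entry):
    """Iterative dataflow: dom[b] = {b} | intersection of dom[p] over preds p."""
    rpo = reverse_postorder(succs, entry)
    preds = {b: [] for b in rpo}
    for b in rpo:
        for s in succs.get(b, ()):
            preds[s].append(b)
    dom = {b: set(rpo) for b in rpo}
    dom[entry] = {entry}
    changed = True
    while changed:
        changed = False
        for b in rpo[1:]:
            new = set.intersection(*(dom[p] for p in preds[b])) | {b}
            if new != dom[b]:
                dom[b] = new
                changed = True
    return dom, rpo


def improper_headers(succs, entry):
    """Targets of retreating edges whose target does not dominate the source."""
    dom, rpo = dominators(succs, entry)
    index = {b: i for i, b in enumerate(rpo)}
    result = set()
    for b in rpo:
        for s in succs.get(b, ()):
            # An edge to a block that is not later in RPO is a retreating edge.
            if index[s] <= index[b] and s not in dom[b]:
                result.add(s)
    return result


# B and C form a cycle that can be entered at either block.
irreducible = {"A": ["B", "C"], "B": ["C"], "C": ["B"]}
# An ordinary natural loop with header B.
reducible = {"A": ["B"], "B": ["C"], "C": ["B", "D"], "D": []}
needs_transform = bool(improper_headers(irreducible, "A"))
```

Which block of a multi-entry cycle gets reported depends on the DFS visit order, so the result is best treated as a yes/no trigger rather than as the set of SCC entries.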
Diffs with the transformation enabled: about 0.25% TP and some sizeable code size increases, but those are somewhat unavoidable and probably smaller than we'd see with perf-oriented schemes that duplicate existing code. Interestingly, there are also some code size improvements. One guess is that this transformation sometimes makes LSRA's life easier, perhaps through fewer critical edges or friendlier block ordering.

We can probably claw back a bit of the size by reusing the "transition" blocks (the preds of the dispatch block); we'd need at most two per SCC entry (one for "outside" and one for "inside").

For something like wasm, we'd likely run this transformation later than I currently have it here, so TP would probably be somewhat cheaper (the added blocks/IR wouldn't gum up the optimization phases).

We can't fully validate that all SCCs are removed just yet (though we seem to get almost all of them), because the current loop finding code allows loops to include exceptional flow. For our use cases we don't need to consider that, so loop finding needs a tweak.
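
As a rough illustration of the shape of this transformation, here is a minimal Python sketch, not the PR's implementation. It finds SCCs with Tarjan's algorithm, treats any SCC block with a predecessor outside the SCC as an entry, and routes every edge into an entry through a transition block and one dispatch block. The graph representation, the block names, and all function names are hypothetical. The sketch creates one transition block per redirected edge (no reuse), handles only the outermost SCC level, and ignores the method entry and exceptional flow.

```python
def tarjan_sccs(succs):
    """Strongly connected components of a dict-of-lists graph (recursive Tarjan)."""
    index, low, on_stack, stack, sccs = {}, {}, set(), [], []

    def visit(v):
        index[v] = low[v] = len(index)
        stack.append(v)
        on_stack.add(v)
        for w in succs.get(v, ()):
            if w not in index:
                visit(w)
                low[v] = min(low[v], low[w])
            elif w in on_stack:
                low[v] = min(low[v], index[w])
        if low[v] == index[v]:
            # v is the root of an SCC: pop its members off the stack.
            scc = set()
            while True:
                w = stack.pop()
                on_stack.discard(w)
                scc.add(w)
                if w == v:
                    break
            sccs.append(scc)

    for v in list(succs):
        if v not in index:
            visit(v)
    return sccs


def scc_entries(succs, scc):
    """Blocks in the SCC that have at least one predecessor outside it."""
    return {s for b, ss in succs.items() if b not in scc for s in ss if s in scc}


def add_dispatch_blocks(succs):
    """Return a new graph in which each multi-entry SCC is entered via one block."""
    out = {b: list(ss) for b, ss in succs.items()}
    for n, scc in enumerate(tarjan_sccs(succs)):
        heads = scc_entries(succs, scc)
        if len(heads) < 2:
            continue  # zero or one entry: nothing to do at this level
        dispatch = f"dispatch{n}"
        out[dispatch] = sorted(heads)  # models a switch on a control variable
        for b, ss in succs.items():
            for i, s in enumerate(ss):
                if s in heads:
                    # Transition block: would set the control variable to pick `s`.
                    t = f"to_{s}_from_{b}"
                    out[t] = [dispatch]
                    out[b][i] = t
    return out


# B and C form a cycle that A can enter at either block.
g = {"A": ["B", "C"], "B": ["C"], "C": ["B"]}
out = add_dispatch_blocks(g)
```

In the rewritten graph the cycle contains the dispatch block, and the dispatch block is its only entry. Reusing transition blocks, as suggested above, would shrink the number of added blocks.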
No description provided.