Custom opt levels and passes #1250

Merged: 1 commit into AcademySoftwareFoundation:master from alex-optpasses, Sep 22, 2020

@lgritz lgritz commented Sep 16, 2020

Work of Alex Wells, Intel.

Added custom optimization passes to improve AVX/AVX2 code generation
which are automatically added when targeting the host:

  • PrePromoteLogicalOpsOnBitMasks
  • PreventBitMasksFromBeingLiveinsToBasicBlocks

Added support for experimental llvm_optimize: 10,11,12,13. Optlevels
10, 11, 12, 13 explicitly create optimization passes. They are
stripped down versions of clang's -O0, -O1, -O2, -O3. They try to
provide similar results with improved optimization time by removing
some expensive passes that were repeated many times and omitting other
passes that are not applicable or not profitable. Optlevel 10 adds next
to no additional passes, which makes it useful for debugging.
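
As a rough illustration of the level numbering described above, the sketch below maps each experimental llvm_optimize value to the clang level it is loosely modeled on. The function name `clang_analogue` is hypothetical and is not part of OSL; the actual pass lists built for each level are not shown here.

```cpp
#include <string>

// Hypothetical helper (not part of OSL): returns the clang optimization
// level that an experimental llvm_optimize value is a stripped-down
// version of, or an empty string for any other value.
std::string clang_analogue(int llvm_optimize)
{
    switch (llvm_optimize) {
    case 10: return "-O0";  // adds next to no additional passes
    case 11: return "-O1";
    case 12: return "-O2";
    case 13: return "-O3";
    default: return "";     // not one of the experimental levels
    }
}
```

For example, `clang_analogue(12)` yields `"-O2"`, while a non-experimental value such as 2 yields an empty string.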

@AlexMWells AlexMWells left a comment

LGTM

lgritz commented Sep 22, 2020

Updated after spending some time chasing a bug I saw only on my Mac laptop (but, curiously, not on the CI Mac test): a crash accompanied by LLVM error messages about the custom passes being added twice. This took me a long time to track down, and it came down to the file-level static initializers in llvm_util.cpp being run twice in some circumstances. I still don't understand why, but I solved the problem by moving those llvm::RegisterPass objects from file-level statics to statics inside LLVM_Util::SetupLLVM(), which are definitely initialized only once.
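
To illustrate the general idea behind that fix, here is a minimal sketch that uses a stand-in registry rather than LLVM itself. The names `registry`, `RegisterOnce`, and `setup_passes` are hypothetical and are not the code in llvm_util.cpp. It relies only on the C++11 guarantee that a function-local static is initialized exactly once, the first time control passes through its declaration.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for a global pass registry (not LLVM's API).
// It records every registration, so duplicates would be visible.
inline std::vector<std::string>& registry()
{
    static std::vector<std::string> names;
    return names;
}

// Hypothetical helper: constructing one performs a registration, much as
// constructing a registration object does in the situation described above.
struct RegisterOnce {
    explicit RegisterOnce(const char* name) { registry().push_back(name); }
};

// Stand-in for a setup function that may be called more than once.
// The statics below are constructed on the first call only; later calls
// skip the initialization, so each name is registered a single time.
void setup_passes()
{
    static RegisterOnce a("PrePromoteLogicalOpsOnBitMasks");
    static RegisterOnce b("PreventBitMasksFromBeingLiveinsToBasicBlocks");
    (void)a;
    (void)b;
}
```

Calling `setup_passes()` repeatedly leaves exactly two entries in the stand-in registry.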

Also, for the record, one way I altered Alex's code was in testshade: instead of many different options to control the LLVM optimization level (-llvm_O1, -llvm_O....), I made a single --llvm_opt option that takes an integer argument, which also makes things easier if we add more opt levels later.
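
A minimal sketch of the single-integer-option approach follows. It is not testshade's actual argument handling; the function `parse_llvm_opt` and its fallback behavior are assumptions made for illustration only.

```cpp
#include <cstddef>
#include <exception>
#include <string>
#include <vector>

// Hypothetical parser (not testshade's real code): scans the arguments for
// "--llvm_opt N" and returns N, or default_level if the option is absent
// or its value is not an integer.
int parse_llvm_opt(const std::vector<std::string>& args, int default_level)
{
    for (std::size_t i = 0; i + 1 < args.size(); ++i) {
        if (args[i] == "--llvm_opt") {
            try {
                return std::stoi(args[i + 1]);
            } catch (const std::exception&) {
                return default_level;  // malformed value: fall back
            }
        }
    }
    return default_level;
}
```

With one integer-valued option, supporting a new level means accepting a new number rather than adding another flag.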

lgritz commented Sep 22, 2020

When this passes the CI tests, I will finish the merge.

Signed-off-by: Larry Gritz <[email protected]>
@lgritz lgritz merged commit 3b57df4 into AcademySoftwareFoundation:master Sep 22, 2020
@lgritz lgritz deleted the alex-optpasses branch September 22, 2020 05:28