[x86] Generate AVX512 fixed-point instructions #7129

Merged: 13 commits, Oct 31, 2022


@rootjalex (Member) commented Oct 26, 2022

This PR adds support for generating saturating_(add | sub) and pmulh(rs) on Skylake and Cannonlake (i.e. for AVX512BW). It also increases simd_op_check test coverage of fixed-point operations on those archs.
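For readers unfamiliar with the operations named above, here is a minimal scalar sketch of what one lane of each is generally understood to compute. The function names are hypothetical and are not Halide or LLVM APIs, and the 16-bit signed and 8-bit unsigned lane types are chosen only for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar models of a single SIMD lane; not Halide or LLVM APIs.
// Assumes >> on a negative int32_t is an arithmetic shift and that narrowing
// to int16_t wraps (true on mainstream compilers, guaranteed from C++20).

// Saturating signed 16-bit add: clamps to [-32768, 32767] instead of wrapping.
inline int16_t saturating_add_i16(int16_t a, int16_t b) {
    const int32_t wide = int32_t(a) + int32_t(b);
    if (wide > INT16_MAX) return INT16_MAX;
    if (wide < INT16_MIN) return INT16_MIN;
    return int16_t(wide);
}

// Saturating unsigned 8-bit subtract: clamps at 0 instead of wrapping.
inline uint8_t saturating_sub_u8(uint8_t a, uint8_t b) {
    return a > b ? uint8_t(a - b) : uint8_t(0);
}

// pmulh-style multiply: high 16 bits of the 32-bit signed product.
inline int16_t pmulh_i16(int16_t a, int16_t b) {
    return int16_t((int32_t(a) * int32_t(b)) >> 16);
}

// pmulhrs-style multiply: Q15 product rounded to nearest, then truncated
// to 16 bits without saturation.
inline int16_t pmulhrs_i16(int16_t a, int16_t b) {
    const int32_t product = int32_t(a) * int32_t(b);
    return int16_t(((product >> 14) + 1) >> 1);
}

// Small self-check of the models above.
inline void check_fixed_point_models() {
    assert(saturating_add_i16(32767, 1) == 32767);    // clamps high
    assert(saturating_add_i16(-32768, -1) == -32768); // clamps low
    assert(saturating_sub_u8(10, 20) == 0);           // clamps at zero
    assert(pmulh_i16(16384, 16384) == 4096);          // 2^28 >> 16
    assert(pmulhrs_i16(16384, 16384) == 8192);        // 0.5 * 0.5 in Q15
}
```

In Q15 terms, 16384 represents 0.5, so the pmulhrs-style result 8192 represents 0.25.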

I also did a bit of clean-up on the way:

I did not add abs to codegen because it doesn't appear that LLVM currently exposes non-masked versions of AVX512 abs variants.

Fixes #7002

@rootjalex requested a review from abadams on October 26, 2022 22:49
@steven-johnson (Contributor):

Several legit failures here

@rootjalex (Member, Author):

Can't quite figure out why the JIT doesn't like ssse3.pabs instructions. I see them used in LLVM tests (e.g. here). Gonna revert the use of those for now, but will still change the .ll to use llvm.abs.

@rootjalex (Member, Author):

Ugh, same deal with the avx2.pabs instructions (despite showing up in LLVM tests here). I will revert that change and add a comment, but I don't know why these intrinsics in particular are an issue.

@rootjalex (Member, Author):

Just updated the AVX512_Skylake pabs generation, with a fix to complete_x86_target thanks to @abadams

@rootjalex (Member, Author):

Only test failure appears unrelated

@steven-johnson (Contributor) left a comment:

LGTM, tests pass on my AVX512 Linux box

@rootjalex merged commit 5da5dfd into main on Oct 31, 2022
@rootjalex deleted the rootjalex/x86-fp-cleanup branch on October 31, 2022 18:36
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* clean-up abs and saturating_pmulhrs, fix AVX512 saturating_ ops

* add test coverage for AVX512 fp ops

* generate vpabs on AVX512

* faster AVX2 lowering of saturating_pmulhrs
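
The commit list mentions a lowering for saturating_pmulhrs. As a hedged illustration of why a saturating variant differs from a plain pmulhrs-style multiply, the sketch below models one 16-bit lane in scalar C++. It assumes the operation is a rounding Q15 multiply whose result is clamped to the int16 range. The function name is hypothetical, and this is not the lowering used in the PR.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical scalar model (an assumption, not Halide's implementation):
// rounding Q15 multiply with the result saturated to the int16 range.
// Assumes >> on a negative int32_t is an arithmetic shift.
inline int16_t saturating_rounding_mul_q15(int16_t a, int16_t b) {
    // Add half of 2^15 before shifting so the result rounds to nearest.
    const int32_t rounded = (int32_t(a) * int32_t(b) + (1 << 14)) >> 15;
    // The result can only overflow upward, so one clamp is enough.
    if (rounded > INT16_MAX) return INT16_MAX;
    return int16_t(rounded);
}

// Small self-check of the model above.
inline void check_saturating_rounding_mul() {
    assert(saturating_rounding_mul_q15(16384, 16384) == 8192);    // 0.5 * 0.5
    assert(saturating_rounding_mul_q15(-16384, 16384) == -8192);  // -0.5 * 0.5
    assert(saturating_rounding_mul_q15(-32768, -32768) == 32767); // clamped
}
```

Under this model the only input pair that needs clamping is (-32768, -32768), where the unsaturated result 32768 does not fit in 16 bits.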

Linked issue: Saturating instructions not generated on AVX512