
[CPU] Populate to_elements unrolling patterns in LLVM conversion. #22010

Merged
hanhanW merged 4 commits into iree-org:main from hanhanW:to-elements-lowering-cpu
Sep 17, 2025

hanhanW commented Sep 16, 2025

This is a follow-up to #21851. The CPU backend also needs the to_elements unrolling patterns, which is not surprising. The gap was not caught earlier because no test covered it; the failure only occurs in dynamic gather+attention cases.

The test fails on the VMVX and SPIR-V backends, so it is only added for CPU and HIP for now.

Fixes #22007

It is a follow-up to iree-org#21851. The
CPU backend also needs the patterns, which is not surprising. It was not
detected in the first place because a test did not exist. It only
happens in dynamic gather+attention cases.

The test fails on the VMVX and SPIR-V backends. AMDGPU does not have a
test suite, but it can compile the program. Thus, the test is only added
to CPU for now.

Signed-off-by: hanhanW <hanhan0912@gmail.com>

amd-eochoalo left a comment

Should we also add it to compiler/src/iree/compiler/Codegen/LLVMGPU/ConvertToNVVM.cpp?

diff --git a/compiler/src/iree/compiler/Codegen/LLVMGPU/ConvertToNVVM.cpp b/compiler/src/iree/compiler/Codegen/LLVMGPU/ConvertToNVVM.cpp
index 8636d514a7..6bc6e27b5f 100644
--- a/compiler/src/iree/compiler/Codegen/LLVMGPU/ConvertToNVVM.cpp
+++ b/compiler/src/iree/compiler/Codegen/LLVMGPU/ConvertToNVVM.cpp
@@ -105,6 +105,7 @@ struct ConvertToNVVMPass final
       vector::populateVectorGatherLoweringPatterns(patterns);
       vector::populateVectorMaskOpLoweringPatterns(patterns);
       vector::populateVectorFromElementsLoweringPatterns(patterns);
+      vector::populateVectorToElementsLoweringPatterns(patterns);
       // We currently always use 64 bit indices, thus ensure the bit width of
       // the mask compare is consistent.
       vector::populateVectorMaskMaterializationPatterns(

hanhanW commented Sep 17, 2025

Hmm, that code is a mess and in maintenance mode; I don't really want to touch it unless needed. I don't even know whether we have CI for NVGPU. One solution may be a utility that populates all the vector unrolling patterns, but I can see the backends diverging at some point. People should revamp the NVGPU codegen if they care about it. Adding new code to those inactive passes may just be tech debt that nobody cares about. (If someone cares, they'd raise a PR like this or file an issue.)
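
The shared-utility idea raised in this comment can be sketched as below. This is a self-contained toy model, not IREE or MLIR code: `PatternSet` is a stand-in for a rewrite pattern set that only records names, and `populateCommonVectorLoweringGroups`, `buildCpuPatterns`, and `buildNvvmPatterns` are hypothetical names invented for illustration. The group names mirror the populate calls visible in the diff above.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Toy stand-in for a rewrite pattern set (assumption: it only records the
// names of the pattern groups that were populated into it).
struct PatternSet {
  std::vector<std::string> groups;

  void add(const std::string &name) { groups.push_back(name); }

  bool contains(const std::string &name) const {
    return std::find(groups.begin(), groups.end(), name) != groups.end();
  }
};

// Hypothetical shared helper: one place that lists every vector lowering
// group, so a newly required group is added once for all backends.
void populateCommonVectorLoweringGroups(PatternSet &patterns) {
  patterns.add("VectorGatherLowering");
  patterns.add("VectorMaskOpLowering");
  patterns.add("VectorFromElementsLowering");
  patterns.add("VectorToElementsLowering");
}

// Hypothetical CPU conversion setup that relies on the shared helper.
PatternSet buildCpuPatterns() {
  PatternSet patterns;
  populateCommonVectorLoweringGroups(patterns);
  return patterns;
}

// Hypothetical NVVM conversion setup. A backend that needs to diverge
// would add or skip groups after the shared call.
PatternSet buildNvvmPatterns() {
  PatternSet patterns;
  populateCommonVectorLoweringGroups(patterns);
  return patterns;
}
```

In this model both backends pick up the to_elements group from a single call site. The cost, as the comment notes, is that the helper becomes awkward once backends need different pattern sets.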

@amd-eochoalo

@hanhanW thanks for the answer! No worries, I was just double checking in case it was necessary :)

hanhanW commented Sep 17, 2025

No worries, thanks for asking! I'm happy to share my understanding with reviewers and developers.

hanhanW merged commit 58c3da4 into iree-org:main Sep 17, 2025
49 of 68 checks passed
hanhanW deleted the to-elements-lowering-cpu branch September 17, 2025 21:02
bangtianliu pushed a commit to bangtianliu/iree that referenced this pull request Sep 22, 2025

Development

Successfully merging this pull request may close these issues.

Compiler regression for llvm-cpu target in IREE tag 3.8.0rc20250909
