[Integrate] Drop llvm/llvm-project@b4c31dc revert. (#21851)
Merged
Conversation
Signed-off-by: hanhanW <hanhan0912@gmail.com>
hanhanW (Contributor, Author) commented:
To repro:

```mlir
func.func @attention_2_1024_128_128_64_dtype_f16_f16_f16_f16(%query: tensor<2x1024x128xf16>, %key: tensor<2x128x128xf16>, %value: tensor<2x128x64xf16>, %scale: f32) -> tensor<2x1024x64xf16> {
  %result0 = tensor.empty() : tensor<2x1024x64xf16>
  %scale_f16 = arith.truncf %scale : f32 to f16
  %result1 = iree_linalg_ext.attention {
      indexing_maps = [affine_map<(batch, m, n, k1, k2) -> (batch, m, k1)>,
                       affine_map<(batch, m, n, k1, k2) -> (batch, k2, k1)>,
                       affine_map<(batch, m, n, k1, k2) -> (batch, k2, n)>,
                       affine_map<(batch, m, n, k1, k2) -> ()>,
                       affine_map<(batch, m, n, k1, k2) -> (batch, m, n)>]}
      ins(%query, %key, %value, %scale_f16 : tensor<2x1024x128xf16>, tensor<2x128x128xf16>, tensor<2x128x64xf16>, f16)
      outs(%result0 : tensor<2x1024x64xf16>) {
    ^bb0(%score: f32):
      iree_linalg_ext.yield %score : f32
  } -> tensor<2x1024x64xf16>
  return %result1 : tensor<2x1024x64xf16>
}
```
kuhar reviewed on Sep 4, 2025.
hanhanW (Contributor, Author) commented:

@kuhar @MaheshRavishankar the CI is green, and the cherry-picked commit has landed upstream. I think we can land the PR if you can help review the changes.
kuhar approved these changes on Sep 5, 2025.
Groverkss reviewed on Sep 5, 2025.
Comment on lines +187 to +188:

```cpp
Value shapeCast = rewriter.create<vector::ShapeCastOp>(
    op.getLoc(), vec1DType, op.getSource());
```
Groverkss (Contributor) commented:

I don't know if flattening was the right call; this should have been an unrolling pattern. Everything else does unrolling.
A member replied:

I think it's made explicit that this is mostly a stopgap patch.
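For readers following the exchange, the two lowering strategies being contrasted can be sketched in MLIR roughly as follows. This is an illustrative sketch with hypothetical shapes, not the patch's actual output:

```mlir
// Flattening (what the reviewed code does): collapse the n-D source to
// 1-D with a single shape_cast, then lower the 1-D op.
%flat = vector.shape_cast %src : vector<2x4xf32> to vector<8xf32>
%e:8 = vector.to_elements %flat : vector<8xf32>

// Unrolling (what the reviewer suggests): peel off each leading-dim
// slice with vector.extract and lower each 1-D piece separately.
%row0 = vector.extract %src[0] : vector<4xf32> from vector<2x4xf32>
%e0:4 = vector.to_elements %row0 : vector<4xf32>
%row1 = vector.extract %src[1] : vector<4xf32> from vector<2x4xf32>
%e1:4 = vector.to_elements %row1 : vector<4xf32>
```

Both forms yield the same scalars; the disagreement is about which shape the pattern should produce, since the rest of the conversion uses unrolling.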
hhkit pushed a commit to opencompl/iree that referenced this pull request on Sep 11, 2025:

It carries a cherry-pick fix that gets the operands from the adaptor:
- iree-org/llvm-project@8b88014

Changes:
- Update most lit tests to check `vector.from_elements`.
- Add unrolling patterns to the final conversion.
- Implement n-D `vector::ToElementsOp` lowering, which will be dropped after llvm/llvm-project#156992 lands. It should be added to all the backends, but somehow only the AMDGPU backend needs the pattern. The other backends may address the issue via a specialized tiling config plus patterns that drop vector unit dims.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
Signed-off-by: Ivan Ho <ivan.ho@cl.cam.ac.uk>
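As background for the lit-test updates mentioned above: upstream lowerings now produce a single `vector.from_elements` op where a chain of `vector.insert` ops used to appear. A minimal sketch of the two equivalent forms, with illustrative shapes not taken from the actual tests:

```mlir
// Older lowerings materialized a vector one element at a time:
%cst = arith.constant dense<0.0> : vector<2xf32>
%0 = vector.insert %a, %cst [0] : f32 into vector<2xf32>
%1 = vector.insert %b, %0 [1] : f32 into vector<2xf32>

// Newer lowerings build it in one op, which the updated tests check for:
%v = vector.from_elements %a, %b : vector<2xf32>
```

Updating the CHECK lines to match `vector.from_elements` keeps the tests aligned with the new upstream output without changing what is computed.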
hanhanW added a commit to hanhanW/iree that referenced this pull request on Sep 16, 2025:

It is a follow-up to iree-org#21851. The CPU backend also needs the patterns, which is not surprising; it was not detected earlier because no test existed. It only happens in dynamic gather+attention cases. The test fails on the VMVX and SPIR-V backends. AMDGPU does not have a test suite, but it can compile the program, so the test is only added to CPU for now.

Signed-off-by: hanhanW <hanhan0912@gmail.com>
hanhanW added a commit that referenced this pull request on Sep 17, 2025:

…2010) It is a follow-up to #21851. The CPU backend also needs the patterns, which is not surprising; it was not detected earlier because no test existed. It only happens in dynamic gather+attention cases. The test fails on the VMVX and SPIR-V backends, so the test is only added to CPU and hip for now.

Fixes #22007

Signed-off-by: hanhanW <hanhan0912@gmail.com>
bangtianliu pushed a commit to bangtianliu/iree that referenced this pull request on Sep 22, 2025:

…ee-org#22010) It is a follow-up to iree-org#21851. The CPU backend also needs the patterns, which is not surprising; it was not detected earlier because no test existed. It only happens in dynamic gather+attention cases. The test fails on the VMVX and SPIR-V backends, so the test is only added to CPU and hip for now.

Fixes iree-org#22007

Signed-off-by: hanhanW <hanhan0912@gmail.com>