
Conversation

Member

@efric efric commented Dec 23, 2025

The `expand_dims` lowering config attribute relies on `linalg::populateFoldReshapeOpsByExpansionPatterns` to fuse `tensor.expand_shape`/`tensor.collapse_shape` into the linalg op in `GPUExpandDimensions`. This requires all indexing maps to be projected permutations. This patch adds a check to verify this before setting the attribute.
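For concreteness, a projected permutation is an affine map whose results are distinct loop dimensions with no constant or compound expressions. A minimal sketch contrasting the two cases (the second map is taken from the failing test quoted below):

```mlir
// Projected permutation: results are distinct loop dims, no constants.
#fusable = affine_map<(d0, d1, d2) -> (d0, d2)>
// Not a projected permutation: the constant `0` results prevent
// populateFoldReshapeOpsByExpansionPatterns from fusing reshapes.
#nonfusable = affine_map<(d0, d1) -> (d1, 0, 0)>
```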

I'm not sure why this wasn't caught in the PR CI, but this fixes a compilation issue with densenet-12 in test_onnx_models::amdgpu_hip_rdna3.

Signed-off-by: Eric Feng <[email protected]>
@efric efric changed the title [GPU][Codegen] Skip expand_dims for ops with nonprojected permutation indexing maps [GPU][Codegen] Skip dimension expansion for ops with nonfusable indexing maps Dec 23, 2025
@efric efric marked this pull request as ready for review December 23, 2025 08:34
Comment on lines +463 to +472
%filled = linalg.fill ins(%cst : f16) outs(%empty : tensor<4xf16>) -> tensor<4xf16>
%result = linalg.generic {
indexing_maps = [
affine_map<(d0, d1) -> (d1, 0, 0)>,
affine_map<(d0, d1) -> (d0, d1)>,
affine_map<(d0, d1) -> (d0)>
], iterator_types = ["parallel", "reduction"]
} ins(%in0, %in1 : tensor<16384x1x1xf16>, tensor<4x16384xf16>)
outs(%filled : tensor<4xf16>) {
^bb0(%a: f16, %b: f16, %out: f16):
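The quoted hunk cuts off at the region's entry block. For context, a self-contained version of the test might look like the following; the multiply-accumulate body and the function wrapper are assumptions, since only lines +463 to +472 are shown above:

```mlir
func.func @nonfusable_maps(%in0: tensor<16384x1x1xf16>,
                           %in1: tensor<4x16384xf16>) -> tensor<4xf16> {
  %cst = arith.constant 0.000000e+00 : f16
  %empty = tensor.empty() : tensor<4xf16>
  %filled = linalg.fill ins(%cst : f16) outs(%empty : tensor<4xf16>) -> tensor<4xf16>
  %result = linalg.generic {
    indexing_maps = [
      // Constant results: not a projected permutation.
      affine_map<(d0, d1) -> (d1, 0, 0)>,
      affine_map<(d0, d1) -> (d0, d1)>,
      affine_map<(d0, d1) -> (d0)>
    ], iterator_types = ["parallel", "reduction"]
  } ins(%in0, %in1 : tensor<16384x1x1xf16>, tensor<4x16384xf16>)
    outs(%filled : tensor<4xf16>) {
  ^bb0(%a: f16, %b: f16, %out: f16):
    // Assumed body: a multiply-accumulate reduction.
    %mul = arith.mulf %a, %b : f16
    %acc = arith.addf %out, %mul : f16
    linalg.yield %acc : f16
  } -> tensor<4xf16>
  return %result : tensor<4xf16>
}
```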
Contributor

I think something else is wrong if we are getting these kinds of indexing maps in our codegen. There is a pass that runs which should remove constants in indexing maps. Can you mention the full example instead?
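If the unit-extent dims had been folded away as that pass intends, the operand would be collapsed and every map would be a projected permutation. A sketch of the expected form, reusing `%in0`, `%in1`, and `%filled` from the snippet above (the exact pass output is an assumption):

```mlir
// Unit dims of %in0 collapsed away; all maps become projected permutations.
%in0_c = tensor.collapse_shape %in0 [[0, 1, 2]]
    : tensor<16384x1x1xf16> into tensor<16384xf16>
%result = linalg.generic {
  indexing_maps = [
    affine_map<(d0, d1) -> (d1)>,
    affine_map<(d0, d1) -> (d0, d1)>,
    affine_map<(d0, d1) -> (d0)>
  ], iterator_types = ["parallel", "reduction"]
} ins(%in0_c, %in1 : tensor<16384xf16>, tensor<4x16384xf16>)
  outs(%filled : tensor<4xf16>) {
^bb0(%a: f16, %b: f16, %out: f16):
  %mul = arith.mulf %a, %b : f16
  %acc = arith.addf %out, %mul : f16
  linalg.yield %acc : f16
} -> tensor<4xf16>
```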

Member Author

Ah, makes sense. This was observed here https://github.com/iree-org/iree/actions/runs/20449877963/job/58761041931#step:8:166 but I will look more into it today.

Member Author

After digging into it more, I think the root issue comes from the most recent integrate #22943, likely involving the changes to `iree-dispatch-creation-fold-unit-extent-dims` from #22921. In particular, unit dims are not being properly folded away during GlobalOptimization. I think it didn't surface until after the dimension expansion PR got merged because, by luck, we hadn't run into cases where this was a problem. Filed an issue with reproduction steps here: #22978. Closing this PR.

@efric efric closed this Dec 24, 2025
efric added a commit that referenced this pull request Jan 19, 2026
…rmutation maps (#23200)

The `expand_dims` lowering config attribute relies on
`linalg::populateFoldReshapeOpsByExpansionPatterns` to fuse
`tensor.expand_shape`/`tensor.collapse_shape` into the linalg op in
`GPUExpandDimensions`. This requires all indexing maps to be projected
permutations. This patch adds a check to verify this before setting the
attribute. (Originally opened in #22955, but we do want this after all).

Fixes: #23185

---------

Signed-off-by: Eric Feng <[email protected]>
keshavvinayak01 pushed a commit that referenced this pull request Jan 27, 2026
…rmutation maps (#23200)