[GPU][Codegen] Skip dimension expansion for ops with nonfusable indexing maps #22955
Conversation
Signed-off-by: Eric Feng <[email protected]>
```mlir
%filled = linalg.fill ins(%cst : f16) outs(%empty : tensor<4xf16>) -> tensor<4xf16>
%result = linalg.generic {
  indexing_maps = [
    affine_map<(d0, d1) -> (d1, 0, 0)>,
    affine_map<(d0, d1) -> (d0, d1)>,
    affine_map<(d0, d1) -> (d0)>
  ], iterator_types = ["parallel", "reduction"]
} ins(%in0, %in1 : tensor<16384x1x1xf16>, tensor<4x16384xf16>)
  outs(%filled : tensor<4xf16>) {
^bb0(%a: f16, %b: f16, %out: f16):
```
I think something else is wrong if we are getting these kinds of indexing maps in our codegen. There is a pass that runs which should remove constants from indexing maps. Can you mention the full example instead?
Ah, makes sense. This was observed here https://github.com/iree-org/iree/actions/runs/20449877963/job/58761041931#step:8:166 but I will look more into it today.
After digging into it more, I think the root issue comes from the most recent integrate #22943, likely from the changes to `iree-dispatch-creation-fold-unit-extent-dims` in #22921. In particular, unit dims are not being properly folded away during GlobalOptimization. I think it didn't surface until after the dimension expansion PR was merged because, by luck, we hadn't run into cases where this was a problem. Filed an issue with reproduction steps here: #22978. Closing this PR.
…rmutation maps (#23200)

The `expand_dims` lowering config attribute relies on `linalg::populateFoldReshapeOpsByExpansionPatterns` to fuse `tensor.expand_shape`/`tensor.collapse_shape` into the linalg op in `GPUExpandDimensions`. This requires all indexing maps to be projected permutations. This patch adds a check to verify this before setting the attribute. (Originally opened in #22955, but we do want this after all.)

Fixes: #23185

Signed-off-by: Eric Feng <[email protected]>
The `expand_dims` lowering config attribute relies on `linalg::populateFoldReshapeOpsByExpansionPatterns` to fuse `tensor.expand_shape`/`tensor.collapse_shape` into the linalg op in `GPUExpandDimensions`. This requires all indexing maps to be projected permutations. This patch adds a check to verify this before setting the attribute. Not sure why this wasn't caught in the PR CI, but it fixes the compilation issue of `densenet-12` in `test_onnx_models::amdgpu_hip_rdna3`.
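For illustration, here is a rough sketch of the "projected permutation" property that the reshape-fusion patterns require: every result of an indexing map must be a distinct input dimension, with no constant results. This is a Python model of the concept, not IREE's actual C++ check (which uses MLIR's `AffineMap::isProjectedPermutation`); the helper name and the tuple encoding of map results are hypothetical.

```python
# Hypothetical model of MLIR's "projected permutation" property.
# Each affine-map result is encoded as ("dim", i) for dimension d<i>,
# or ("const", c) for a constant result such as the 0s in (d1, 0, 0).
def is_projected_permutation(num_dims, results):
    seen = set()
    for kind, value in results:
        if kind == "const":
            # Constant results disqualify the map, so expansion is skipped.
            return False
        # kind == "dim": must be a valid dimension not already used.
        if not (0 <= value < num_dims) or value in seen:
            return False
        seen.add(value)
    return True

# The three maps from the linalg.generic above:
print(is_projected_permutation(2, [("dim", 1), ("const", 0), ("const", 0)]))  # False
print(is_projected_permutation(2, [("dim", 0), ("dim", 1)]))                  # True
print(is_projected_permutation(2, [("dim", 0)]))                              # True
```

Under this model, the first input's map `(d0, d1) -> (d1, 0, 0)` fails the check, which is why the patch skips setting the `expand_dims` attribute for that op.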