[GPU] Inject index hints during MMA lane distribution#23152
Merged
Max191 merged 3 commits into iree-org:main on Jan 16, 2026
Wrap delinearized lane IDs with iree_codegen.index_hint ops during MMA operand distribution. This annotates indices with their lane-variance semantics, enabling downstream passes to recognize transpose load opportunities.

When distributing MMA operands across lanes, the lane ID is delinearized into row and column components. The row component is uniform within 16-lane groups (lane_constant<16>), while the column component increments across consecutive lanes (lane_increment<16>). These hints propagate through index arithmetic and are later consumed by the transpose load lowering pass.

This is the producer side of the transpose load optimization: hints are injected here and consumed by ROCDLLoadToTransposeLoadPass.

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
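To make the two hint kinds concrete, here is a minimal standalone sketch in plain C++ (no MLIR or IREE APIs). It assumes a 64-lane subgroup and a (4, 16) delinearization basis; both are illustrative assumptions, as are the names `LaneIndex`, `delinearizeLane`, `rowIsLaneConstant`, and `colIsLaneIncrement`, none of which come from the PR. It only checks the arithmetic properties the description attributes to lane_constant<16> and lane_increment<16>.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the delinearized lane ID (not an IREE type).
struct LaneIndex {
  int64_t row;
  int64_t col;
};

// Split a linear lane ID into (row, col), where `groupSize` is the innermost
// basis element (16 in the PR description).
LaneIndex delinearizeLane(int64_t laneId, int64_t groupSize) {
  assert(groupSize > 0 && "group size must be positive");
  return {laneId / groupSize, laneId % groupSize};
}

// lane_constant<groupSize>: every lane in an aligned group of `groupSize`
// lanes sees the same row value as the first lane of that group.
bool rowIsLaneConstant(int64_t subgroupSize, int64_t groupSize) {
  for (int64_t lane = 0; lane < subgroupSize; ++lane) {
    int64_t groupStart = lane - lane % groupSize;
    if (delinearizeLane(lane, groupSize).row !=
        delinearizeLane(groupStart, groupSize).row)
      return false;
  }
  return true;
}

// lane_increment<groupSize>: within an aligned group, the column value grows
// by exactly one from each lane to the next.
bool colIsLaneIncrement(int64_t subgroupSize, int64_t groupSize) {
  for (int64_t lane = 0; lane + 1 < subgroupSize; ++lane) {
    if ((lane + 1) % groupSize == 0)
      continue; // The next lane starts a new group; no constraint across it.
    if (delinearizeLane(lane + 1, groupSize).col !=
        delinearizeLane(lane, groupSize).col + 1)
      return false;
  }
  return true;
}
```

Under these assumptions, lane 37 delinearizes to row 2, column 5, and both predicates hold for `subgroupSize = 64`, `groupSize = 16`.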
261bf8f to 91c92b4
Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
kuhar reviewed on Jan 17, 2026
Comment on lines +801 to +802

    return SmallVector<Value>(delinearizedLaneId.begin(),
                              delinearizedLaneId.end());
prefer llvm::to_vector or llvm::to_vector_of<T>
Comment on lines +814 to +815

    for (size_t i = 1; i <= basis.size(); ++i) {
      groupSize = basis[basis.size() - i];
Do not recalculate the end
Suggested change

    - for (size_t i = 1; i <= basis.size(); ++i) {
    -   groupSize = basis[basis.size() - i];
    + for (size_t i = 1, e = basis.size(); i <= e; ++i) {
    +   groupSize = basis[e - i];
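For reference, the suggested loop shape can be tried in isolation. The sketch below uses `std::vector` instead of LLVM containers, and because the quoted diff shows only the first two lines of the loop, the exit condition (stop at the first non-unit entry) and the function name `innermostNonUnitSize` are assumptions made purely for illustration.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper: walk the basis from the innermost entry outward and
// return the first entry that is not 1, or 1 if every entry is 1 (or the
// basis is empty).
int64_t innermostNonUnitSize(const std::vector<int64_t> &basis) {
  int64_t groupSize = 1;
  // Cache the size in `e` so the loop bound is computed once, as suggested
  // in the review. Starting at i = 1 keeps `e - i` from underflowing.
  for (std::size_t i = 1, e = basis.size(); i <= e; ++i) {
    groupSize = basis[e - i];
    if (groupSize != 1)
      break; // Assumed exit condition; not shown in the quoted diff.
  }
  return groupSize;
}
```

Caching the bound in `e` follows the LLVM coding convention of not re-evaluating `end()` or `size()` on each iteration, and it signals to the reader that the container is not mutated inside the loop.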
keshavvinayak01 pushed a commit that referenced this pull request on Jan 27, 2026
Injects iree_codegen.index_hint ops on offsets in the populateOperandOffsetsSizesStrides functions for MMAAttrs. We inject the hints here, because the semantic information about the offsets is readily available, and can easily carry down to the later optimization pass that converts loads into transpose loads using these hints.

These hints are intended for load to transpose load optimizations, but they are set unconditionally regardless of transpositions for simplicity. The later optimization pass is responsible for determining when the loads are transposed, since it is more explicit at that point.

The hint ops will be dropped right after LLVMGPULowerExecutableTarget, since at that point the index_hint ops should already have been used. Currently, the pass that consumes these hint ops is not enabled, so the hint ops will be doing nothing until the pass is added.

---------

Signed-off-by: Max Dawkins <max.dawkins@gmail.com>
Signed-off-by: Keshav Vinayak Jha <keshavvinayakjha@gmail.com>
MaheshRavishankar pushed a commit to MaheshRavishankar/iree that referenced this pull request on Feb 24, 2026