[Codegen][MXFP4] Add folding patterns for tensor.empty op that can bypass SwizzleHintOps #23084

Merged
Muzammiluddin-Syed-ECE merged 11 commits into iree-org:main from Muzammiluddin-Syed-ECE:muzasyed/fold on Jan 16, 2026

@Muzammiluddin-Syed-ECE
Contributor

This is the second of a series of PRs that together implement support in IREE for XOR swizzling through the SwizzleHintOp.

There are four PRs that need to be merged:

  1. Allow rank > 1 swizzle hint op operands and add a pass to flatten swizzle hint allocs.
  2. Add patterns that fold reshape and `extract_slice` ops into `tensor.empty` ops through swizzle hint ops.
  3. Add a swizzle hint attribute that is set in `lowering_config` and consumed in `GPUPromoteMatmulOperandsPass`.
  4. Update the `LLVMGPUSelectLoweringStrategy` pass to set XOR swizzles for MXFP4 GEMMs.
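For readers unfamiliar with the term, the general idea behind XOR swizzling can be sketched as follows. This is a generic, textbook-style illustration and not the formula IREE uses; the function name `xor_swizzle` and its parameters are hypothetical. The stored column is the logical column XORed with the row index, so elements that share a logical column land in different physical columns from row to row.

```python
def xor_swizzle(row: int, col: int, num_cols: int) -> int:
    """Map a logical (row, col) to a physical column by XOR-ing with the row.

    Generic illustration only (assumption): num_cols must be a power of two
    so that the XOR result always stays inside [0, num_cols).
    """
    if num_cols <= 0 or num_cols & (num_cols - 1):
        raise ValueError("num_cols must be a positive power of two")
    return col ^ (row % num_cols)


def physical_columns_for(col: int, num_rows: int, num_cols: int) -> list:
    """Physical columns touched when reading one logical column down the rows."""
    return [xor_swizzle(row, col, num_cols) for row in range(num_rows)]
```

Within any single row the mapping is a permutation of the columns, so no data is lost; reading one logical column across `num_cols` consecutive rows touches every physical column exactly once.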

This is PR 2, which does two things:

  • Duplicates the upstream llvm-project folding patterns for the `tensor.empty` op in IREE, extended to look through swizzle hint ops.
  • Adds these patterns to `GPUApplyTilingPass`.
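The shape of such a fold can be sketched with a toy IR in Python. This is a minimal model under stated assumptions, not the PR's C++ implementation: the classes `Empty`, `SwizzleHint`, `ExtractSlice` and the function `fold_slice_of_hinted_empty` are hypothetical stand-ins, and the sketch assumes the hint is re-attached to the newly created empty tensor. The reasoning is that an empty tensor has no defined contents, so a slice of it can be replaced by a smaller empty tensor even when a swizzle hint sits in between.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class Empty:
    """Stand-in for tensor.empty: only the shape matters."""
    shape: Tuple[int, ...]


@dataclass(frozen=True)
class SwizzleHint:
    """Stand-in for a swizzle hint op wrapping another value."""
    operand: "Op"
    swizzle: str  # opaque placeholder for the swizzle attribute


@dataclass(frozen=True)
class ExtractSlice:
    """Stand-in for tensor.extract_slice (result sizes only, no rank reduction)."""
    source: "Op"
    sizes: Tuple[int, ...]


Op = Union[Empty, SwizzleHint, ExtractSlice]


def fold_slice_of_hinted_empty(op: Op) -> Optional[Op]:
    """Rewrite extract_slice(swizzle_hint(empty)) to swizzle_hint(empty(sizes)).

    Returns None when the pattern does not match, mirroring a rewrite
    pattern that reports failure.
    """
    if not isinstance(op, ExtractSlice):
        return None
    hint = op.source
    if not isinstance(hint, SwizzleHint) or not isinstance(hint.operand, Empty):
        return None
    if len(op.sizes) != len(hint.operand.shape):
        return None  # this toy model does not handle rank-reducing slices
    # Assumption: the hint is preserved on the new, smaller empty tensor.
    return SwizzleHint(Empty(op.sizes), hint.swizzle)
```

For example, `fold_slice_of_hinted_empty(ExtractSlice(SwizzleHint(Empty((64, 128)), "xor"), (16, 32)))` yields `SwizzleHint(Empty((16, 32)), "xor")`, while a slice taken directly from an `Empty` is left for the existing upstream-style pattern and returns `None` here.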

@Muzammiluddin-Syed-ECE
Contributor Author

Note that we can eventually upstream this change to llvm-project once the SwizzleHintOp is more widely used.

Contributor

@krzysz00 krzysz00 left a comment

I don't see anything wrong here but I'd like to wait for more comments

Member

@kuhar kuhar left a comment

Can you also add tests for the changes in gpu apply tiling level?

Signed-off-by: Muzammiluddin Syed <muzasyed@amd.com>
@Muzammiluddin-Syed-ECE Muzammiluddin-Syed-ECE merged commit a9a7b41 into iree-org:main Jan 16, 2026
52 of 54 checks passed
keshavvinayak01 pushed a commit that referenced this pull request Jan 27, 2026
…pass SwizzleHintOps (#23084)

Signed-off-by: Muzammiluddin Syed <muzasyed@amd.com>
Signed-off-by: Keshav Vinayak Jha <keshavvinayakjha@gmail.com>
MaheshRavishankar pushed a commit to MaheshRavishankar/iree that referenced this pull request Feb 24, 2026
…pass SwizzleHintOps (iree-org#23084)
