Conversation

@hanhanW hanhanW commented Jul 25, 2025

The revision adds an option to skip the root op in the LLVMCPUTile pass, and uses it in the multi-level tiling pipeline.

In the softmax dispatch there are two reduction ops. Since we switched to LLVMCPUTileRootAndFuseInputOperandsPass, only the root op is tiled along reduction dimensions, which results in large vector sizes in the other reduction op when `util.assume.hint` ops are present.

We did not hit the issue in e2e tests because the AnnotateDispatchAssumptions pass behaves differently there: the value range is [0, 0] if the input comes from `flow.tensor.dynamic_constant`.

Fixes #21359

@hanhanW hanhanW requested a review from egebeysel July 25, 2025 21:43
@hanhanW hanhanW requested a review from maxbartel July 25, 2025 21:51
Comment on lines 421 to 425
// Tile all the reduction ops for target vector sizes. It is a nop for
// rootOp because it is already tiled with the same tile sizes. It
// ensures that all the dimensions are tiled in all the reduction ops.
funcPassManager.addPass(
createLLVMCPUTilePass(static_cast<IREE::CPU::TilingLevel>(i)));
@hanhanW hanhanW (author) commented:
It is not a no-op if the dimension size is not divisible by the tile size, because tiling then produces dynamic shapes and the resulting for loops can't be folded away.

I think we need to provide an option to LLVMCPUTilePass that allows tiling only the non-root ops.

@maxbartel maxbartel left a comment:

Thanks @hanhanW! We will try this out on Monday and let you know whether we still see issues.

hanhanW commented Jul 28, 2025

This now depends on #21354 because we need a way to determine the root op. The needed patches will be propagated to IREE in #21510. Instead of adding tech debt, I'd rather wait a couple of days and land this after the switch.

@hanhanW hanhanW force-pushed the issue-21359 branch 2 times, most recently from b37133e to cd80291 on July 28, 2025 at 23:16
hanhanW added 3 commits July 29, 2025 10:43
Signed-off-by: hanhanW <[email protected]>
@hanhanW hanhanW merged commit c59f2d4 into iree-org:main Jul 29, 2025
44 checks passed
@hanhanW hanhanW deleted the issue-21359 branch July 29, 2025 18:52
hhkit pushed a commit to opencompl/iree that referenced this pull request Aug 8, 2025
keshavvinayak01 pushed a commit to keshavvinayak01/iree that referenced this pull request Sep 4, 2025
Successfully merging this pull request may close these issues:

[Codegen] compilation fails because of vector size verification error