Conversation

@hanhanW hanhanW commented Jul 25, 2025

The revision adds an option to skip the root op in the LLVMCPUTile pass, and uses it in the multi-level tiling pipeline.

In the softmax dispatch there are two reduction ops. Since we switched to LLVMCPUTileRootAndFuseInputOperandsPass, only the root op is tiled along reduction dimensions, which results in large vector sizes in the other reduction op when `util.assume.hint` ops are present.

We did not hit the issue in e2e tests because the AnnotateDispatchAssumptions pass behaves differently there: the value range is [0, 0] if the input comes from `flow.tensor.dynamic_constant`.

Fixes #21359

@hanhanW hanhanW requested a review from egebeysel July 25, 2025 21:43
@hanhanW hanhanW requested a review from maxbartel July 25, 2025 21:51
Comment on lines 421 to 425
// Tile all the reduction ops for target vector sizes. It is a nop for
// rootOp because it is already tiled with the same tile sizes. It
// ensures that all the dimensions are tiled in all the reduction ops.
funcPassManager.addPass(
createLLVMCPUTilePass(static_cast<IREE::CPU::TilingLevel>(i)));
@hanhanW hanhanW (author) commented:
It is not a no-op if the dimension size is not divisible by the tile size, because tiling then produces dynamic shapes and the resulting for loops can't be folded away.

I think we need to provide an option to LLVMCPUTilePass that allows tiling only the non-root ops.

@maxbartel maxbartel left a comment:

Thanks @hanhanW! We will try this out on Monday and let you know whether we still see issues.

hanhanW commented Jul 28, 2025

This now depends on #21354 because we need a way to determine the root op. The needed patches will be propagated to IREE in #21510. Instead of adding tech debt, I'd rather wait a couple of days and land this after the switch.

@hanhanW hanhanW force-pushed the issue-21359 branch 2 times, most recently from b37133e to cd80291 on July 28, 2025 at 23:16
hanhanW added 3 commits July 29, 2025 10:43
Signed-off-by: hanhanW <[email protected]>
@hanhanW hanhanW merged commit c59f2d4 into iree-org:main Jul 29, 2025
44 checks passed
@hanhanW hanhanW deleted the issue-21359 branch July 29, 2025 18:52
hhkit pushed a commit to opencompl/iree that referenced this pull request Aug 8, 2025
keshavvinayak01 pushed a commit to keshavvinayak01/iree that referenced this pull request Sep 4, 2025
Successfully merging this pull request may close these issues:

[Codegen] compilation fails because of vector size verification error