Reapply "[GPU] Allow multi result and indexing compute generic ops in TileAndFuse pipeline" (#22205) #22223

Merged: nirvedhmeshram merged 4 commits into iree-org:main from nirvedhmeshram:multi_result_indexing_taf on Oct 10, 2025
@nirvedhmeshram (Contributor) commented Oct 6, 2025

Supporting these cases exposed an unrelated issue with broadcasting dims from a producer of the consumer op, which is now fixed in this version of the PR. We won't distribute the broadcasted dims, as the producer can't be fused in that case.

Fixes : #22204
Fixes : #22175
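
As a rough illustration of the idea in the description above (not the code in ConfigUtils.cpp), the following toy Python model treats a loop dimension of the compute op as broadcasted when a producer's result does not index it, and leaves such dimensions undistributed. The function names, the use of plain dimension-index sets in place of indexing maps, and the convention that a tile size of 0 means "not distributed" are all assumptions made for this sketch.

```python
def broadcasted_dims(num_loops, producer_dims):
    """Loop dims of the compute op that the producer's result does not index."""
    return [d for d in range(num_loops) if d not in producer_dims]


def workgroup_tile_sizes(candidate_sizes, producer_dims):
    """Zero the tile size of every broadcasted dim.

    Assumption of this sketch: a tile size of 0 means "do not distribute".
    """
    skip = set(broadcasted_dims(len(candidate_sizes), producer_dims))
    return [0 if d in skip else size for d, size in enumerate(candidate_sizes)]


# Hypothetical 3-loop compute op whose producer only indexes loops 0 and 2,
# so loop 1 is the broadcasted dim and is left undistributed.
print(broadcasted_dims(3, {0, 2}))                  # [1]
print(workgroup_tile_sizes([64, 32, 128], {0, 2}))  # [64, 0, 128]
```

The point of the sketch is only that distribution is decided per loop dimension, so a dimension missing from the producer can be excluded without affecting the others.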

@qedawkins (Contributor) left a comment

This needs a stronger check that the broadcasted producer is returned from the executable. Otherwise this looks good.

Comment thread compiler/src/iree/compiler/Codegen/Dialect/GPU/TargetUtils/ConfigUtils.cpp Outdated
Comment thread compiler/src/iree/compiler/Codegen/Dialect/GPU/TargetUtils/ConfigUtils.cpp Outdated
… TilleAndFuse pipeline" (iree-org#22205)"

We now have a fix for the case where the compute op is broadcasted and has a producer without the
broadcasted dim. That case was causing an unrelated failure with the conditions we are relaxing in
this PR.

Signed-off-by: Nirvedh Meshram <nirvedh@gmail.com>
Comment thread compiler/src/iree/compiler/Codegen/Dialect/GPU/TargetUtils/ConfigUtils.cpp Outdated
@nirvedhmeshram force-pushed the multi_result_indexing_taf branch from 177288d to 4ed0164 on October 10, 2025 17:37
@nirvedhmeshram enabled auto-merge (squash) on October 10, 2025 17:37
Signed-off-by: Nirvedh Meshram <nirvedh@gmail.com>
@nirvedhmeshram merged commit 6701262 into iree-org:main on Oct 10, 2025
45 checks passed
@pdhirajkumarprasad commented

We have a compile regression for the llama model due to this change.

command:

iree-compile output.mlir --iree-hip-target=gfx942 -o tt.vmfb --iree-hal-target-device=hip --iree-opt-level=O3 --iree-hal-indirect-command-buffers=true --iree-stream-resource-memory-model=discrete --iree-hip-enable-tensor-ukernels --iree-stream-affinity-solver-max-iterations=1024 --iree-hal-memoization=true --iree-codegen-enable-default-tuning-specs=true --iree-llvmgpu-test-combine-layout-transformation=false

output.mlir.txt

@IanWood1 (Member) commented

@pdhirajkumarprasad which model variant is regressing?

@Groverkss (Contributor) commented Oct 13, 2025

@MaheshRavishankar (Collaborator) commented

@pdhirajkumarprasad If we need to revert, we will have to revert a few changes. A little more information about the failure would help.

@nirvedhmeshram / @IanWood1 can you see if ToM is failing with this?

@IanWood1 (Member) commented

Here are the failing executable sources from ToM (4efac5b): https://gist.github.com/IanWood1/9a6ee98c76a8ef186071a69c539fcefa

To reproduce, use iree-compile source.mlir -o /dev/null --compile-from=executable-sources --iree-llvmgpu-test-combine-layout-transformation=false on any of the sources.

--iree-llvmgpu-test-combine-layout-transformation=false seems to be related. Removing it fixes the compilation error. Why are we using a "test" flag?

@nirvedhmeshram (Contributor, Author) commented

OK, so the problem is that without that flag we end up with iree_tensor_ext.dispatch.tensor.store, but our codegen is now set up for iree_codegen.store_to_buffer. @Max191, is it OK to not use this flag? I would like to understand why the model people are using it.

@nirvedhmeshram (Contributor, Author) commented

OK, I have a fix over here: #22291

nirvedhmeshram added a commit that referenced this pull request Oct 13, 2025
…ducer (#22291)

#22223 only checked for iree_codegen.store_to_buffer

Signed-off-by: Nirvedh Meshram <nirvedh@gmail.com>
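
Read together with the discussion above, the commit message suggests that the "returned from the executable" check has to recognize more than one kind of store op. The following is a hedged toy sketch of such a check in Python; the real logic is C++ in ConfigUtils.cpp and is not shown in this thread. The helper name, the representation of users as a list of op-name strings, and the choice to accept exactly these two ops are assumptions; only the two op names themselves come from the conversation.

```python
# Op names mentioned in this thread. Treating exactly these two as
# "returned from the executable" is an assumption of this sketch.
STORE_OPS = frozenset({
    "iree_codegen.store_to_buffer",
    "iree_tensor_ext.dispatch.tensor.store",
})


def is_returned_from_executable(user_op_names, store_ops=STORE_OPS):
    """True if any user of the producer's result is one of the store ops."""
    return any(name in store_ops for name in user_op_names)


# A check that only knows about store_to_buffer misses the
# dispatch.tensor.store form; the widened set catches both.
only_buffer = frozenset({"iree_codegen.store_to_buffer"})
users = ["linalg.generic", "iree_tensor_ext.dispatch.tensor.store"]
print(is_returned_from_executable(users, only_buffer))  # False
print(is_returned_from_executable(users))               # True
```

Keeping the accepted ops in one set makes it easy to see which store forms the check covers, which is the gap the regression report exposed.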
Development

Successfully merging this pull request may close these issues.

[GPU] Llama 70b fp8 compilation regression
[GPU] RoPE + Scatter dispatch producing incorrect results

6 participants