Prevent cross-EP Cast fusion in RemoveDuplicateCastTransformer #27363

Merged
fs-eire merged 2 commits into main from fs-eire/fix-cast-fuse
Feb 20, 2026

fs-eire commented Feb 16, 2026

Description

Fixes an incorrect Cast deduplication across execution provider boundaries.
Currently, Cast(int64->float, CPU) -> Cast(float->float16, WebGPU) can be fused into Cast(int64->float16, WebGPU). The fused node is invalid because WebGPU does not support int64 inputs, so kernel lookup can fail.

This change adds an EP check so Cast fusion only happens when both nodes are on the same EP, and adds a regression test for this scenario.
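
The EP check described above can be sketched roughly as follows. This is a simplified, standalone model and not the actual ONNX Runtime code: `CastNode`, `OnSameExecutionProvider`, and `TryFuseCasts` are hypothetical names introduced for illustration, and the real transformer applies further conditions before fusing that are omitted here.

```cpp
#include <cassert>
#include <optional>
#include <string>

// Hypothetical, simplified stand-in for a graph Cast node.
// This is NOT the ONNX Runtime Node class.
struct CastNode {
  std::string from_type;  // input element type, e.g. "int64"
  std::string to_type;    // output element type, e.g. "float16"
  std::string ep;         // execution provider the node is assigned to
};

// Returns true only when both Casts are assigned to the same execution provider.
bool OnSameExecutionProvider(const CastNode& producer, const CastNode& consumer) {
  return producer.ep == consumer.ep;
}

// Tries to fuse producer -> consumer into a single Cast.
// Returns std::nullopt when the pair crosses an EP boundary, so both nodes stay as they are.
std::optional<CastNode> TryFuseCasts(const CastNode& producer, const CastNode& consumer) {
  // Precondition: the two nodes are actually chained.
  assert(producer.to_type == consumer.from_type);
  if (!OnSameExecutionProvider(producer, consumer)) {
    return std::nullopt;
  }
  // The fused node reads the producer's input type and runs on the shared EP.
  return CastNode{producer.from_type, consumer.to_type, consumer.ep};
}
```

With this model, a CPU Cast(int64->float) followed by a WebGPU Cast(float->float16) is left untouched, while two chained Casts on the same EP still collapse into one.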

Motivation and Context

Fixes #27291


Copilot AI left a comment

Pull request overview

This PR fixes an incorrect Cast node fusion across execution provider (EP) boundaries in the RemoveDuplicateCastTransformer. The bug allowed consecutive Cast nodes assigned to different EPs to be fused, resulting in a node with an input type unsupported by its EP (e.g., Cast(int64->float, CPU) -> Cast(float->float16, WebGPU) being fused into Cast(int64->float16, WebGPU), which fails because WebGPU doesn't support int64 inputs).

Changes:

  • Added EP boundary check in RemoveDuplicateCastTransformer to prevent cross-EP Cast fusion
  • Added regression test verifying Cast nodes on different EPs are not fused

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| onnxruntime/core/optimizer/insert_cast_transformer.cc | Added a cross-EP check before fusing Cast nodes, with detailed comments explaining the issue and solution |
| onnxruntime/test/framework/insert_cast_transformer_test.cc | Added a regression test that creates a CPU->WebGPU Cast chain and verifies that the nodes remain separate after transformation |
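
The shape of the regression scenario can be illustrated with a toy model. The sketch below is not the test added in this PR and does not use the ONNX Runtime test framework: `ToyCast`, `FuseCastChain`, and `CheckCrossEpChainIsPreserved` are hypothetical names, and the chain is modeled as a plain vector rather than a graph.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical toy node: element types and the assigned execution provider.
struct ToyCast {
  std::string from;
  std::string to;
  std::string ep;
};

// Collapses adjacent Casts in a linear chain, but only when both are on the same EP.
std::vector<ToyCast> FuseCastChain(const std::vector<ToyCast>& chain) {
  std::vector<ToyCast> fused;
  for (const ToyCast& node : chain) {
    const bool can_merge = !fused.empty() && fused.back().ep == node.ep;
    if (can_merge) {
      // Same EP: extend the previous Cast so it produces this node's output type.
      fused.back().to = node.to;
    } else {
      // First node, or an EP boundary: keep the node as a separate Cast.
      fused.push_back(node);
    }
  }
  return fused;
}

// Regression-style check: a CPU -> WebGPU chain must keep both Cast nodes.
void CheckCrossEpChainIsPreserved() {
  const std::vector<ToyCast> chain = {
      {"int64", "float", "CPUExecutionProvider"},
      {"float", "float16", "WebGpuExecutionProvider"},
  };
  const std::vector<ToyCast> fused = FuseCastChain(chain);
  assert(fused.size() == 2);
  // The WebGPU Cast still receives float, never int64.
  assert(fused[1].from == "float");
}
```

The key assertion is the node count: if the pass ever merged across the EP boundary again, the chain would shrink to one node and the check would fail.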

guschmue added the ep:WebGPU ort-web webgpu provider label Feb 20, 2026
fs-eire merged commit e27c1ed into main Feb 20, 2026
96 checks passed
fs-eire deleted the fs-eire/fix-cast-fuse branch February 20, 2026 23:56

Development

Successfully merging this pull request may close these issues.

[WebGPU] Failed to find kernel for Cast(13) for WebGpuExecutionProvider
