@qti-hungjuiw
Contributor

Description

  • Add a GraphTransformer, WhereDummyDq, that inserts a dummy DequantizeLinear node on a Where node's initializer input so that the Where node can form a Node Unit. It applies when the Where node has one DQ input and one scalar initializer input.
  • Add a corresponding unit test for the optimization
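
The matching condition described above can be sketched on a toy model of a Where node's two data inputs. This is only an illustration: `ToyInput` and `is_candidate` are hypothetical names, not onnxruntime APIs, and treating "scalar" as an empty shape is an assumption; the actual check in where_dummy_dq.cc may differ.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ToyInput:
    """Hypothetical stand-in for one data input (X or Y) of a Where node."""
    producer_op: Optional[str] = None  # op type of the producing node, None if there is none
    is_initializer: bool = False       # True when the input is a constant initializer
    shape: Tuple[int, ...] = ()        # assumption: () means scalar in this sketch


def is_candidate(x: ToyInput, y: ToyInput) -> bool:
    """True when one input comes from DequantizeLinear and the other is a scalar initializer."""
    def from_dq(t: ToyInput) -> bool:
        return t.producer_op == "DequantizeLinear"

    def scalar_init(t: ToyInput) -> bool:
        return t.is_initializer and t.shape == ()

    # The DQ input may be on either side of the Where node.
    return (from_dq(x) and scalar_init(y)) or (scalar_init(x) and from_dq(y))
```

In this sketch a Where node with two DQ inputs, or with a non-scalar initializer, is not a candidate, which mirrors the "one DQ and one scalar initializer input" wording of the description.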

Motivation and Context

  • To avoid the additional Dequantize and Quantize nodes around the Where node, we would like this pattern to pass WhereNodeGroupSelector::Check.

@qti-hungjuiw
Contributor Author

@microsoft-github-policy-service agree company="Qualcomm"

@qti-hungjuiw
Contributor Author

Hi, this PR aims to improve the performance of QDQ models by removing the additional QDQ nodes around the Where operator.

  • The current use case targets the QNN EP. Please let me know if any modification is required, such as adding compatible_execution_providers.

@HectorSVC requested a review from Copilot on July 30, 2025 17:37
Contributor

Copilot AI left a comment

Pull Request Overview

This PR adds a new WhereDummyDq graph transformer to insert dummy DequantizeLinear nodes on Where node's initializer inputs to form Node Units when the Where node has one DQ input and one scalar initializer input. This optimization helps reduce additional Dequantize and Quantize nodes by enabling the WhereNodeGroupSelector::Check to pass.

Key changes:

  • Added new WhereDummyDq transformer class that identifies Where nodes with mixed DQ and scalar inputs
  • Implemented logic to insert dummy DQ nodes with appropriate scale/zero-point values derived from existing DQ nodes
  • Integrated the transformer into the optimization pipeline at Level1
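
To illustrate the constraint a dummy DQ has to satisfy, here is a minimal sketch of the DequantizeLinear arithmetic for a single scalar. It assumes the dummy node must dequantize back to exactly the original float value. The function names and the particular choice of quantized value, scale, and zero-point are hypothetical; the thread says the real values are derived from the existing DQ node, and the exact rule is not shown here.

```python
from typing import Tuple


def dequantize(q: int, scale: float, zero_point: int) -> float:
    """Per-element DequantizeLinear arithmetic: (q - zero_point) * scale."""
    return (q - zero_point) * scale


def dummy_quant_params(value: float) -> Tuple[int, float, int]:
    """Pick (q, scale, zero_point) that dequantizes exactly to `value`.

    Assumed scheme (not taken from the PR): keep q and zero_point in {0, 1}
    so they fit any 8-bit or 16-bit integer type, and keep scale positive.
    """
    if value > 0:
        return 1, value, 0   # (1 - 0) * value
    if value < 0:
        return 0, -value, 1  # (0 - 1) * |value|
    return 0, 1.0, 0         # (0 - 0) * 1.0


q, scale, zp = dummy_quant_params(-3.25)
restored = dequantize(q, scale, zp)  # equals the original scalar
```

Because only a single element is quantized, the value can be represented without rounding error, which is what makes the inserted DQ a "dummy": it changes the graph structure, not the numerics.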

Reviewed Changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File | Description
--- | ---
where_dummy_dq.h | Header file defining the WhereDummyDq transformer class interface
where_dummy_dq.cc | Implementation of the transformer logic including condition checking and dummy DQ insertion
graph_transformer_utils.cc | Integration of WhereDummyDq transformer into the Level1 optimization pipeline
qdq_transformer_test.cc | Unit tests covering various scenarios for the WhereDummyDq transformer

@HectorSVC
Contributor

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 5 pipeline(s).

Contributor

@HectorSVC left a comment

:shipit:

@HectorSVC merged commit eade5fe into microsoft:main on Jul 31, 2025
90 of 93 checks passed
sanketkaleoss pushed a commit to sanketkaleoss/onnxruntime that referenced this pull request Aug 11, 2025

2 participants