
Modify scale & offset of WhereDummyDq #27109

Merged

edgchen1 merged 6 commits into microsoft:main from CodeLinaro:dev/qti-hungjuiw/where-acc on Apr 10, 2026

Conversation

@qti-hungjuiw (Contributor)

Description

  • Update WhereDummyDq QDQ transformer to be more selective before inserting a dummy DequantizeLinear around Where.
    • SatisfyCondition now requires the Where output to have exactly one consumer and that consumer must be QuantizeLinear (Q). Otherwise, the transform is skipped.
    • InsertDummyDQ additionally checks element type consistency between the upstream DQ input tensor type and the downstream Q output tensor type; if they differ, the transform returns without modifying the graph.
  • Update the implementation of WhereDummyDq to avoid negative or zero scale values. The change maps the float value to the boundary of the integer domain to ensure that the scale value is positive.
    • If the Where op gets a float scalar xf and a DequantizeLinear as its two inputs, WhereDummyDq inserts a DQ to ensure xf = DQ(xq, scale, zp)

    • The xq, scale, and zp values are determined according to the following table.

      | Case   | Param | uint8 | uint16 | int8 | int16  |
      |--------|-------|-------|--------|------|--------|
      | xf > 0 | xq    | 255   | 65535  | 127  | 32767  |
      | xf > 0 | zp    | 127   | 32767  | 0    | 0      |
      | xf < 0 | xq    | 0     | 0      | -128 | -32768 |
      | xf < 0 | zp    | 127   | 32767  | 0    | 0      |
      | xf = 0 | xq    | 127   | 32767  | 0    | 0      |
      | xf = 0 | zp    | 127   | 32767  | 0    | 0      |
    • scale = xf / (xq - zp) if xq != zp else 1
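The parameter selection above can be sketched as follows. This is an illustrative Python sketch, not the actual C++ implementation in where_dummy_dq.cc; the function name and lookup-table layout are hypothetical.

```python
# Sketch of the dummy quantization parameter selection described above.
# (qmin, qmax, zp) per quantized element type, following the table.
_QUANT_BOUNDS = {
    "uint8":  (0, 255, 127),
    "uint16": (0, 65535, 32767),
    "int8":   (-128, 127, 0),
    "int16":  (-32768, 32767, 0),
}

def dummy_quant_params(xf: float, qtype: str):
    """Return (xq, zp, scale) such that DQ(xq, scale, zp) == xf and scale > 0."""
    qmin, qmax, zp = _QUANT_BOUNDS[qtype]
    if xf > 0:
        xq = qmax   # map positive values to the upper integer boundary
    elif xf < 0:
        xq = qmin   # map negative values to the lower integer boundary
    else:
        xq = zp     # xf == 0: any positive scale works; pick scale = 1
    scale = xf / (xq - zp) if xq != zp else 1.0
    return xq, zp, scale
```

Since xq sits on the boundary on the same side of zp as the sign of xf, the quotient xf / (xq - zp) is always positive, which is the property the change is after.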

Motivation and Context

  • A negative or zero scale value is not handled well by various EPs and backends, such as QNN EP.
  • Inserting an additional DQ is only useful when it forms a valid QDQ “node unit” pattern. If the Where output is not followed by a single QuantizeLinear (e.g., multiple consumers or a non-Q consumer), adding a dummy DQ cannot create the intended pattern and may lead to non-fusible/undesired graph structures.
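The stricter condition described above can be sketched as a toy check. This models the graph with a minimal hypothetical Node class; the real SatisfyCondition uses ONNX Runtime's C++ Graph/Node APIs in where_dummy_dq.cc.

```python
# Toy sketch of the stricter SatisfyCondition pattern check.
from dataclasses import dataclass, field

@dataclass
class Node:
    op_type: str
    consumers: list = field(default_factory=list)  # nodes consuming this node's output

def satisfies_condition(where_node: Node) -> bool:
    """Insert a dummy DQ only if Where feeds exactly one QuantizeLinear."""
    if where_node.op_type != "Where":
        return False
    # The Where output must have exactly one consumer, and it must be Q;
    # otherwise no valid QDQ node unit can form, so the transform is skipped.
    return (len(where_node.consumers) == 1
            and where_node.consumers[0].op_type == "QuantizeLinear")
```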

@qti-hungjuiw (Contributor, Author)

Hi @edgchen1, this fix is for a critical model. Could you help trigger the CI and review the PR?

Comment thread on onnxruntime/core/optimizer/qdq_transformer/where_dummy_dq.cc

Copilot AI left a comment

Pull request overview

Updates the WhereDummyDq QDQ transformer to be more selective about when it inserts a dummy DequantizeLinear on Where scalar inputs, and adjusts the dummy quantization parameters to avoid non-positive scales (improving compatibility with EPs/backends).

Changes:

  • Tighten WhereDummyDq::SatisfyCondition to only apply when Where is immediately followed by a single QuantizeLinear.
  • Update dummy (xq, zp, scale) construction so computed scale is positive by mapping the float scalar to integer-domain boundaries.
  • Fix the unit test to use the correct zero-point type for the downstream QuantizeLinear.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 5 comments.

| File | Description |
|------|-------------|
| onnxruntime/test/optimizer/qdq_transformer_test.cc | Adjusts the WhereDummyDq test model to use ZpType for the QuantizeLinear zero-point so types match the test’s quantized dtype. |
| onnxruntime/core/optimizer/qdq_transformer/where_dummy_dq.h | Expands class documentation to describe the dummy quantization parameter selection and scale computation. |
| onnxruntime/core/optimizer/qdq_transformer/where_dummy_dq.cc | Implements stricter pattern checks and new dummy scale/zp/xq selection to keep the dummy scale positive. |


@qti-hungjuiw force-pushed the dev/qti-hungjuiw/where-acc branch from 0f4e785 to d8dc6a4 on March 23, 2026 at 10:55

adrianlizarraga commented Apr 1, 2026

Hi @qti-hungjuiw,

I think there are some merge conflicts with main. May have to sync your branch with latest main.

Also, there are still a couple copilot review comments that are unresolved. Could you please take a look at those? It looks like some of those may be valid.

Thank you for fixing this!

@qti-hungjuiw force-pushed the dev/qti-hungjuiw/where-acc branch from d8dc6a4 to 5f9e4ee on April 7, 2026 at 08:20
- Update WhereDummyDq QDQ transformer to be more selective before inserting a dummy DequantizeLinear around Where.
    - SatisfyCondition now requires the Where output to have exactly one consumer and that consumer must be QuantizeLinear (Q). Otherwise, the transform is skipped.
    - InsertDummyDQ additionally checks element type consistency between the upstream DQ input tensor type and the downstream Q output tensor type; if they differ, the transform returns without modifying the graph (e.g., DQ uses uint8 but Q uses uint16).
- Update the implementation of WhereDummyDq to avoid negative or zero scale values. The change maps the float value to the boundary of the integer domain to ensure the scale value is positive.
- Use HasTensorOrScalarShape to check the shape in SatisfyCondition.
- Check the return value of GetInitializedTensor in InsertDummyDQ.
@qti-hungjuiw force-pushed the dev/qti-hungjuiw/where-acc branch from 5f9e4ee to 2e5279b on April 7, 2026 at 09:32
@qti-hungjuiw (Contributor, Author)

> Hi @qti-hungjuiw,
>
> I think there are some merge conflicts with main. May have to sync your branch with latest main.
>
> Also, there are still a couple copilot review comments that are unresolved. Could you please take a look at those? It looks like some of those may be valid.
>
> Thank you for fixing this!

Hi @adrianlizarraga, sure, I've resolved the conflict and addressed copilot's comments. Please let me know if there are any issues.


edgchen1 commented Apr 8, 2026

/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@edgchen1 edgchen1 merged commit ce91376 into microsoft:main Apr 10, 2026
103 of 106 checks passed

5 participants