Add QDQ scale propagation pass #713

javier-intel · 2025-06-16T23:38:30Z

Description

Adds a pass that propagates scale values whose magnitude exceeds a certain threshold, to avoid numerical overflow.
https://jira.devtools.intel.com/browse/CVS-170179

Motivation and Context

Improve precision on certain networks
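
To make the overflow concern in the description concrete, here is a minimal sketch, not the pass itself. It uses the ONNX DequantizeLinear formula, y = (x - zero_point) * scale, and checks whether the extreme uint16 values would leave the half-precision range. That the GPU path runs in fp16 is an assumption and is not stated in this PR; `kFp16Max`, `Dequantize` and `ScaleMayOverflowFp16` are hypothetical names.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Largest finite IEEE 754 half-precision value.
// ASSUMPTION: fp16 is the precision of concern; the PR does not say so.
constexpr float kFp16Max = 65504.0f;

// ONNX DequantizeLinear semantics: y = (x - zero_point) * scale.
inline float Dequantize(std::uint16_t q, std::uint16_t zero_point, float scale) {
  return (static_cast<float>(q) - static_cast<float>(zero_point)) * scale;
}

// Hypothetical check: does either end of the uint16 range dequantize to a
// magnitude that fp16 cannot represent as a finite number?
inline bool ScaleMayOverflowFp16(float scale, std::uint16_t zero_point) {
  assert(std::isfinite(scale));
  const float low = std::fabs(Dequantize(0, zero_point, scale));
  const float high = std::fabs(Dequantize(65535, zero_point, scale));
  return std::fmax(low, high) > kFp16Max;
}
```

With a zero point of 0, a scale of 2.0 maps the largest uint16 value to 131070, which is above 65504, while a scale of 0.5 stays within range. The actual threshold used by the pass is not given in this conversation.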

preetha-intel · 2025-06-18T16:30:04Z

onnxruntime/core/providers/openvino/backend_manager.cc

+  } else if (session_context_.device_type.find("GPU") != std::string::npos) {
+    // Create a copy of the model
+    std::unique_ptr<onnxruntime::Model> model;
+    Status status = qdq_scales_fix::Transform(subgraph, logger, model);


Does this pass also run for non-quantized models?

mklimenk · 2025-07-03

@preetha-intel, this pass runs only when the enable_qdq_optimizer flag is set.
Inside the pass, it looks specifically for quantized blocks with (u)int16 precision and ignores everything else, so regular models are not affected even if the flag is passed by accident.
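
The filtering described in this reply can be sketched as a simple type predicate. This is an illustration only: `ElemType`, `QuantNodeInfo` and `IsCandidateForScaleFix` are hypothetical names, not identifiers from the PR. The enum values mirror the ONNX `TensorProto::DataType` numbering.

```cpp
#include <cstdint>

// Values mirror the ONNX TensorProto::DataType enum.
enum class ElemType : std::int32_t {
  kFloat = 1,
  kUint8 = 2,
  kInt8 = 3,
  kUint16 = 4,
  kInt16 = 5,
};

// Hypothetical stand-in for the information the pass would read from a
// QuantizeLinear/DequantizeLinear node.
struct QuantNodeInfo {
  ElemType quantized_type;
};

// Only 16-bit quantized blocks are candidates; everything else is ignored,
// so float or 8-bit models pass through unchanged.
inline bool IsCandidateForScaleFix(const QuantNodeInfo& node) {
  return node.quantized_type == ElemType::kUint16 ||
         node.quantized_type == ElemType::kInt16;
}
```

Under this sketch, a model with no 16-bit Q/DQ nodes yields no candidates, which matches the statement that regular models are unaffected even when the flag is set.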

harihs1729 · 2025-07-02T17:45:39Z

Accuracy results from PSD model testing on GPU align with both NPU outputs and Microsoft’s expectations. Please proceed with merging this PR.

Copilot

Pull Request Overview

Adds a new pass to propagate and adjust quantization scales in QDQ (Quantize–Dequantize) pairs to avoid numerical overflow on large scale values. Key changes include:

Introduction of the qdq_scales_fix transformation pass (header, implementation, and protobuf utilities)
Invocation of the new scale‐propagation pass in backend_manager.cc for GPU when the QDQ optimizer is enabled
Build updates to link the ONNX protobuf definitions (onnx_proto) into the OpenVINO provider

Reviewed Changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated no comments.

| File | Description |
| --- | --- |
| onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.h | Declares the new `Transform` pass interface |
| onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp | Implements graph construction, scale propagation, and removal of QDQ pairs |
| onnxruntime/core/providers/openvino/ov_protobuf_utils.h | Adds helpers to get/set float data in protobuf tensors |
| onnxruntime/core/providers/openvino/ov_protobuf_utils.cpp | Defines `get_float_initializer_data` and `set_float_initializer_data` |
| onnxruntime/core/providers/openvino/backend_manager.cc | Calls the new scale‐fix pass for GPU under the QDQ optimizer |
| cmake/onnxruntime_providers_openvino.cmake | Links `onnx_proto` to the OpenVINO provider target |
| onnxruntime/core/optimizer/double_qdq_pairs_remover.cc | Fixes missing dimension in newly created initializer |
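
As background for the float-initializer helpers listed above: an ONNX tensor can carry its floats either in a typed `float_data` field or as packed bytes in `raw_data`, so a getter/setter pair typically has to handle both. The sketch below shows that idea on a mock struct. It is an assumption about what such helpers need to do, not the PR's implementation; `FloatTensor`, `GetFloatData` and `SetFloatData` are hypothetical names, and the real signatures of `get_float_initializer_data` and `set_float_initializer_data` are not shown in this conversation.

```cpp
#include <cstddef>
#include <cstring>
#include <string>
#include <vector>

// Hypothetical stand-in for the two places an ONNX TensorProto can hold floats.
struct FloatTensor {
  std::vector<float> float_data;  // typed field
  std::string raw_data;           // packed bytes (little-endian in ONNX)
};

// Reads floats from whichever field is populated.
// ASSUMPTION: the host is little-endian, so the bytes can be copied directly.
inline std::vector<float> GetFloatData(const FloatTensor& t) {
  if (t.raw_data.empty()) return t.float_data;
  std::vector<float> out(t.raw_data.size() / sizeof(float));
  if (!out.empty()) {
    std::memcpy(out.data(), t.raw_data.data(), out.size() * sizeof(float));
  }
  return out;
}

// Writes floats back into the field the tensor already uses, so the
// storage layout of the initializer does not change.
inline void SetFloatData(FloatTensor& t, const std::vector<float>& values) {
  if (t.raw_data.empty()) {
    t.float_data = values;
    return;
  }
  t.raw_data.assign(reinterpret_cast<const char*>(values.data()),
                    values.size() * sizeof(float));
}
```

Writing back into the same field that was read keeps a rescaled initializer in the layout the rest of the graph already expects.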

Comments suppressed due to low confidence (4)

onnxruntime/core/providers/openvino/backend_manager.cc:433

The modified branching drops the so_share_ep_contexts condition for GPU in the first branch and only checks enable_ovep_qdq_optimizer in the new GPU branch. Please verify that GPU behavior when so_share_ep_contexts is true but the optimizer flag is false is still correct.

  if ((session_context_.device_type.find("NPU") != std::string::npos) &&

onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.h:14

[nitpick] Add unit tests for the new Transform pass to cover scenarios with varying thresholds, different network topologies, and multiple QDQ patterns to guard against regressions.

Status Transform(const GraphViewer& src_graph,

onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp:14

The code uses std::format in ToString(), but <format> is not included. Add #include <format> to ensure compilation succeeds.

#include <algorithm>

onnxruntime/core/providers/openvino/qdq_transformations/qdq_scales_fix.cpp:67

[nitpick] Remove these commented-out placeholder lines (//** node_input_name = [], //** node_output_name = []) to clean up dead code and improve readability.

      //** node_input_name = []

sfatimar

This PR is being merged only because of urgency. It has not been properly code reviewed, so it may have some issues.

* Add pass to perform QDQ stripping and propagate scales
* Fix disconnected outptu node
* Fixes to support session.disable_quant_qdq output, remove dangling nodes and duplicate DQ nodes
* Fix lack of scales updates and remove stray QDQ nodes in certain models
* Address issues with Linux CI
* Fix for double QDQ issue

javier-intel requested a review from preetha-intel June 16, 2025 23:40

javier-intel force-pushed the jemartin/scale_propagation branch from 3d0ca12 to 4cb9374 Compare June 17, 2025 15:59

preetha-intel reviewed Jun 18, 2025

javier-intel requested a review from MayureshV1 June 24, 2025 16:37

javier-intel added 6 commits July 2, 2025 11:14

Add pass to perform QDQ stripping and propagate scales

05fcdce

Fix disconnected outptu node

19a5e7b

Fixes to support session.disable_quant_qdq output, remove dangling no…

66587aa

…des and duplicate DQ nodes

Fix lack of scales updates and remove stray QDQ nodes in certain models

304a8d2

Address issues with Linux CI

ce35466

Fix for double QDQ issue

e0cc75c

javier-intel force-pushed the jemartin/scale_propagation branch from 0281273 to e0cc75c Compare July 3, 2025 04:08

ankitm3k requested a review from Copilot July 3, 2025 09:45

Copilot AI reviewed Jul 3, 2025

sfatimar approved these changes Jul 3, 2025

sfatimar merged commit e2ec2b3 into ovep-develop Jul 3, 2025
6 of 8 checks passed

ankitm3k added a commit that referenced this pull request Jul 9, 2025

removed PR #713 changes

2a739ca
