
[QNN EP] Enable offline x64 compilation with memhandle IO type #27479

Merged
derdeljan-msft merged 5 commits into main from derdeljan/qnn_ep_memhandle_offline_compile
Mar 7, 2026

Conversation

@derdeljan-msft
Contributor

Description

Enable offline compilation for QNN EP with the MEMHANDLE IO type. It was previously enabled only on ARM because the QNN EP was loading the rpcmem library (available only with ARM drivers), which is not actually used for compilation; it is required only at inference time to allocate shared memory.

Ensured that the MEMHANDLE IO type is set correctly regardless of how QnnTensorWrapper is created (either through a factory function or by constructing it directly). This guarantees the mem type is configured correctly regardless of the op builder implementation.

@derdeljan-msft derdeljan-msft self-assigned this Feb 27, 2026
@yuslepukhin
Member

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows OpenVINO CI Pipeline, Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

Member

@yuslepukhin yuslepukhin left a comment


Need unit test.

Contributor

Copilot AI left a comment


Pull request overview

This PR enables offline x64 compilation for QNN EP with MEMHANDLE IO type by making two key changes: (1) deferring rpcmem library loading until inference time (since it's only needed for shared memory allocation, not compilation), and (2) centralizing the MEMHANDLE mem type assignment in AddTensorWrapper so that it's applied consistently regardless of how QnnTensorWrapper is created.

Changes:

  • Skip rpcmem library loading during context generation (when context_cache_enabled_ is true), enabling offline compilation on x64 where rpcmem is unavailable.
  • Move MEMHANDLE mem type assignment from MakeTensorWrapper to AddTensorWrapper, ensuring all tensor wrappers (whether created via factory methods or directly by op builders) get the correct mem type based on whether they are graph I/O tensors.
  • Update QnnTensorWrapper::Init to preserve the source tensor's mem type instead of unconditionally resetting to RAW.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated no comments.

| File | Description |
|------|-------------|
| onnxruntime/core/providers/qnn/qnn_execution_provider.cc | Guard rpcmem library loading with `!context_cache_enabled_` to allow offline compilation without the library |
| onnxruntime/core/providers/qnn/builder/qnn_model_wrapper.cc | Remove mem type logic from `MakeTensorWrapper` and centralize it in `AddTensorWrapper` |
| onnxruntime/core/providers/qnn/builder/qnn_def.h | Preserve source tensor's mem type in `QnnTensorWrapper::Init` instead of resetting to RAW |


Contributor

@github-actions github-actions bot left a comment


You can commit the suggested changes from lintrunner.

derdeljan-msft and others added 2 commits March 6, 2026 22:35
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
@yuslepukhin
Member

/azp run Linux QNN CI Pipeline, Win_TRT_Minimal_CUDA_Test_CI, Windows ARM64 QNN CI Pipeline, Windows GPU CUDA CI Pipeline, Windows GPU DML CI Pipeline, Windows GPU Doc Gen CI Pipeline, Windows GPU TensorRT CI Pipeline, Windows OpenVINO CI Pipeline, Windows x64 QNN CI Pipeline

@azure-pipelines

Azure Pipelines successfully started running 4 pipeline(s).

@derdeljan-msft derdeljan-msft enabled auto-merge (squash) March 6, 2026 22:56
@derdeljan-msft derdeljan-msft merged commit d626b56 into main Mar 7, 2026
91 checks passed
@derdeljan-msft derdeljan-msft deleted the derdeljan/qnn_ep_memhandle_offline_compile branch March 7, 2026 07:53
derdeljan-msft added a commit that referenced this pull request Mar 9, 2026
(cherry picked from commit d626b56)
derdeljan-msft added a commit that referenced this pull request Mar 12, 2026
(cherry picked from commit d626b56)
tianleiwu pushed a commit that referenced this pull request Mar 16, 2026
tianleiwu added a commit that referenced this pull request Mar 16, 2026
This cherry-picks the following commits for the release:

| Commit ID | PR Number | Commit Title |
|-----------|-----------|--------------|
| eb23be8 | #27354 | Update python_requires |
| d626b56 | #27479 | [QNN EP] Enable offline x64 compilation with memhandle IO type |
| 60ce0e6 | #27607 | Use `_tpause` instead of `__builtin_ia32_tpause` |
| 69feb84 | #27591 | Add PCI bus fallback for Linux GPU device discovery in containerized environments |
| de92668 | #27650 | Revert "[QNN EP] Fix error messages being logged as VERBOSE instead o… |
| 0f66526 | #27644 | [Plugin EP] Check for nullptr before dereferencing |
| 929f73e | #27666 | Plugin EP: Fix bug that incorrectly assigned duplicate MetDef IDs to fused nodes in different GraphViews |

---------

Co-authored-by: XXXXRT666 <157766680+XXXXRT666@users.noreply.github.com>
Co-authored-by: derdeljan-msft <derdeljan@microsoft.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: Shogo Yamazaki <f9ifphmiz7i8akhowc8l5t1x9qp0lfu4@mocknen.net>
Co-authored-by: Copilot <198982749+Copilot@users.noreply.github.com>
Co-authored-by: baijumeswani <12852605+baijumeswani@users.noreply.github.com>
Co-authored-by: edgchen1 <18449977+edgchen1@users.noreply.github.com>
Co-authored-by: Baiju Meswani <bmeswani@microsoft.com>
Co-authored-by: Artur Wojcik <artur.wojcik@amd.com>
Co-authored-by: Adrian Lizarraga <adlizarraga@microsoft.com>
