[NV TRT RTX EP] Leverage ORT allocator for workspace allocations #25564
Conversation
@jywu-msft This PR is a blocker to running LLMs with the TRT-RTX EP. Can you please merge this?

@jywu-msft We would also like to have this for WinML GA. Could you please help cherry-pick it into the right branch?
Pull Request Overview
This PR leverages the ORT allocator for workspace allocations in the NVIDIA TensorRT RTX execution provider, significantly reducing memory usage for models with wide dynamic shape ranges. The change removes the previous context memory sharing mechanism and replaces it with dynamic allocation using ORT's allocator infrastructure.
Key changes include:
- Removal of the `context_memory_sharing_enable` configuration option and related infrastructure
- Implementation of dynamic context memory allocation using ORT's allocator with per-context memory management
- Addition of utility functions to detect dynamic shapes in TensorRT tensors (see the sketch below)
Reviewed Changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `nv_basic_test.cc` | Updated test configuration and corrected the model filename for the AutoEP test |
| `nv_execution_provider_utils.h` | Added utility functions for detecting dynamic shapes in TensorRT tensors |
| `nv_execution_provider_info.h` | Removed the `context_memory_sharing_enable` configuration option |
| `nv_execution_provider.h` | Updated `OutputAllocator` to use the ORT allocator and modified state structures for dynamic memory management |
| `nv_execution_provider.cc` | Implemented dynamic context memory allocation logic and removed the static memory-sharing code |
```cpp
class OutputAllocator : public nvinfer1::IOutputAllocator {
 public:
  OutputAllocator() = delete;
  OutputAllocator(OrtAllocator* allocator) : alloc_(allocator) {};
```
Copilot AI commented on Aug 2, 2025:
The semicolon after the closing brace is unnecessary for constructor definitions. Remove the semicolon.
```diff
- OutputAllocator(OrtAllocator* allocator) : alloc_(allocator) {};
+ OutputAllocator(OrtAllocator* allocator) : alloc_(allocator) {}
```
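For context, an `IOutputAllocator` that defers to the ORT allocator might look roughly like the sketch below. The class name, grow-only policy, and member names are assumptions for illustration, not the PR's implementation; `OrtAllocator` is the C-API struct whose `Alloc`/`Free` members are function pointers taking the allocator itself as the first argument.

```cpp
#include <NvInfer.h>
#include <onnxruntime_c_api.h>

// Sketch only: TensorRT output allocations delegated to an OrtAllocator.
class OrtBackedOutputAllocator : public nvinfer1::IOutputAllocator {
 public:
  explicit OrtBackedOutputAllocator(OrtAllocator* allocator) : alloc_(allocator) {}

  void* reallocateOutput(char const* /*tensor_name*/, void* current_memory,
                         uint64_t size, uint64_t /*alignment*/) noexcept override {
    if (size > allocated_size_) {
      if (current_memory != nullptr) {
        alloc_->Free(alloc_, current_memory);  // release the undersized buffer
      }
      buffer_ = alloc_->Alloc(alloc_, size);   // grow-only: never shrink
      allocated_size_ = size;
    }
    return buffer_;
  }

  void notifyShape(char const* /*tensor_name*/,
                   nvinfer1::Dims const& /*dims*/) noexcept override {}

 private:
  OrtAllocator* alloc_{nullptr};
  void* buffer_{nullptr};
  uint64_t allocated_size_{0};
};
```

Routing output buffers through the ORT allocator like this lets the EP benefit from ORT's arena and device-memory accounting instead of raw `cudaMalloc` calls.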
```cpp
if (trt_state->context_memory_size != mem_size) {
  LOGS_DEFAULT(INFO) << "[NvTensorRTRTX EP] A new context memory was allocated with size " << mem_size;
  trt_state->context_memory = IAllocator::MakeUniquePtrFromOrtAllocator<void>(alloc, mem_size, false /*use_reserve*/);
  // trt_state->context_memory = IAllocator::MakeUniquePtr<void>(alloc, mem_size, false /*use_reserve*/, stream);
```
Copilot AI commented on Aug 2, 2025:
This commented-out line should be removed as it appears to be leftover debug/alternative implementation code.
```diff
- // trt_state->context_memory = IAllocator::MakeUniquePtr<void>(alloc, mem_size, false /*use_reserve*/, stream);
```
I want to keep this as a TODO for an improvement coming soon that uses `AllocOnStream`.
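Putting the pieces together, the surrounding pattern in the snippet above pairs a per-shape size query with TensorRT's user-managed device memory. A minimal sketch, assuming TensorRT 10-style APIs (`updateDeviceMemorySizeForShapes`, `setDeviceMemoryV2`); the state struct and function are hypothetical stand-ins for the names in the snippet:

```cpp
#include <NvInfer.h>

// Hypothetical stand-in for the trt_state fields used in the snippet above.
struct ContextMemoryState {
  IAllocatorUniquePtr<void> context_memory;
  size_t context_memory_size = 0;
};

// Grow the workspace only when the requirement for the current input shapes
// changes, allocating through the ORT allocator.
void EnsureContextMemory(ContextMemoryState& state,
                         nvinfer1::IExecutionContext& context,
                         OrtAllocator* alloc) {
  const int64_t mem_size = context.updateDeviceMemorySizeForShapes();
  if (state.context_memory_size != static_cast<size_t>(mem_size)) {
    state.context_memory =
        IAllocator::MakeUniquePtrFromOrtAllocator<void>(alloc, mem_size, false /*use_reserve*/);
    state.context_memory_size = mem_size;
  }
  // Hand the buffer to the execution context before enqueueing.
  context.setDeviceMemoryV2(state.context_memory.get(), mem_size);
}
```

The TODO would presumably swap the allocation call for a stream-ordered variant built on `AllocOnStream`, so the buffer is allocated in stream order rather than synchronously.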
/azp run Linux QNN CI Pipeline,Win_TRT_Minimal_CUDA_Test_CI,Windows ARM64 QNN CI Pipeline,Windows GPU Doc Gen CI Pipeline,Windows x64 QNN CI Pipeline
Azure Pipelines successfully started running 5 pipeline(s).
The Windows runners seem to be stuck in the setup phase.

Restarted them.
skottmckay left a comment
@jywu-msft Since this has been somewhat delayed and has held us back from opening more branches that build on top of these changes, we came up with a cumulative merge branch: #25656.

Could you help update the PR description for that cumulative merge branch? I will review it.
@chilo-ms I updated the description and left some more comments.
Closing this PR since it's duplicated in #25656.
Description
This leverages the OrtAllocator for the intermediate workspace required to execute the TRT engine. With this change we are able to significantly reduce memory usage for models with wide dynamic shape ranges, as seen with ORT GenAI, since the workspace is sized for the shapes actually in use rather than held at a fixed size. For example, an LLM whose sequence length may range from 1 to several thousand tokens would otherwise pay for the worst case on every run.
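As a rough illustration of the scenario, a session that registers the EP by name might look like the sketch below. The provider name string and model path are assumptions for the sketch; check the EP's actual registration string.

```cpp
#include <onnxruntime_cxx_api.h>

int main() {
  Ort::Env env;
  Ort::SessionOptions so;
  // Provider name assumed here for illustration.
  so.AppendExecutionProvider("NvTensorRtRtx", {});
  // With this PR, the engine workspace is sized per inference from the actual
  // input shapes instead of a static allocation across the dynamic range.
  Ort::Session session(env, ORT_TSTR("model.onnx"), so);
  return 0;
}
```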
@jywu-msft @chilo-ms From our side, reviews on this are done.