Merged
61 commits
1d5ca89
init
chilo-ms Jan 21, 2025
e9119d5
include GraphTransformerManager to GetCapability
chilo-ms Jan 26, 2025
b7a0b79
Add GraphTransformerManager for EP, optimization function and Compute…
chilo-ms Jan 26, 2025
3b28ffc
refine GraphTransformerManager for EP, optimization function and Comp…
chilo-ms Jan 28, 2025
309341e
TRT EP creates optimization compute capability
chilo-ms Jan 28, 2025
d0cbc65
add comments
chilo-ms Jan 28, 2025
b239db0
remove unnecessary code
chilo-ms Jan 28, 2025
a83dd11
remove commented code
chilo-ms Jan 29, 2025
372342c
add a function to include DQ that is filtered out by TRT parser
chilo-ms Jan 29, 2025
39fa897
add standalone GraphOptimizerRegistry as singleton
chilo-ms Feb 3, 2025
627a00a
remove redundant code
chilo-ms Feb 3, 2025
06ca086
remove redundant code
chilo-ms Feb 3, 2025
a965ffb
remove redundant code
chilo-ms Feb 3, 2025
4c2697c
add back function
chilo-ms Feb 4, 2025
2b81789
changed code per reviewer
chilo-ms Feb 4, 2025
0c10cd4
don't create optimizer instances until EP requests it by calling GetO…
chilo-ms Feb 6, 2025
3360dfd
minor modification
chilo-ms Feb 6, 2025
5f7da9f
fix compiler error
chilo-ms Feb 7, 2025
e610bc8
remove unnecessary member function
chilo-ms Feb 7, 2025
e95f2c3
lintrunner -a
chilo-ms Feb 7, 2025
bad19b9
handle status
chilo-ms Feb 7, 2025
d4968cb
remove unnecessary code
chilo-ms Feb 7, 2025
df5aca9
add GetMutableMetaDef
chilo-ms Feb 9, 2025
60d9599
update TRT EP
chilo-ms Feb 9, 2025
3c46897
refactor the code per reviewer's suggestions
chilo-ms Feb 18, 2025
ee35614
remove unnecessary code
chilo-ms Feb 18, 2025
958706e
fix format
chilo-ms Feb 18, 2025
5ebb117
use session logger for optimization function
chilo-ms Feb 18, 2025
08b85f9
add ORT_UNUSED_PARAMETER for param
chilo-ms Feb 18, 2025
1e0ae2e
fix compiler warnings/errors
chilo-ms Feb 18, 2025
4ee99b6
fix compiler warning/error
chilo-ms Feb 18, 2025
718ab98
fix compiler warnings/errors
chilo-ms Feb 18, 2025
644b837
run ConstantFoldingDQ only when trt_dla_enable is true
chilo-ms Feb 20, 2025
a2bfa09
Merge branch 'main' into chi/ort_enable_l2_plus_opt_for_ep
chilo-ms Feb 21, 2025
5b8bb7b
fix compiler error when resolving the conflicts
chilo-ms Feb 21, 2025
46e09d3
fix compiler error when resolving the conflicts
chilo-ms Feb 21, 2025
56f0f52
lintrunner -a
chilo-ms Feb 21, 2025
178ea22
update according to the review
chilo-ms Feb 28, 2025
9072d5f
fix compile error
chilo-ms Feb 28, 2025
947ed88
update
chilo-ms Feb 28, 2025
570fd16
update
chilo-ms Feb 28, 2025
8c96398
update
chilo-ms Feb 28, 2025
9eadda9
fix some minor compile errors
chilo-ms Feb 28, 2025
6d1f4f1
Replace Singleton implementation with instance per session
chilo-ms Mar 4, 2025
39cef51
fix compile error
chilo-ms Mar 4, 2025
908c426
update
chilo-ms Mar 4, 2025
881e07b
fix error
chilo-ms Mar 4, 2025
f2e9cd2
fix format
chilo-ms Mar 4, 2025
192a394
fix error
chilo-ms Mar 4, 2025
656b691
fix compile error
chilo-ms Mar 4, 2025
68e3f18
fix error
chilo-ms Mar 4, 2025
2abc99c
add graph_optimizer_registry.cc in minimal build
chilo-ms Mar 4, 2025
6a3e6b3
fix compile error for minimal build
chilo-ms Mar 4, 2025
ce1cb6d
fix compile error for extended minimal build
chilo-ms Mar 4, 2025
b5a8815
modified as per reviewer
chilo-ms Mar 6, 2025
1a219f7
fix compile error
chilo-ms Mar 6, 2025
060f67a
modify GraphOptimizerRegistry for minimal build
chilo-ms Mar 6, 2025
ad63f3c
remove unnecessary code
chilo-ms Mar 6, 2025
aff6114
update ifdef for extended minimal build
chilo-ms Mar 6, 2025
c14da28
update
chilo-ms Mar 6, 2025
2a2a5ec
define alias first
chilo-ms Mar 7, 2025
1 change: 1 addition & 0 deletions cmake/onnxruntime_optimizer.cmake
@@ -9,6 +9,7 @@ if (onnxruntime_MINIMAL_BUILD)
list(APPEND onnxruntime_optimizer_src_patterns
"${ONNXRUNTIME_INCLUDE_DIR}/core/optimizer/graph_transformer.h"
"${ONNXRUNTIME_ROOT}/core/optimizer/graph_transformer.cc"
"${ONNXRUNTIME_ROOT}/core/optimizer/graph_optimizer_registry.cc"
)

if (onnxruntime_EXTENDED_MINIMAL_BUILD)
16 changes: 16 additions & 0 deletions include/onnxruntime/core/framework/execution_provider.h
@@ -20,6 +20,7 @@ struct ComputeCapability;
class KernelRegistry;
struct KernelCreateInfo;
class Node;
class GraphOptimizerRegistry;
} // namespace onnxruntime
#else
#include <memory>
@@ -129,10 +130,25 @@ class IExecutionProvider {
and decide whether a node will be assigned to <*this> execution provider.
For kernels registered in a kernel registry, `kernel_lookup` must be used
to find a matching kernel for this EP.

The graph_optimizer_registry enables L2+ graph optimizations tailored to an EP.
These optimizations are applied after the graph partitioner assigns the ComputeCapability to the EP
and before the EP's "Compile" or fusion.

Steps to use graph_optimizer_registry and create the optimization ComputeCapability:
1. Look up the optimizer: The EP calls the provider bridge API to look up a predefined optimizer by name and get its selection function.
- Example: g_host->GetOptimizerByName(optimizer_name, graph_optimizer_registry, selection_func)
2. Run the selection function: The EP executes the selection function to obtain the selection ComputeCapability.
- The optimizer sets ComputeCapability.optimization_func to the function that performs the optimization.
3. Create the optimization ComputeCapability: The EP uses the selection ComputeCapability to create the optimization ComputeCapability.
4. Return the ComputeCapability: The EP returns the final ComputeCapability, with nodes_to_optimize set to the optimization ComputeCapability.

Note: For a detailed implementation that uses graph_optimizer_registry, refer to the TensorRT EP.
*/
virtual std::vector<std::unique_ptr<ComputeCapability>>
GetCapability(const onnxruntime::GraphViewer& graph_viewer,
const IKernelLookup& kernel_lookup,
const GraphOptimizerRegistry& graph_optimizer_registry,
IResourceAccountant* resource_accountant = nullptr) const;

/**
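The four steps in the GetCapability comment above can be sketched as follows. This is a minimal, self-contained illustration and not the ONNX Runtime implementation: every type in `namespace sketch` is a simplified stand-in, nodes are plain integers, `Register` and `MakeDemoRegistry` are hypothetical helpers, and the optimizer name `ConstantFoldingDQ` is borrowed from the commit list only as a label.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the ONNX Runtime types.
namespace sketch {

struct ComputeCapability {
  std::vector<int> node_indices;  // nodes covered by this capability
  // Set by the optimizer during selection; called later to perform the rewrite.
  std::function<bool(ComputeCapability& cc_to_update)> optimization_func;
  std::vector<std::unique_ptr<ComputeCapability>> nodes_to_optimize;
};

using SelectionFunc = std::function<std::vector<std::unique_ptr<ComputeCapability>>(
    const std::vector<int>& graph_nodes)>;

// Hypothetical registry mapping optimizer names to selection functions.
class GraphOptimizerRegistry {
 public:
  void Register(const std::string& name, SelectionFunc func) { funcs_[name] = std::move(func); }

  // Step 1: look up a predefined optimizer by name. Returns false if it is unknown.
  bool GetOptimizerByName(const std::string& name, SelectionFunc& out) const {
    auto it = funcs_.find(name);
    if (it == funcs_.end()) return false;
    out = it->second;
    return true;
  }

 private:
  std::unordered_map<std::string, SelectionFunc> funcs_;
};

// What an EP's GetCapability might do with the registry (assumed flow).
std::vector<std::unique_ptr<ComputeCapability>> GetCapability(
    const std::vector<int>& graph_nodes, const GraphOptimizerRegistry& registry) {
  // In this toy example the EP claims every node in the graph.
  auto final_cc = std::make_unique<ComputeCapability>();
  final_cc->node_indices = graph_nodes;

  SelectionFunc selection_func;
  if (registry.GetOptimizerByName("ConstantFoldingDQ", selection_func)) {  // step 1
    auto selections = selection_func(graph_nodes);                          // step 2
    for (auto& selection : selections) {
      // Step 3: build the optimization capability from the selection capability.
      auto optimization_cc = std::make_unique<ComputeCapability>();
      optimization_cc->node_indices = selection->node_indices;
      optimization_cc->optimization_func = selection->optimization_func;
      // Step 4: attach it to the capability that the EP returns.
      final_cc->nodes_to_optimize.push_back(std::move(optimization_cc));
    }
  }

  std::vector<std::unique_ptr<ComputeCapability>> result;
  result.push_back(std::move(final_cc));
  return result;
}

// Builds a registry whose toy optimizer selects the even-numbered nodes.
GraphOptimizerRegistry MakeDemoRegistry() {
  GraphOptimizerRegistry registry;
  registry.Register("ConstantFoldingDQ", [](const std::vector<int>& graph_nodes) {
    auto selection = std::make_unique<ComputeCapability>();
    for (int n : graph_nodes) {
      if (n % 2 == 0) selection->node_indices.push_back(n);
    }
    selection->optimization_func = [](ComputeCapability&) { return true; };
    std::vector<std::unique_ptr<ComputeCapability>> out;
    out.push_back(std::move(selection));
    return out;
  });
  return registry;
}

}  // namespace sketch
```

Calling `sketch::GetCapability({1, 2, 3, 4}, sketch::MakeDemoRegistry())` yields one capability covering all four nodes, with a single nested entry in `nodes_to_optimize` covering nodes 2 and 4. With an empty registry the lookup fails and the EP returns its capability with no optimizations attached.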
6 changes: 6 additions & 0 deletions include/onnxruntime/core/graph/indexed_sub_graph.h
@@ -72,6 +72,12 @@ struct IndexedSubGraph {
return meta_def_.get();
}

/** Gets the mutable meta definition needed to represent this subgraph as a FunctionProto.
@returns MetaDef instance if it has been set. nullptr if not. */
MetaDef* GetMutableMetaDef() {
return meta_def_.get();
}

// Check if the accounting is enabled for the current EP
bool IsAccountingEnabled() const {
return resource_accountant != nullptr &&
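One plausible use of the new GetMutableMetaDef accessor is to record an initializer that was constant folded during optimization. The sketch below assumes simplified stand-in types in `namespace sketch`; `AddConstantInitializer` is a hypothetical helper and is not part of this change.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the ONNX Runtime types.
namespace sketch {

struct MetaDef {
  std::string name;
  std::vector<std::string> constant_initializers;
};

class IndexedSubGraph {
 public:
  void SetMetaDef(std::unique_ptr<MetaDef> meta_def) { meta_def_ = std::move(meta_def); }

  // Read-only access: usable on const objects, cannot modify the MetaDef.
  const MetaDef* GetMetaDef() const { return meta_def_.get(); }

  // Mutable access: returns nullptr if no MetaDef has been set.
  MetaDef* GetMutableMetaDef() { return meta_def_.get(); }

 private:
  std::unique_ptr<MetaDef> meta_def_;
};

// Hypothetical helper: records a newly folded initializer.
// Returns false when the subgraph has no MetaDef to update.
bool AddConstantInitializer(IndexedSubGraph& sub_graph, const std::string& initializer_name) {
  MetaDef* meta_def = sub_graph.GetMutableMetaDef();
  if (meta_def == nullptr) return false;
  meta_def->constant_initializers.push_back(initializer_name);
  return true;
}

}  // namespace sketch
```

Keeping the const accessor alongside the mutable one lets read-only callers continue to work with `const IndexedSubGraph&`, while only code that holds a non-const reference can change the MetaDef.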
20 changes: 20 additions & 0 deletions onnxruntime/core/framework/compute_capability.h
@@ -2,8 +2,11 @@
// Licensed under the MIT License.

#pragma once
#include <functional>
#include "core/common/common.h"
#include "core/graph/indexed_sub_graph.h"
#include "core/graph/graph.h"
#include "core/optimizer/graph_optimizer_registry.h"

namespace onnxruntime {
// A structure that encodes a subgraph and the method to run it.
@@ -21,5 +24,22 @@

ComputeCapability(std::unique_ptr<IndexedSubGraph> t_sub_graph)
: sub_graph(std::move(t_sub_graph)) {}

// Optional function to optimize this ComputeCapability.
// This will be called by ORT once the ComputeCapability is assigned to the EP.
std::function<Status(Graph&,
const ComputeCapability& /* this_optimization */,
ComputeCapability& /* cc_to_update */,
const GraphOptimizerRegistry&)>
optimization_func;

// Optional ComputeCapability instances for sets of nodes within this ComputeCapability that should be optimized.
// When an optimization is applied, ORT will update this ComputeCapability to reflect the changes made.
// IndexedSubGraph.nodes:
// - updated based on RemoveNode/AddNode calls
// IndexedSubGraph.MetaDef (if present):
// - inputs and outputs will be unchanged
// - constant_initializers MAY change if we constant fold an initializer during optimization
std::vector<std::unique_ptr<ComputeCapability>> nodes_to_optimize;

Check warning on line 43 in onnxruntime/core/framework/compute_capability.h (GitHub Actions / Optional Lint C++)
[cpplint] Add #include <vector> for vector<> [build/include_what_you_use] [4]
};
} // namespace onnxruntime
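To make the relationship between optimization_func and nodes_to_optimize concrete, here is a minimal sketch of how a caller could walk the nested capabilities and apply each optimization. It assumes stand-in types in `namespace sketch`, uses `bool` in place of `Status`, and models nodes as integers; `ApplyOptimizations`, `RemoveSelectedNodes`, and `EraseNodes` are hypothetical names, not functions from this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <memory>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the ONNX Runtime types.
namespace sketch {

struct Graph {
  std::vector<int> nodes;
};

struct GraphOptimizerRegistry {};

struct ComputeCapability {
  std::vector<int> nodes;
  // Mirrors the shape of the new member, with bool standing in for Status.
  std::function<bool(Graph&,
                     const ComputeCapability& /* this_optimization */,
                     ComputeCapability& /* cc_to_update */,
                     const GraphOptimizerRegistry&)>
      optimization_func;
  std::vector<std::unique_ptr<ComputeCapability>> nodes_to_optimize;
};

// Removes every element of `victims` from `nodes`, in place.
void EraseNodes(std::vector<int>& nodes, const std::vector<int>& victims) {
  nodes.erase(std::remove_if(nodes.begin(), nodes.end(),
                             [&victims](int n) {
                               return std::find(victims.begin(), victims.end(), n) != victims.end();
                             }),
              nodes.end());
}

// Toy optimization: drops the selected nodes from the graph and from the
// enclosing capability so that both stay consistent.
bool RemoveSelectedNodes(Graph& graph,
                         const ComputeCapability& this_optimization,
                         ComputeCapability& cc_to_update,
                         const GraphOptimizerRegistry&) {
  EraseNodes(graph.nodes, this_optimization.nodes);
  EraseNodes(cc_to_update.nodes, this_optimization.nodes);
  return true;
}

// Runs each nested optimization against the enclosing capability.
// Entries without a function are skipped; the first failure stops the loop.
bool ApplyOptimizations(Graph& graph, ComputeCapability& capability,
                        const GraphOptimizerRegistry& registry) {
  for (const auto& optimization : capability.nodes_to_optimize) {
    if (!optimization->optimization_func) continue;
    if (!optimization->optimization_func(graph, *optimization, capability, registry)) return false;
  }
  return true;
}

}  // namespace sketch
```

The design point the sketch tries to capture is that the nested capability describes what to optimize, while the enclosing capability is passed as `cc_to_update` so its node list can follow the graph changes.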
1 change: 1 addition & 0 deletions onnxruntime/core/framework/execution_provider.cc
@@ -14,6 +14,7 @@ namespace onnxruntime {
std::vector<std::unique_ptr<ComputeCapability>>
IExecutionProvider::GetCapability(const onnxruntime::GraphViewer& graph,
const IKernelLookup& kernel_lookup,
const GraphOptimizerRegistry&,
IResourceAccountant*) const {
std::vector<std::unique_ptr<ComputeCapability>> result;
for (const auto& node : graph.Nodes()) {
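The base-class change above adds the registry as an unnamed parameter, so the default implementation ignores it while derived EPs can opt in. A minimal sketch of that pattern, assuming stand-in types in `namespace sketch` (the `Capability` struct and `OptimizingProvider` class are hypothetical):

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified stand-ins for illustration only; these are NOT the ONNX Runtime types.
namespace sketch {

struct GraphOptimizerRegistry {
  std::vector<std::string> optimizer_names;
};

struct Capability {
  std::vector<int> nodes;
  std::size_t optimizers_considered = 0;
};

class ExecutionProvider {
 public:
  virtual ~ExecutionProvider() = default;

  // Default behaviour: claim every node and ignore the registry.
  // Leaving the parameter unnamed avoids unused-parameter warnings.
  virtual Capability GetCapability(const std::vector<int>& graph_nodes,
                                   const GraphOptimizerRegistry&) const {
    Capability capability;
    capability.nodes = graph_nodes;
    return capability;
  }
};

class OptimizingProvider : public ExecutionProvider {
 public:
  // `override` makes the compiler reject this declaration if the base
  // signature changes again, instead of silently hiding the base method.
  Capability GetCapability(const std::vector<int>& graph_nodes,
                           const GraphOptimizerRegistry& registry) const override {
    Capability capability;
    capability.nodes = graph_nodes;
    capability.optimizers_considered = registry.optimizer_names.size();
    return capability;
  }
};

}  // namespace sketch
```

Because the parameter is added to a virtual function, every overriding EP has to update its signature as well; marking overrides with `override` turns a missed update into a compile error.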