
Conversation

@sreekanth-yalachigere
Contributor

Filter data is constant across all iterations, but we currently reorder the filter memory layout in every iteration. By caching the reordered filter memory layout in the execution provider, we can avoid 53 data reorders and reduce latency by up to 10 ms.

@sreekanth-yalachigere requested a review from a team as a code owner on December 10, 2018 06:03
Member

@jywu-msft left a comment


Please address the review feedback, including thread safety and the non-unique key across models...

sreekanth-yalachigere and others added 23 commits December 17, 2018 12:57
* add check before fusing sub-graph in greedy partitioning

* update the partitioning logic to 1) not fuse a sub-graph if its inner nodes were already assigned, and 2) avoid resolving the graph after each provider capability check and assignment.

* resolve conflicts
* define gather_nd op

* add test cases

* add test file

* refactor the code and doc

* add test cases

* fix win compile err

* fix win compile err

* adjust indent

* make constructor explicit

* add comment

* remove templates

* remove wrong def

* migrate macros

* fix an issue in shape inference
Allow using MKLML header/libs when use_mklml is specified
* Fixed out of bounds access in ArrayFeatureExtractor.

* some cleanup

* Updated tensor_shape.h comments.

* Updated macro name.

* Added copy assignment, move assignment/ctor to TensorShape.

* Removed i64 literal suffix.

* Fixed test.

* Fixed type of x_num_dims.
* Minor updates to exception message

* update models folder to new location

* update copy to preservenewest
* More Ort prefix changes for consistency

* Fix C# methods

* More C# fixes
* update onnx
…t NodeArg usage. Allows an initializer from multiple levels up to be used without failing; otherwise we would need to accumulate a list of initializers from all parent levels, and doing so doesn't add any value. (#200)

Improve a comment to clarify when the parent graph NodeArg lookup kicks in.
* Adding the include folder for the C Windows pkg.

* Add import lib to the pkg

* Disable csharp pretrained tests temporarily
… free a re-used output that is used for a dead output (output with zero users). (#214)
- Apply any transforms to the main graph and any subgraphs first.
- Call Graph::Resolve() once on the main graph, which will recurse into the subgraphs.
  - Previously it was called after transforming each subgraph, which traversed up to the main graph to call Resolve, and that Resolve call recursed into all subgraphs every time.

This avoids many unnecessary Graph::Resolve calls and prevents subgraphs from being broken by SessionStateInitializer::InitializeAndSave calling graph_.CleanAllInitializedTensors() prior to the final Graph::Resolve call. If a subgraph had optional inputs, the backing initializers were removed by CleanAllInitializedTensors, causing the next Resolve to incorrectly turn them into required inputs. A minimal sketch of this ordering follows.
souptc and others added 25 commits December 19, 2018 18:17
* placeholder for internal contrib ops

* remove useless internal file

* fix build break
This helps identify floating-point accuracy issues
* Initial commit of the MaxUnpool operator

* fix gpu build failure

* remove op test from excluded list

* Change to ORT
* Minor updates to exception message

* update models folder to new location

* update copy to preservenewest

* reenable pretrained test

* added some debugging info for build

* update pretrained test, and tensor proto definition
* refactor the kernel memory type interface

* remove useless change

* fix comments in PR
* More intuitive ordering of the API functions

* Rename TCHAR_T
@sreekanth-yalachigere
Contributor Author

Creating a new PR.

TedThemistokleous pushed a commit to TedThemistokleous/onnxruntime that referenced this pull request Jul 9, 2025
The onnxruntime_add_shared_library_module() command places generated DLLs in the lib directory instead of the bin directory on Windows. This PR changes it to use onnxruntime_add_shared_library(). I checked most of the essential EPs, and they all use onnxruntime_add_shared_library() to create execution provider targets. The difference between the module and non-module versions of the command is that the module version does not install .lib files for the targets it creates.
TedThemistokleous added a commit to TedThemistokleous/onnxruntime that referenced this pull request Jul 9, 2025