2 changes: 1 addition & 1 deletion about.html
@@ -88,7 +88,7 @@ <h1 class="mb-3 blue-text">About</h1>
<div class="col-12 col-md-6 pr-10">
<h2>Optimization and acceleration</h2>
<p>
- Run any ONNX model using a single set of inference <a href="https://www.onnxruntime.ai/docs/api/" target="_blank" class="link"><abbr title="Application Program Interface">API</abbr>s</a> that provide access to the best hardware acceleration available. Built-in optimization features trim and consolidate nodes without impacting model accuracy. Additionally, full backwards <a href="https://www.onnxruntime.ai/docs/resources/compatibility.html" target="_blank" class="link">compatibility</a> for ONNX and ONNX-<abbr>ML</abbr> ensures all ONNX models can be inferenced.
+ Run any ONNX model using a single set of inference <a href="https://www.onnxruntime.ai/docs/api/" target="_blank" class="link"><abbr title="Application Program Interface">API</abbr>s</a> that provide access to the best hardware acceleration available. Built-in optimization features trim and consolidate nodes without impacting model accuracy. Additionally, full backwards <a href="https://www.onnxruntime.ai/docs/reference/compatibility.html" target="_blank" class="link">compatibility</a> for ONNX and ONNX-<abbr>ML</abbr> ensures all ONNX models can be inferenced.
</p>
</div>
</div>
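For context on the "single set of inference APIs" the paragraph above describes, a minimal sketch using the Python binding; the model path, input name, and shape below are placeholders, not files touched by this change:

```python
# Minimal ONNX Runtime inference sketch (illustrative only).
import numpy as np
import onnxruntime as ort

session = ort.InferenceSession("model.onnx")       # same call regardless of the underlying hardware
input_name = session.get_inputs()[0].name          # discover the model's declared input name
dummy = np.random.rand(1, 3, 224, 224).astype(np.float32)  # placeholder input shape
outputs = session.run(None, {input_name: dummy})   # None = return every model output
print([o.shape for o in outputs])
```

The same session object picks up whatever execution providers the installed package supports, which is the "best hardware acceleration available" claim in the page text.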
3 changes: 1 addition & 2 deletions docs/api/csharp-api.md
@@ -1,7 +1,6 @@
---
title: C# API
parent: API Docs
- nav_order: 2
+ nav_exclude: true
---

# C# API Reference
16 changes: 8 additions & 8 deletions docs/api/index.md
@@ -1,16 +1,16 @@
---
title: API Docs
has_children: true
nav_order: 5
---
# ORT API docs
{: .no_toc }

|:----------------------------------------------------------------------------------|
- | <span class="fs-5"> [Python API Docs](./python/api_summary.html){: .btn } </span> |
- | <span class="fs-5"> [Java API Docs](./java/index.html){: .btn} </span> |
- | <span class="fs-5"> [Objective-C Docs](./objectivec/index.html){: .btn} </span> |
- | <span class="fs-5"> [WinRT API Docs](https://docs.microsoft.com/windows/ai/windows-ml/api-reference){: .btn} </span>|
- | <span class="fs-5"> [C# API Docs](./csharp-api){: .btn} </span>|
- | <span class="fs-5"> [JavaScript API Docs](./js/index.html){: .btn} </span>|
- | <span class="fs-5"> [Other API Docs](./other-apis){: .btn} </span>|
+ | <span class="fs-5"> [Python API Docs](https://onnxruntime.ai/docs/api/python/api_summary.html){: .btn target="_blank"} </span> |
+ | <span class="fs-5"> [Java API Docs](https://onnxruntime.ai/docs/api/java/index.html){: .btn target="_blank"} </span> |
+ | <span class="fs-5"> [C# API Docs](./csharp-api){: .btn target="_blank"} </span>|
+ | <span class="fs-5"> [C/C++ API Docs](https://onnxruntime.ai/docs/api/c/){: .btn target="_blank"} </span>|
+ | <span class="fs-5"> [WinRT API Docs](https://docs.microsoft.com/en-us/windows/ai/windows-ml/api-reference){: .btn target="_blank"} </span>|
+ | <span class="fs-5"> [Objective-C Docs](https://onnxruntime.ai/docs/api/objectivec/index.html){: .btn target="_blank"} </span> |
+ | <span class="fs-5"> [JavaScript API Docs](https://onnxruntime.ai/docs/api/js/index.html){: .btn target="_blank"} </span>|
+ | <span class="fs-5"> [Other API Docs](./other-apis){: .btn target="_blank"} </span>|
10 changes: 0 additions & 10 deletions docs/api/java-api.md

This file was deleted.

10 changes: 0 additions & 10 deletions docs/api/js-api.md

This file was deleted.

10 changes: 0 additions & 10 deletions docs/api/objectivec-api.md

This file was deleted.

3 changes: 1 addition & 2 deletions docs/api/other-apis.md
@@ -1,7 +1,6 @@
---
title: Other Inference APIs
parent: API Docs
- nav_order: 7
+ nav_exclude: true
---

# Other APIs
11 changes: 0 additions & 11 deletions docs/api/python-api.md

This file was deleted.

10 changes: 0 additions & 10 deletions docs/api/winrt-api.md

This file was deleted.

2 changes: 1 addition & 1 deletion docs/build/eps.md
@@ -188,7 +188,7 @@ These instructions are for JetPack SDK 4.6.

## oneDNN

- See more information on ondDNN (formerly DNNL) [here](../execution-providers/DNNL-ExecutionProvider.md).
+ See more information on ondDNN (formerly DNNL) [here](../execution-providers/oneDNN-ExecutionProvider.md).

### Build Instructions
{: .no_toc }
2 changes: 1 addition & 1 deletion docs/build/inferencing.md
@@ -58,7 +58,7 @@ Also, if you want to cross-compile for Apple Silicon in an Intel-based MacOS mac
#### Notes

* Please note that these instructions build the debug build, which may have performance tradeoffs
- * To build the version from each release (which include Windows, Linux, and Mac variants), see these .yml files for reference: [CPU](https://github.com/microsoft/onnxruntime/blob/master/tools/ci_build/github/azure-pipelines/nuget/cpu-esrp-pipeline.yml), [GPU](https://github.com/microsoft/onnxruntime/blob/master/tools/ci_build/github/azure-pipelines/nuget/gpu-esrp-pipeline.yml)
+ * To build the version from each release (which include Windows, Linux, and Mac variants), see these [.yml files](https://github.com/microsoft/onnxruntime/tree/master/tools/ci_build/github/azure-pipelines/nuget) for reference
* The build script runs all unit tests by default for native builds and skips tests by default for cross-compiled builds.
To skip the tests, run with `--build` or `--update --build`.
* If you need to install protobuf 3.6.1 from source code (cmake/external/protobuf), please note:
2 changes: 1 addition & 1 deletion docs/build/reduced.md
@@ -11,7 +11,7 @@ For applications where package binary size is important, ONNX Runtime provides o

To reduce the compiled binary size of ONNX Runtime, the operator kernels included in the build can be reduced to just the kernels required by your model/s.

- For deployment on mobile devices specifically, please read more detailed guidance on [How to: Build for mobile](./mobile.md).
+ For deployment on mobile devices specifically, please read more detailed guidance on [Deploy ONNX Runtime Mobile](../tutorials/mobile/).

## Contents
{: .no_toc }
4 changes: 2 additions & 2 deletions docs/ecosystem/index.md
@@ -19,7 +19,7 @@ ONNX Runtime functions as part of an ecosystem of tools and platforms to deliver
* [Azure Container Instance: Facial Expression Recognition](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-inference-facial-expression-recognition-deploy.ipynb){:target="_blank"}
* [Azure Container Instance: MNIST](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-inference-mnist-deploy.ipynb){:target="_blank"}
* [Azure Container Instance: Image classification (Resnet)](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/deployment/onnx/onnx-modelzoo-aml-deploy-resnet50.ipynb){:target="_blank"}
- * [Azure Kubernetes Services: FER+](https://github.com/microsoft/onnxruntime/tree/master/docs/python/notebooks/onnx-inference-byoc-gpu-cpu-aks.ipynb){:target="_blank"}
+ * [Azure Kubernetes Services: FER+](https://github.com/microsoft/onnxruntime/blob/master/docs/python/inference/notebooks/onnx-inference-byoc-gpu-cpu-aks.ipynb){:target="_blank"}
* [Azure IoT Edge (Intel UP2 device with OpenVINO)](https://github.com/Azure-Samples/onnxruntime-iot-edge/blob/master/AzureML-OpenVINO/README.md){:target="_blank"}
* [Automated Machine Learning](https://github.com/Azure/MachineLearningNotebooks/blob/master/how-to-use-azureml/automated-machine-learning/classification-bank-marketing-all-features/auto-ml-classification-bank-marketing-all-features.ipynb){:target="_blank"}

@@ -31,7 +31,7 @@ ONNX Runtime functions as part of an ecosystem of tools and platforms to deliver
* [Azure Video Analytics: YOLOv3 and TinyYOLOv3](https://github.com/Azure/live-video-analytics/tree/master/utilities/video-analysis/yolov3-onnx){:target="_blank"}

## Azure SQL Edge
- * [ML predictions in Azure SQL Edge and Azure SQL Managed Instance](https://docs.microsoft.com/en-us/azure/azure-sql-edge/deploy-onnxJ){:target="_blank"}
+ * [ML predictions in Azure SQL Edge and Azure SQL Managed Instance](https://docs.microsoft.com/en-us/azure/azure-sql-edge/deploy-onnx){:target="_blank"}

## Azure Synapse Analytics
* [ML predictions in Synapse SQL](https://docs.microsoft.com/en-us/azure/synapse-analytics/sql-data-warehouse/sql-data-warehouse-predict){:target="_blank"}
6 changes: 3 additions & 3 deletions docs/execution-providers/ArmNN-ExecutionProvider.md
@@ -16,7 +16,7 @@ nav_order: 3
[ArmNN](https://github.com/ARM-software/armnn) is an open source inference engine maintained by Arm and Linaro companies. The integration of ArmNN as an execution provider (EP) into ONNX Runtime accelerates performance of ONNX model workloads across Armv8 cores.

## Build
- For build instructions, please see the [BUILD page](./build/eps.md#armnn).
+ For build instructions, please see the [BUILD page](../build/eps.md#armnn).

## Usage
### C/C++
@@ -27,9 +27,9 @@ Ort::SessionOptions so;
bool enable_cpu_mem_arena = true;
Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_ArmNN(so, enable_cpu_mem_arena));
```
- The C API details are [here](./get-started/with-c.html.md).
+ The C API details are [here](../get-started/with-c.md).

## Performance Tuning
- For performance tuning, please see guidance on this page: [ONNX Runtime Perf Tuning](./performance/tune-performance.md)
+ For performance tuning, please see guidance on this page: [ONNX Runtime Perf Tuning](../performance/tune-performance.md)

When/if using [onnxruntime_perf_test](https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/test/perftest), use the flag -e armnn
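As a quick way to confirm the ArmNN EP was actually compiled in before running `onnxruntime_perf_test`, the Python binding can list the registered providers; a hedged sketch, assuming the provider registers under the name `ArmNNExecutionProvider`:

```python
# Sanity-check a custom ArmNN-enabled build of ONNX Runtime.
import onnxruntime as ort

providers = ort.get_available_providers()
print(providers)
assert "ArmNNExecutionProvider" in providers, "ArmNN EP not present in this build"
```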
3 changes: 2 additions & 1 deletion docs/execution-providers/CUDA-ExecutionProvider.md
@@ -17,14 +17,15 @@ The CUDA Execution Provider enables hardware accelerated computation on Nvidia C
{:toc}

## Install
- Pre-built binaries of ONNX Runtime with CUDA EP are published for most language bindings. Please reference [How to - Install ORT](../install.html#inference).
+ Pre-built binaries of ONNX Runtime with CUDA EP are published for most language bindings. Please reference [Install ORT](../install).


## Requirements
Please reference table below for official GPU packages dependencies for the ONNX Runtime inferencing package. Note that ONNX Runtime Training is aligned with PyTorch CUDA versions; refer to the Training tab on https://onnxruntime.ai/ for supported versions.

|ONNX Runtime|CUDA|cuDNN|Notes|
|---|---|---|---|
+ |1.9|11.4|8.2.4 (Linux)<br/>8.2.2.26 (Windows)|libcudart 11.4.43<br/>libcufft 10.5.2.100<br/>libcurand 10.2.5.120<br/>libcublasLt 11.6.1.51<br/>libcublas 11.6.1.51<br/>libcudnn 8.2.4<br/>libcupti.so 2021.2.2|
|1.8|11.0.3|8.0.4 (Linux)<br/>8.0.2.39 (Windows)|libcudart 11.0.221<br/>libcufft 10.2.1.245<br/>libcurand 10.2.1.245<br/>libcublasLt 11.2.0.252<br/>libcublas 11.2.0.252<br/>libcudnn 8.0.4<br/>libcupti.so 2020.1.1|
|1.7|11.0.3|8.0.4 (Linux)<br/>8.0.2.39 (Windows)|libcudart 11.0.221<br/>libcufft 10.2.1.245<br/>libcurand 10.2.1.245<br/>libcublasLt 11.2.0.252<br/>libcublas 11.2.0.252<br/>libcudnn 8.0.4|
|1.5-1.6|10.2|8.0.3|CUDA 11 can be built from source|
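To complement the install and requirements changes above, a hedged sketch of selecting the CUDA EP from Python with a CPU fallback; the model path and provider options are illustrative, and the installed package must match the CUDA/cuDNN versions in the table:

```python
# Create a session that prefers CUDA and falls back to CPU (requires onnxruntime-gpu).
import onnxruntime as ort

providers = [
    ("CUDAExecutionProvider", {"device_id": 0}),  # provider options are optional
    "CPUExecutionProvider",
]
session = ort.InferenceSession("model.onnx", providers=providers)  # placeholder model path
print(session.get_providers())  # shows which EPs were actually registered
```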
2 changes: 1 addition & 1 deletion docs/execution-providers/DirectML-ExecutionProvider.md
@@ -22,7 +22,7 @@ The DirectML Execution Provider currently uses DirectML version 1.4.2.
{:toc}

## Install
- Pre-built packages of ORT with the DirectML EP is published on Nuget.org. See [How to: Install ORT](./install).
+ Pre-built packages of ORT with the DirectML EP is published on Nuget.org. See: [Install ORT](../install).

## Requirements

4 changes: 2 additions & 2 deletions docs/execution-providers/MIGraphX-ExecutionProvider.md
@@ -31,12 +31,12 @@ Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_MiGraphX(sf, device_i

You can check [here](https://github.com/scxiao/ort_test/tree/master/char_rnn) for a specific c/c++ program.

- The C API details are [here](../get-started/with-c.html.md).
+ The C API details are [here](../get-started/with-c.md).

### Python
When using the Python wheel from the ONNX Runtime build with MIGraphX execution provider, it will be automatically
prioritized over the default GPU or CPU execution providers. There is no need to separately register the execution
- provider. Python APIs details are [here](/python/api_summary).
+ provider. Python APIs details are [here](https://onnxruntime.ai/docs/api/python/api_summary.html).
*Note that the next release (ORT 1.10) will require explicitly setting the providers parameter if you want to use execution provider other than the default CPU provider when instantiating InferenceSession.*

You can check [here](https://github.com/scxiao/ort_test/tree/master/python/run_onnx) for a python script to run an
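Given the note above that ORT 1.10 will require an explicit `providers` argument, a hedged sketch of requesting the MIGraphX EP explicitly rather than relying on automatic prioritization; the provider name `MIGraphXExecutionProvider` and the model path are assumptions for illustration:

```python
# Explicitly request MIGraphX with a CPU fallback.
import onnxruntime as ort

session = ort.InferenceSession(
    "model.onnx",  # placeholder path
    providers=["MIGraphXExecutionProvider", "CPUExecutionProvider"],
)
print(session.get_providers())
```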
15 changes: 13 additions & 2 deletions docs/execution-providers/TensorRT-ExecutionProvider.md
@@ -20,6 +20,17 @@ With the TensorRT execution provider, the ONNX Runtime delivers better inferenci
## Install
Pre-built packages and Docker images are available for Jetpack in the [Jetson Zoo](https://elinux.org/Jetson_Zoo#ONNX_Runtime).

+ ## Requirements
+ 
+ |ONNX Runtime|TensorRT|CUDA|
+ |---|---|---|
+ |1.9|8.0|11.4|
+ |1.7-1.8|7.2|11.0.3|
+ |1.5-1.6|7.1|10.2|
+ |1.2-1.4|7.0|10.1|
+ |1.0-1.1|6.0|10.0|
+ 
+ For more details on CUDA/cuDNN versions, please see [CUDA EP requirements](./CUDA-ExecutionProvider.md#requirements).

## Build

@@ -38,7 +49,7 @@ Ort::ThrowOnError(OrtSessionOptionsAppendExecutionProvider_CUDA(sf, device_id));
Ort::Session session(env, model_path, sf);
```

- The C API details are [here](../get-started/with-c.html.md).
+ The C API details are [here](../get-started/with-c.md).

#### Shape Inference for TensorRT Subgraphs
If some operators in the model are not supported by TensorRT, ONNX Runtime will partition the graph and only send supported subgraphs to TensorRT execution provider. Because TensorRT requires that all inputs of the subgraphs have shape specified, ONNX Runtime will throw error if there is no input shape info. In this case please run shape inference for the entire model first by running script [here](https://github.com/microsoft/onnxruntime/blob/master/onnxruntime/python/tools/symbolic_shape_infer.py).
@@ -175,7 +186,7 @@ sess.set_providers(["TensorrtExecutionProvider"],[{'device_id': '1', 'trt_max_wo
```

## Performance Tuning
- For performance tuning, please see guidance on this page: [ONNX Runtime Perf Tuning](../tutorials/mobile/tune-performance.md)
+ For performance tuning, please see guidance on this page: [ONNX Runtime Perf Tuning](../performance/tune-performance.md)

When/if using [onnxruntime_perf_test](https://github.com/microsoft/onnxruntime/tree/master/onnxruntime/test/perftest#onnxruntime-performance-test), use the flag `-e tensorrt`

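Expanding on the `sess.set_providers` call visible in the hunk header above, a hedged sketch of re-registering the TensorRT EP with explicit provider options; the option values, and the `trt_fp16_enable` flag, are illustrative and should be checked against the TensorRT EP option list:

```python
# Re-register the TensorRT EP on an existing session with explicit options.
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")  # placeholder path
sess.set_providers(
    ["TensorrtExecutionProvider"],
    [{"device_id": "1", "trt_max_workspace_size": "2147483648", "trt_fp16_enable": "True"}],
)
print(sess.get_providers())
```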
2 changes: 1 addition & 1 deletion docs/execution-providers/index.md
@@ -35,7 +35,7 @@ ONNX Runtime supports many different execution providers today. Some of the EPs

### Add an Execution Provider

- Developers of specialized HW acceleration solutions can integrate with ONNX Runtime to execute ONNX models on their stack. To create an EP to interface with ONNX Runtime you must first identify a unique name for the EP. Follow the steps outlined [here](../execution-provider/add-execution-provider.md) to integrate your code in the repo.
+ Developers of specialized HW acceleration solutions can integrate with ONNX Runtime to execute ONNX models on their stack. To create an EP to interface with ONNX Runtime you must first identify a unique name for the EP. See: [Add a new execution provider](add-execution-provider.md) for detailed instructions.

### Build ONNX Runtime package with EPs

2 changes: 1 addition & 1 deletion docs/get-started/training-pytorch.md
@@ -1,7 +1,7 @@
---
title: ORT Training with PyTorch
parent: Get Started
- nav_order: 9
+ nav_order: 11
---

# Get started with ORT for Training API (PyTorch)
4 changes: 2 additions & 2 deletions docs/get-started/with-c.md
@@ -1,7 +1,7 @@
---
title: C
parent: Get Started
- nav_exclude: true
+ nav_order: 4
---

# Get started with ORT for C
@@ -47,7 +47,7 @@ Refer to [onnxruntime_c_api.h](https://github.com/microsoft/onnxruntime/blob/mas
* Converting an in-memory ONNX Tensor encoded in protobuf format to a pointer that can be used as model input.
* Setting the thread pool size for each session.
* Setting graph optimization level for each session.
- * Dynamically loading custom ops. [Instructions](../tutorials/mobile/add-custom-op.md)
+ * Dynamically loading custom ops. [Instructions](../reference/operators/add-custom-op.md)
* Ability to load a model from a byte array. See ```OrtCreateSessionFromArray``` in [onnxruntime_c_api.h](https://github.com/microsoft/onnxruntime/blob/master/include/onnxruntime/core/session/onnxruntime_c_api.h).
* **Global/shared threadpools:** By default each session creates its own set of threadpools. In situations where multiple
sessions need to be created (to infer different models) in the same process, you end up with several threadpools created
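The bullet list above names C API capabilities; purely for illustration, a hedged sketch of the same session-level knobs (thread pool size, graph optimization level, loading from a byte array) expressed through the Python binding, since the corresponding C calls live in `onnxruntime_c_api.h`:

```python
# Illustrative Python equivalents of the C capabilities listed above.
import onnxruntime as ort

opts = ort.SessionOptions()
opts.intra_op_num_threads = 2  # per-session thread pool size
opts.graph_optimization_level = ort.GraphOptimizationLevel.ORT_ENABLE_EXTENDED

with open("model.onnx", "rb") as f:  # placeholder path
    model_bytes = f.read()
session = ort.InferenceSession(model_bytes, opts)  # load the model from a byte array
```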
6 changes: 3 additions & 3 deletions docs/get-started/with-cpp.md
@@ -17,15 +17,15 @@ nav_order: 2

| Artifact | Description | Supported Platforms |
|-----------|-------------|---------------------|
- | [Microsoft.ML.OnnxRuntime](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime) | CPU (Release) |Windows, Linux, Mac, X64, X86 (Windows-only), ARM64 (Windows-only)...more details: [compatibility](../references/compatibility) |
- | [Microsoft.ML.OnnxRuntime.Gpu](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.gpu) | GPU - CUDA (Release) | Windows, Linux, Mac, X64...more details: [compatibility](../references/compatibility) |
+ | [Microsoft.ML.OnnxRuntime](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime) | CPU (Release) |Windows, Linux, Mac, X64, X86 (Windows-only), ARM64 (Windows-only)...more details: [compatibility](../reference/compatibility.md) |
+ | [Microsoft.ML.OnnxRuntime.Gpu](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.gpu) | GPU - CUDA (Release) | Windows, Linux, Mac, X64...more details: [compatibility](../reference/compatibility.md) |
| [Microsoft.ML.OnnxRuntime.DirectML](https://www.nuget.org/packages/Microsoft.ML.OnnxRuntime.directml) | GPU - DirectML (Release) | Windows 10 1709+ |
| [ort-nightly](https://aiinfra.visualstudio.com/PublicPackages/_packaging?_a=feed&feed=ORT-Nightly) | CPU, GPU (Dev) | Same as Release versions |

.zip and .tgz files are also included as assets in each [Github release](https://github.com/microsoft/onnxruntime/releases).

## API Reference
- The C++ API is a thin wrapper of the C API. Please refer to [C API](./with-c.html) for more details.
+ The C++ API is a thin wrapper of the C API. Please refer to [C API](./with-c.md) for more details.

## Samples
See [Tutorials: API Basics - C++](../tutorials/api-basics)
2 changes: 1 addition & 1 deletion docs/get-started/with-iot.md
@@ -2,7 +2,7 @@
title: IoT and Edge
parent: Get Started
toc: true
- nav_order: 7
+ nav_order: 10
---
# Get Started with ORT for IoT
{: .no_toc }
2 changes: 1 addition & 1 deletion docs/get-started/with-javascript.md
@@ -2,7 +2,7 @@
title: JavaScript
parent: Get Started
toc: true
- nav_order: 4
+ nav_order: 6
---

# Get started with ORT for JavaScript
4 changes: 2 additions & 2 deletions docs/get-started/with-mobile.md
@@ -2,7 +2,7 @@
title: Mobile
parent: Get Started
toc: true
- nav_order: 6
+ nav_order: 7
---
# Get Started with ORT for Mobile
{: .no_toc }
@@ -46,4 +46,4 @@ OrtSession session = env.createSession(<path to model>, session_options);
```

## Learn More
- - [Mobile Tutorial](./Tutorials/Mobile/)
+ - [Mobile Tutorial](../tutorials/mobile)
2 changes: 1 addition & 1 deletion docs/get-started/with-obj-c.md
@@ -1,7 +1,7 @@
---
title: Objective-C
parent: Get Started
- nav_exclude: true
+ nav_order: 8
---
# Get started with ORT for Objective-C
{: .no_toc }