Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
36 changes: 36 additions & 0 deletions docs/execution-providers/CUDA-ExecutionProvider.md
Original file line number Diff line number Diff line change
Expand Up @@ -84,6 +84,7 @@ Default value: true

### cudnn_conv_use_max_workspace
Check [tuning performance for convolution heavy models](../performance/tune-performance.md#convolution-heavy-models-and-the-cuda-ep) for details on what this flag does.
This flag is only supported from the V2 version of the provider options struct when used using the C API. The V2 provider options struct can be created using [this](https://onnxruntime.ai/docs/api/c/struct_ort_api.html#a0d29cbf555aa806c050748cf8d2dc172) and updated using [this](https://onnxruntime.ai/docs/api/c/struct_ort_api.html#a4710fc51f75a4b9a75bde20acbfa0783). Please take a look at the sample below for an example.

Default value: 0

Expand Down Expand Up @@ -112,6 +113,8 @@ session = ort.InferenceSession(model_path, providers=providers)

### C/C++

#### Using legacy provider options struct

```c++
OrtSessionOptions* session_options = /* ... */;

Expand All @@ -125,3 +128,36 @@ options.do_copy_in_default_stream = 1;
SessionOptionsAppendExecutionProvider_CUDA(session_options, &options);
```

#### Using V2 provider options struct

```c++
OrtCUDAProviderOptionsV2* cuda_options = nullptr;
CreateCUDAProviderOptions(&cuda_options);

std::vector<const char*> keys{"device_id", "gpu_mem_limit", "arena_extend_strategy", "cudnn_conv_algo_search", "do_copy_in_default_stream", "cudnn_conv_use_max_workspace"};
std::vector<const char*> values{"0", "2147483648", "kSameAsRequested", "DEFAULT", "1", "1"};

UpdateCUDAProviderOptions(cuda_options, keys.data(), values.data(), 6);

OrtSessionOptions* session_options = /* ... */;
SessionOptionsAppendExecutionProvider_CUDA_V2(session_options, cuda_options);

// Finally, don't forget to release the provider options
ReleaseCUDAProviderOptions(cuda_options);
```

```c#
var cudaProviderOptions = new OrtCUDAProviderOptions(); // Dispose this finally

var providerOptionsDict = new Dictionary<string, string>();
providerOptionsDict["device_id"] = "0";
providerOptionsDict["gpu_mem_limit"] = "2147483648";
providerOptionsDict["arena_extend_strategy"] = "kSameAsRequested";
providerOptionsDict["cudnn_conv_algo_search"] = "DEFAULT";
providerOptionsDict["do_copy_in_default_stream"] = "1";
providerOptionsDict["cudnn_conv_use_max_workspace"] = "1";

cudaProviderOptions.UpdateOptions(providerOptionsDict);

SessionOptions options = SessionOptions.MakeSessionOptionWithCudaProvider(cudaProviderOptions); // Dispose this finally
```
28 changes: 26 additions & 2 deletions docs/performance/tune-performance.md
Original file line number Diff line number Diff line change
Expand Up @@ -300,11 +300,35 @@ providers = [("CUDAExecutionProvider", {"cudnn_conv_use_max_workspace": '1'})]
sess_options = ort.SessionOptions()
sess = ort.InferenceSession("my_conv_heavy_fp16_model.onnx", sess_options = sess_options, providers=providers)
```

* C/C++
Support for this provider option will be added in upcoming releases.
```
OrtCUDAProviderOptionsV2* cuda_options = nullptr;
CreateCUDAProviderOptions(&cuda_options);

std::vector<const char*> keys{"cudnn_conv_use_max_workspace"};
std::vector<const char*> values{"1"};

UpdateCUDAProviderOptions(cuda_options, keys.data(), values.data(), 1);

OrtSessionOptions* session_options = /* ... */;
SessionOptionsAppendExecutionProvider_CUDA_V2(session_options, cuda_options);

// Finally, don't forget to release the provider options
ReleaseCUDAProviderOptions(cuda_options);
```

* C#
Support for this provider option will be added in upcoming releases.
```
var cudaProviderOptions = new OrtCUDAProviderOptions(); // Dispose this finally

var providerOptionsDict = new Dictionary<string, string>();
providerOptionsDict["cudnn_conv_use_max_workspace"] = "1";

cudaProviderOptions.UpdateOptions(providerOptionsDict);

SessionOptions options = SessionOptions.MakeSessionOptionWithCudaProvider(cudaProviderOptions); // Dispose this finally
```

## Troubleshooting performance issues

Expand Down