TensorRT 10.0 Release #3766

asfiyab-nvidia · 2024-04-02T17:55:43Z

10.0.0 EA - 2024-04-02

Key Features and Updates:

Samples changes
- Added a sample showcasing weight-stripped engines.
- Added a sample demonstrating the use of custom tactics with IPluginV3.
- Added a sample to showcase plugins with data-dependent output shapes, using IPluginV3.
Parser changes
- Added a new class IParserRefitter that can be used to refit a TensorRT engine with the weights of an ONNX model.
- kNATIVE_INSTANCENORM is now set to ON by default.
- Added support for IPluginV3 interfaces from TensorRT.
- Added support for INT4 quantization.
- Added support for the reduction attribute in ScatterElements.
- Added support for wrap padding mode in Pad.
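
To make the last two parser items concrete, here is a minimal pure-Python sketch of what the ONNX `reduction` attribute of ScatterElements and the `wrap` mode of Pad compute, restricted to 1-D inputs. It illustrates the operator semantics only and is not the TensorRT or parser implementation; the function names `scatter_elements_1d` and `pad_wrap_1d` are hypothetical and chosen for this example.

```python
import operator

# Reduction names follow the ONNX ScatterElements `reduction` attribute.
_REDUCTIONS = {
    "none": lambda old, new: new,  # plain overwrite
    "add": operator.add,
    "mul": operator.mul,
    "max": max,
    "min": min,
}


def scatter_elements_1d(data, indices, updates, reduction="none"):
    """Illustrative 1-D ScatterElements: combine updates into a copy of data."""
    combine = _REDUCTIONS[reduction]
    out = list(data)  # the input list is left unmodified
    for idx, upd in zip(indices, updates):
        out[idx] = combine(out[idx], upd)
    return out


def pad_wrap_1d(data, before, after):
    """Illustrative 1-D Pad with mode="wrap": values repeat cyclically."""
    n = len(data)
    return [data[(i - before) % n] for i in range(n + before + after)]


if __name__ == "__main__":
    print(scatter_elements_1d([1, 2, 3, 4], [1, 1, 3], [10, 20, 30], reduction="add"))
    print(pad_wrap_1d([1, 2, 3], before=2, after=1))
```

With `reduction="add"`, repeated indices accumulate instead of overwriting, which is the behavior the new attribute support enables.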
Plugin changes
- Added a new plugin that conforms to the ONNX ScatterElements operator.
- The TensorRT plugin library no longer has a load-time link dependency on cuBLAS or cuDNN libraries.
- All plugins which relied on cuBLAS/cuDNN handles passed through IPluginV2Ext::attachToContext() have moved to use cuBLAS/cuDNN resources initialized by the plugin library itself. This works by dynamically loading the required cuBLAS/cuDNN library. Additionally, plugins which independently initialized their cuBLAS/cuDNN resources have also moved to dynamically loading the required library. If the respective library is not discoverable through the library path(s), these plugins will not work.
- bertQKVToContextPlugin: Version 2 of this plugin now supports head sizes less than or equal to 32.
- reorgPlugin: Added a version 2 which implements IPluginV2DynamicExt.
- disentangledAttentionPlugin: Fixed a kernel bug.
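
Because the plugins now load cuBLAS/cuDNN dynamically, a quick way to see whether those libraries are discoverable is to ask the dynamic loader directly. The sketch below uses only Python's standard `ctypes` module and is not part of TensorRT; the helper name `can_load` is hypothetical, and the library file names are assumptions that vary by platform and by CUDA/cuDNN version.

```python
import ctypes


def can_load(library_name):
    """Return True if the dynamic loader can open `library_name`."""
    try:
        ctypes.CDLL(library_name)
    except OSError:
        return False
    return True


if __name__ == "__main__":
    # Assumed file names for illustration; adjust to the installed versions.
    for name in ("libcublas.so.12", "libcudnn.so.8"):
        status = "found" if can_load(name) else "NOT found"
        print(f"{name}: {status}")
```

If a library is reported as not found, adding its directory to the loader search path (for example `LD_LIBRARY_PATH` on Linux) is the usual remedy.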
Demo changes
- HuggingFace demos have been removed. Users who rely on TensorRT to accelerate Large Language Model inference should use TensorRT-LLM instead.
Updated tooling
- Polygraphy v0.49.9
- ONNX-GraphSurgeon v0.5.1
- TensorRT Engine Explorer v0.1.8
Build Containers
- RedHat/CentOS 7.x are no longer officially supported starting with TensorRT 10.0. The corresponding container has been removed from TensorRT-OSS.

Signed-off-by: Asfiya Baig <[email protected]>

asfiyab-nvidia · 2024-04-02T18:23:42Z

@rajeevsrao can you please review

TensorRT 10.0 Release

1637a26

Signed-off-by: Asfiya Baig <[email protected]>

asfiyab-nvidia force-pushed the dev-10.0 branch from 42f3b6a to 1637a26 Compare April 2, 2024 18:02

rajeevsrao approved these changes Apr 2, 2024

rajeevsrao merged commit 147005f into NVIDIA:main Apr 3, 2024
1 of 2 checks passed
