Releases: NVIDIA/TensorRT
21.09
21.08
Commit used by the 21.08 TensorRT NGC container.
Changelog
Added
- Add demoBERT and demoBERT-MT (sparsity) benchmark data for TensorRT 8.
- Added example Python notebooks
Changed
- Updated samples and plugins directory structure
- Updates to TensorRT developer tools
- README fix to update build command for native aarch64 builds.
Removed
- N/A
21.07
TensorRT OSS v8.0.1
TensorRT OSS release corresponding to TensorRT 8.0.1.6 GA release.
Added
- Added support for the following ONNX operators: Celu, CumSum, EyeLike, GatherElements, GlobalLpPool, GreaterOrEqual, LessOrEqual, LpNormalization, LpPool, ReverseSequence, and SoftmaxCrossEntropyLoss.
- Overhauled the Resize ONNX operator, which now fully supports the following modes:
  - Coordinate Transformation modes: half_pixel, pytorch_half_pixel, tf_half_pixel_for_nn, asymmetric, and align_corners.
  - Modes: nearest, linear.
  - Nearest Modes: floor, ceil, round_prefer_floor, round_prefer_ceil.
- Added support for multi-input ONNX ConvTranspose operator.
- Added support for 3D spatial dimensions in ONNX InstanceNormalization.
- Added support for generic 2D padding in ONNX.
- ONNX QuantizeLinear and DequantizeLinear operators leverage IQuantizeLayer and IDequantizeLayer.
  - Added support for tensor scales.
  - Added support for per-axis quantization.
- Added EfficientNMS_TRT and EfficientNMS_ONNX_TRT plugins and experimental support for the ONNX NonMaxSuppression operator.
- Added ScatterND plugin.
- Added TensorRT QuickStart Guide.
- Added new samples: engine_refit_onnx_bidaf, which builds an engine from the ONNX BiDAF model and refits the engine with new weights, and the efficientdet and efficientnet samples, which demonstrate object detection using TensorRT.
- Added support for Ubuntu 20.04 and RedHat/CentOS 8.3.
- Added Python 3.9 support.
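
The Resize coordinate transformation and nearest modes listed above can be illustrated with a minimal sketch. It follows the formulas in the ONNX Resize operator specification and is not the TensorRT or ONNX parser implementation. The function names `to_original_coord` and `nearest_index` are hypothetical, and the sketch assumes the scale equals the ratio of output length to input length along the axis.

```python
import math


def to_original_coord(x_resized, length_original, length_resized, mode):
    """Map an output index to a fractional input coordinate along one axis.

    Assumption: scale is derived from the two lengths rather than supplied
    separately.
    """
    scale = length_resized / length_original
    if mode == "half_pixel":
        return (x_resized + 0.5) / scale - 0.5
    if mode == "pytorch_half_pixel":
        # Same as half_pixel, except a single-element output maps to 0.
        return (x_resized + 0.5) / scale - 0.5 if length_resized > 1 else 0.0
    if mode == "tf_half_pixel_for_nn":
        return (x_resized + 0.5) / scale
    if mode == "asymmetric":
        return x_resized / scale
    if mode == "align_corners":
        # Guard against division by zero for a single-element output.
        if length_resized == 1:
            return 0.0
        return x_resized * (length_original - 1) / (length_resized - 1)
    raise ValueError(f"unknown coordinate transformation mode: {mode}")


def nearest_index(x_original, nearest_mode):
    """Pick the integer input index for a fractional coordinate."""
    if nearest_mode == "floor":
        return math.floor(x_original)
    if nearest_mode == "ceil":
        return math.ceil(x_original)
    if nearest_mode == "round_prefer_floor":
        # Ties (x.5) round down.
        return math.ceil(x_original - 0.5)
    if nearest_mode == "round_prefer_ceil":
        # Ties (x.5) round up.
        return math.floor(x_original + 0.5)
    raise ValueError(f"unknown nearest mode: {nearest_mode}")


if __name__ == "__main__":
    # Upscale a 4-element axis to 8 elements and inspect output index 3.
    for mode in ("half_pixel", "pytorch_half_pixel", "tf_half_pixel_for_nn",
                 "asymmetric", "align_corners"):
        print(mode, to_original_coord(3, 4, 8, mode))
```

A real resize would also clamp the resulting index to the valid input range before reading the input tensor; that step is omitted here.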
Changed
- Update Polygraphy to v0.30.3.
- Update ONNX-GraphSurgeon to v0.3.10.
- Update PyTorch Quantization toolkit to v2.1.0.
- Notable TensorRT API updates
  - TensorRT now declares APIs with the noexcept keyword. All TensorRT classes that an application inherits from (such as IPluginV2) must guarantee that methods called by TensorRT do not throw uncaught exceptions, or the behavior is undefined.
  - Destructors for classes with destroy() methods were previously protected. They are now public, enabling use of smart pointers for these classes. The destroy() methods are deprecated.
- Moved RefitMap API from ONNX parser to core TensorRT.
- Various bugfixes for plugins, samples and ONNX parser.
- Port demoBERT to TensorFlow 2 and update UFF samples to leverage the nvidia-tensorflow1 container.
Removed
- IPlugin and IPluginFactory interfaces were deprecated in TensorRT 6.0 and have been removed in TensorRT 8.0. We recommend that you write new plugins or refactor existing ones to target the IPluginV2DynamicExt and IPluginV2IOExt interfaces. For more information, refer to Migrating Plugins From TensorRT 6.x Or 7.x To TensorRT 8.x.x.
  - For plugins based on IPluginV2DynamicExt and IPluginV2IOExt, certain methods with legacy function signatures (derived from IPluginV2 and IPluginV2Ext base classes) which were deprecated and marked for removal in TensorRT 8.0 will no longer be available.
- Removed samplePlugin since it showcased the IPluginExt interface, which is no longer supported in TensorRT 8.0.
- Removed sampleMovieLens and sampleMovieLensMPS.
- Removed Dockerfile for Ubuntu 16.04. TensorRT 8.0 Debian packages for Ubuntu 16.04 require Python 3.5, while the minimum required Python version for TensorRT OSS is 3.6.
- Removed support for PowerPC builds, consistent with TensorRT GA releases.
Notes
- We deprecated the Caffe Parser and UFF Parser in TensorRT 7.0. They are still tested and functional in TensorRT 8.0; however, we plan to remove support for them in a future release. Ensure you migrate your workflow to use tf2onnx, keras2onnx or TensorFlow-TensorRT (TF-TRT).
Signed-off-by: Rajeev Rao
21.06
Commit used by the 21.06 TensorRT NGC container
Changelog
Added
- Add switch for batch-agnostic mode in NMS plugin
- Add missing model.py in uff_custom_plugin sample
Changed
- Update to Polygraphy v0.29.2
- Update to ONNX-GraphSurgeon v0.3.9
- Fix numerical errors for float type in NMS/batchedNMS plugins
- Update demoBERT input dimensions to match Triton requirement #1051
- Optimize TLT MaskRCNN plugins:
  - Enable FP16 precision in multilevelCropAndResizePlugin and multilevelProposeROIPlugin
  - Algorithm optimizations for NMS kernels and ROIAlign kernel
  - Fix invalid CUDA config issue when batch size is larger than 32
  - Fix issues found on Jetson Nano
Removed
- Removed fcplugin from demoBERT to improve inference latency on GA100/Turing
21.05
Commit used by the 21.05 TensorRT NGC container
Changelog
Added
- Extended support for ONNX operator InstanceNormalization to 5D tensors
- Support negative indices in ONNX Gather operator
- Add support for importing ONNX double-typed weights as float
- ONNX-GraphSurgeon (v0.3.7) support for models with externally stored weights
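
To illustrate what negative indices mean for the ONNX Gather operator, here is a minimal sketch on a one-dimensional list. It mirrors the ONNX rule that a negative index counts back from the end of the gathered axis; it is not the ONNX-TensorRT implementation, and the helper names `normalize_index` and `gather_1d` are hypothetical.

```python
def normalize_index(index, axis_size):
    """Convert a possibly negative index into the range [0, axis_size)."""
    if not -axis_size <= index < axis_size:
        raise IndexError(f"index {index} out of range for axis of size {axis_size}")
    return index + axis_size if index < 0 else index


def gather_1d(data, indices):
    """Gather elements of a 1D sequence, accepting negative indices."""
    axis_size = len(data)
    return [data[normalize_index(i, axis_size)] for i in indices]


if __name__ == "__main__":
    print(gather_1d([10, 20, 30, 40], [0, -1, -4]))
```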
Changed
- Update ONNX-TensorRT to 21.05
- Relicense ONNX-TensorRT under Apache2
- demoBERT builder fixes for multi-batch
- Speed up demoBERT build by using a global timing cache and disabling cuDNN tactics
- Standardize Python package versions across OSS samples
- Bugfixes in multilevelProposeROI and bertQKV plugin
- Fix memory leaks in samples logger
21.04
Commit used by the 21.04 TensorRT NGC container
Changelog
Added
- SM86 kernels for BERT MHA plugin
- Added opset 13 support for Softmax, LogSoftmax, Squeeze, and Unsqueeze.
- Added support for the EyeLike and GatherElements operators.
Changed
- Updated TensorRT version to v7.2.3.4.
- Update to ONNX-TensorRT 21.03
- ONNX-GraphSurgeon (v0.3.4) - updates fold_constants to correctly exit early.
- Set default CUDA_INSTALL_DIR #798
- Plugin bugfixes, qkv kernels for sm86
- Fixed GroupNorm CMakeFile for cu sources #1083
- Permit groupadd with non-unique GID in build containers #1091
- Avoid reinterpret_cast #146
- Clang-format plugins and samples
- Avoid arithmetic on void pointer in multilevelProposeROIPlugin.cpp #1028
- Update BERT plugin documentation.
Removed
- Removed extra terminate call in InstanceNorm
21.03
Commit used by the 21.03 TensorRT NGC container
Changelog
Added
- Optimized FP16 NMS/batchedNMS plugins with n-bit radix sort, based on IPluginV2DynamicExt
- ProposalDynamic and CropAndResizeDynamic plugins based on IPluginV2DynamicExt
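
As background for the n-bit radix sort mentioned above, the following is a minimal CPU sketch of a least-significant-digit radix sort that processes a fixed number of bits per pass. It is not the plugin's CUDA kernel. The names `fp16_bits` and `radix_argsort` are hypothetical, and the example assumes non-negative scores, for which the FP16 bit pattern sorts in the same order as the value.

```python
import struct


def fp16_bits(value):
    """Return the 16-bit pattern of a value rounded to half precision."""
    return struct.unpack("<H", struct.pack("<e", value))[0]


def radix_argsort(keys, bits_per_pass=4, key_bits=16):
    """Return indices that sort non-negative integer keys in ascending order.

    Each pass buckets the keys by one digit of bits_per_pass bits, starting
    from the least significant digit. Bucketing is stable, so earlier passes
    are preserved by later ones.
    """
    mask = (1 << bits_per_pass) - 1
    order = list(range(len(keys)))
    for shift in range(0, key_bits, bits_per_pass):
        buckets = [[] for _ in range(mask + 1)]
        for i in order:
            buckets[(keys[i] >> shift) & mask].append(i)
        order = [i for bucket in buckets for i in bucket]
    return order


if __name__ == "__main__":
    scores = [0.9, 0.1, 0.75, 0.5]
    keys = [fp16_bits(s) for s in scores]
    # NMS visits boxes from highest to lowest score, so reverse the order.
    descending = radix_argsort(keys)[::-1]
    print([scores[i] for i in descending])
```

Fewer bits per pass means more passes but fewer buckets per pass; the choice of 4 bits here is arbitrary.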
Changed
- ONNX-TensorRT v21.03 update
- ONNX-GraphSurgeon v0.3.3 update
- Bugfix for scaledSoftmax kernel #1096
Removed
- N/A
21.02
Commit used by the 21.02 TensorRT NGC container
Changelog
Added
- TensorRT Python API bindings
- TensorRT Python samples
- FP16 support to batchedNMSPlugin #1002
- Configurable input size for TLT MaskRCNN Plugin #986
Changed
- TensorRT version updated to 7.2.2.3
- ONNX-TensorRT v21.02 update
- Polygraphy v0.21.1 update
- PyTorch-Quantization Toolkit v2.1.0 update
- Documentation update, ONNX opset 13 support, ResNet example
- ONNX-GraphSurgeon v0.2.8 update
- demoBERT builder updated to work with TensorFlow 2 (in compatibility mode)
- Refactor Dockerfiles for OSS container
Removed
- N/A
20.12
Commit used by the 20.12 TensorRT NGC container
Changelog
Added
- Add configurable input size for TLT MaskRCNN Plugin
Changed
- Update symbol export map for plugins
- Correctly use channel dimension when creating Prelu node
- Fix Jetson cross compilation CMakefile
Removed
- N/A