
Welcome to the tiny-tensorrt wiki!

Some of My Suggestions & Opinions

  • I work for Nvidia, but I don't represent Nvidia here; tiny-tensorrt is a personal project, and I am really glad if it can help you.

  • Why choose TensorRT? All I can say is that TensorRT has the best performance (most of the time) and is reliable if you want to do inference on Nvidia devices. About performance: nobody on this earth knows our GPUs better than Nvidia, and TensorRT has the best kernel implementations for every architecture. For a simple matrix multiplication, for example, TensorRT will pick the fastest among more than a thousand kernels, some tuned for a specific GPU and some even written in assembly. So if you want the best performance, TensorRT should be your first choice.

  • If you want to use DLA with your network, then TensorRT is your only choice.

  • Any shortcomings in TensorRT? The main one is that TensorRT still does not support every ONNX operator, and we are working hard to cover the remaining ones.

  • When you meet any problem or have any question about TensorRT, the official Developer Guide should be the first material to search and read, especially the sections on performance best practices and troubleshooting.

  • If possible, choose ONNX first. I know there are alternatives like TF-TRT and torch2trt, but when you hit a problem they are more difficult to debug. (A minimal sketch of the ONNX-to-engine path is given after this list.)

  • If you want to report a TensorRT bug, reproducing it with trtexec should be your first step. Please attach the ONNX model, the trtexec log produced with --verbose, and your environment setup (an example command is given below).
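
As a rough illustration of why the ONNX route is easy to debug, here is a minimal sketch of building an engine from an ONNX file with the plain TensorRT C++ API (this is not tiny-tensorrt's own wrapper). The file names and workspace size are placeholders, and the calls reflect the TensorRT 8.x-era API; check the Developer Guide for your exact version.

```cpp
// Minimal sketch: parse an ONNX file and build a serialized TensorRT engine.
// "model.onnx" / "model.engine" and the 1 GiB workspace are placeholders.
#include <fstream>
#include <iostream>
#include <memory>

#include <NvInfer.h>
#include <NvOnnxParser.h>

class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) noexcept override {
        if (severity <= Severity::kWARNING) std::cout << msg << std::endl;
    }
};

int main() {
    Logger logger;

    auto builder = std::unique_ptr<nvinfer1::IBuilder>(nvinfer1::createInferBuilder(logger));
    const auto flags =
        1U << static_cast<uint32_t>(nvinfer1::NetworkDefinitionCreationFlag::kEXPLICIT_BATCH);
    auto network = std::unique_ptr<nvinfer1::INetworkDefinition>(builder->createNetworkV2(flags));

    // Parse the ONNX model directly; parser errors point at the exact failing node.
    auto parser = std::unique_ptr<nvonnxparser::IParser>(nvonnxparser::createParser(*network, logger));
    if (!parser->parseFromFile("model.onnx",
                               static_cast<int>(nvinfer1::ILogger::Severity::kWARNING))) {
        std::cerr << "failed to parse ONNX model" << std::endl;
        return 1;
    }

    auto config = std::unique_ptr<nvinfer1::IBuilderConfig>(builder->createBuilderConfig());
    config->setMaxWorkspaceSize(1ULL << 30);  // scratch space for tactic selection

    // Serialize the optimized engine so it can be reloaded later without rebuilding.
    auto serialized = std::unique_ptr<nvinfer1::IHostMemory>(
        builder->buildSerializedNetwork(*network, *config));
    if (!serialized) {
        std::cerr << "engine build failed" << std::endl;
        return 1;
    }
    std::ofstream out("model.engine", std::ios::binary);
    out.write(static_cast<const char*>(serialized->data()), serialized->size());
    return 0;
}
```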
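
For the bug-report workflow above, a typical reproduction command looks like the following; model.onnx is a placeholder for your model, and the tee simply captures the verbose log so you can attach it to the report.

```
trtexec --onnx=model.onnx --verbose 2>&1 | tee trtexec_verbose.log
```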