
Inference Time Issue #66

Open
cathy-kim opened this issue Feb 11, 2019 · 7 comments
Comments


cathy-kim commented Feb 11, 2019

@Robert-JunWang Hi, thanks for your work.

With the merged Caffe model I only get 48 fps on a TX2 with TensorRT 4.1.4, which is slower than MobileNet-SSD (about 54 fps). I've already optimized my TX2 with jetson_clocks.sh, and I think I have already done what you suggested in issue #43.

Would you tell me how you reached 70+ fps?
Thanks

@Shreeyak

@Robert-JunWang Could you also tell us how you got ~100 fps with YOLOv3-tiny? I'm running YOLOv3-tiny-320 in TensorFlow, without TensorRT, and I'm only getting ~12 fps. Clocks are maxed on my TX2. I don't understand the 10x performance gap!

@Robert-JunWang
Owner

> @Robert-JunWang Hi, thanks for your work.
>
> With the merged Caffe model I only get 48 fps on a TX2 with TensorRT 4.1.4, which is slower than MobileNet-SSD (about 54 fps). I've already optimized my TX2 with jetson_clocks.sh, and I think I have already done what you suggested in issue #43.
>
> Would you tell me how you reached 70+ fps?
> Thanks

That speed does not include the post-processing part (decoding bounding boxes and NMS). The post-processing can be done on the CPU asynchronously, so the real end-to-end speed is almost the same as the one I reported. Both MobileNet-SSD and Pelee run at over 70 FPS in FP32 mode.
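
For anyone looking for the general pattern: below is a minimal sketch of overlapping CPU post-processing with GPU inference on a worker thread. `decodeAndNms`, the buffer size, and the commented-out TensorRT calls are placeholders, not code from this repository.

```cpp
// Sketch only: overlap CPU post-processing (box decoding + NMS) with GPU inference.
// decodeAndNms() and the raw output layout are hypothetical placeholders.
#include <future>
#include <utility>
#include <vector>

struct Detection { float x1, y1, x2, y2, score; int label; };

// Placeholder for the CPU-side decode + NMS step.
std::vector<Detection> decodeAndNms(std::vector<float> rawOutput) {
    std::vector<Detection> dets;
    // ... decode box offsets against priors, apply confidence threshold, run NMS ...
    return dets;
}

int main() {
    std::future<std::vector<Detection>> pending;  // post-processing of the previous frame

    for (int frame = 0; frame < 100; ++frame) {
        // 1. Enqueue inference for this frame on the GPU (asynchronous), e.g.
        //    context->enqueue(batchSize, buffers, stream, nullptr);
        //    then copy the raw network output to the host and sync the stream.
        std::vector<float> rawOutput(1000);  // size is model-dependent; placeholder

        // 2. While the GPU was busy, finish the previous frame's CPU work.
        if (pending.valid()) {
            std::vector<Detection> previous = pending.get();
            (void)previous;  // ... draw / report detections ...
        }

        // 3. Hand the current frame's raw output to a CPU worker thread.
        pending = std::async(std::launch::async, decodeAndNms, std::move(rawOutput));
    }
    if (pending.valid()) pending.get();
    return 0;
}
```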

@Robert-JunWang
Owner

> @Robert-JunWang Could you also tell us how you got ~100 fps with YOLOv3-tiny? I'm running YOLOv3-tiny-320 in TensorFlow, without TensorRT, and I'm only getting ~12 fps. Clocks are maxed on my TX2. I don't understand the 10x performance gap!

I created a Caffe model of Tiny YOLOv3 myself and tested its speed with random weights. That speed also does not include the post-processing part. The input dimension is 416, not 320. The only difference between my model and the original paper is that I use ReLU instead of leaky ReLU, but I do not think this makes much difference in speed. Tiny YOLOv3 benefits from FP16 inference as well: in FP16 mode the model is about 1.8 to 2 times faster than in FP32 mode.
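
For reference, here is a minimal sketch of enabling FP16 at engine build time, assuming a TensorRT 4/5-era `IBuilder` API; network parsing and construction are omitted.

```cpp
// Sketch: enable FP16 inference with the TensorRT 4/5-era IBuilder API.
// Network parsing/construction is omitted; `builder` and `network` are assumed to exist.
#include "NvInfer.h"

nvinfer1::ICudaEngine* buildEngine(nvinfer1::IBuilder* builder,
                                   nvinfer1::INetworkDefinition* network) {
    builder->setMaxBatchSize(1);
    builder->setMaxWorkspaceSize(1 << 28);  // 256 MB; adjust to what the TX2 can spare

    // Only request FP16 if the platform (e.g. TX2) has fast FP16 support.
    if (builder->platformHasFastFp16()) {
        builder->setFp16Mode(true);
    }
    return builder->buildCudaEngine(*network);
}
```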

I have never compared the speed of TensorFlow and TensorRT, but I do not think there is a 10x gap between the two frameworks. You can remove the pre-processing and post-processing parts of your pipeline and see whether that accounts for the difference.

@Shreeyak

Oh, thank you for the explanations, that makes a lot more sense now! I should also look at how to do the post-processing asynchronously. Would you happen to have a repo/post/example of how to do that?

Would you happen to have any FPS benchmarks that include the post-processing?

@dbellan

dbellan commented Feb 28, 2019

@ginn24 Could you please tell me how you defined the detection_out layer plugin?

I populate the plugin factory with:

```cpp
mDetection_out = std::unique_ptr<INvPlugin, decltype(nvPluginDeleter)>(
    createSSDDetectionOutputPlugin(params), nvPluginDeleter);
```

but while building the engine I get the following error:

```
NvPluginSSD.cu:795 virtual void nvinfer1::plugin::DetectionOutput::configure(const nvinfer1::Dims*, int, const nvinfer1::Dims*, int, int): Assertion `numPriors*numLocClasses*4 == inputDims[param.inputOrder[0]].d[0]' failed.
```

Usually this error is due to a wrong layer name or a wrong params.inputOrder, but both look correct to me. I suspect it is related to how I created the plugin. May I ask how you did it?
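
For reference, the parameter struct that `createSSDDetectionOutputPlugin` takes in the TensorRT 4/5-era plugin API looks roughly like the sketch below. The concrete values (`numClasses`, thresholds, `inputOrder`) are illustrative assumptions for an SSD-style model, not settings confirmed in this thread; the assertion above typically fires when `inputOrder` or `numClasses` does not match the loc/conf/priorbox blobs the plugin actually receives.

```cpp
// Sketch: populating DetectionOutputParameters for the legacy INvPlugin
// factory (TensorRT 4/5-era NvInferPlugin API). All concrete values are
// illustrative assumptions, not taken from this repository.
#include "NvInferPlugin.h"

nvinfer1::plugin::DetectionOutputParameters makeDetectionOutputParams() {
    nvinfer1::plugin::DetectionOutputParameters params{};
    params.shareLocation = true;
    params.varianceEncodedInTarget = false;
    params.backgroundLabelId = 0;
    params.numClasses = 21;      // e.g. 20 classes + background (assumption)
    params.topK = 400;
    params.keepTopK = 200;
    params.confidenceThreshold = 0.01f;
    params.nmsThreshold = 0.45f;
    params.codeType = nvinfer1::plugin::CodeTypeSSD::CENTER_SIZE;
    // inputOrder must match the order in which the loc / conf / priorbox
    // tensors are wired into the plugin layer; a mismatch here (or a wrong
    // numClasses) is what triggers the NvPluginSSD.cu assertion.
    params.inputOrder[0] = 0;    // loc data
    params.inputOrder[1] = 1;    // conf data
    params.inputOrder[2] = 2;    // priorbox data
    params.confSigmoid = false;
    params.isNormalized = true;
    return params;
}
```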

@cathy-kim
Author

cathy-kim commented Mar 5, 2019

@dbellan I just uploaded my Pelee-TensorRT code. You can check it out here:
https://github.com/ginn24/Pelee-TensorRT

This version of the code visualizes the detection_out results; it does not include code for measuring inference time. If you need to measure inference time, you should add GPU timing yourself.
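
A minimal sketch of GPU-side timing with CUDA events, assuming an existing TensorRT execution context, device bindings, and stream; the names below are placeholders.

```cpp
// Sketch: measuring GPU inference time with CUDA events, assuming an
// existing TensorRT execution context, device bindings, and stream.
#include "NvInfer.h"
#include <cuda_runtime.h>

float timeInferenceMs(nvinfer1::IExecutionContext* context,
                      void** bindings, cudaStream_t stream, int batchSize) {
    cudaEvent_t start, stop;
    cudaEventCreate(&start);
    cudaEventCreate(&stop);

    cudaEventRecord(start, stream);
    context->enqueue(batchSize, bindings, stream, nullptr);  // network forward pass only
    cudaEventRecord(stop, stream);
    cudaEventSynchronize(stop);

    float ms = 0.0f;
    cudaEventElapsedTime(&ms, start, stop);  // milliseconds between the two events

    cudaEventDestroy(start);
    cudaEventDestroy(stop);
    return ms;  // excludes host-side pre/post-processing
}
```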

cathy-kim reopened this Mar 5, 2019
@dbellan

dbellan commented Mar 7, 2019

Thank you @ginn24, I'll have a look.
