PointPillars Inference with TensorRT

This repository contains the source code and model for PointPillars inference using TensorRT. The model is created with OpenPCDet and modified with onnx_graphsurgeon.

Overall inference has four phases:

Convert the point cloud into 4-channel voxels
Extend the 4-channel voxels to 10-channel voxel features
Run the TensorRT engine to get raw 3D-detection data
Parse the bounding box, class type and direction
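
As an illustration of the first two phases, the following CPU sketch maps a coordinate to its pillar index and extends the 4-channel points of one pillar to 10 channels. It assumes the 10 channels follow the layout commonly used by the OpenPCDet pillar encoder: the 4 raw values, the offset from the mean of the points in the pillar, and the offset from the pillar's geometric center. The names PillarGrid, pillarIndex and extendFeatures are hypothetical and are not taken from this repository, whose own implementation runs on the GPU.

```cpp
#include <array>
#include <cassert>
#include <cmath>
#include <vector>

using Point4 = std::array<float, 4>;      // x, y, z, intensity
using Feature10 = std::array<float, 10>;  // assumed 10-channel layout

// Hypothetical grid description; real values must match the model config.
struct PillarGrid {
    float min_x, min_y, min_z;     // lower corner of the detection range
    float size_x, size_y, size_z;  // pillar size along each axis
};

// Phase 1 helper: index of the pillar that contains coordinate v on one axis.
int pillarIndex(float v, float min_v, float size_v) {
    return static_cast<int>(std::floor((v - min_v) / size_v));
}

// Phase 2: extend the 4-channel points of one pillar to 10 channels:
// [x, y, z, i, x-mean_x, y-mean_y, z-mean_z, x-cx, y-cy, z-cz]
std::vector<Feature10> extendFeatures(const std::vector<Point4>& pts,
                                      int ix, int iy, int iz,
                                      const PillarGrid& g) {
    assert(!pts.empty());

    // Mean position of the points that fell into this pillar.
    float mx = 0.f, my = 0.f, mz = 0.f;
    for (const Point4& p : pts) { mx += p[0]; my += p[1]; mz += p[2]; }
    const float n = static_cast<float>(pts.size());
    mx /= n; my /= n; mz /= n;

    // Geometric center of the pillar in world coordinates.
    const float cx = g.min_x + (ix + 0.5f) * g.size_x;
    const float cy = g.min_y + (iy + 0.5f) * g.size_y;
    const float cz = g.min_z + (iz + 0.5f) * g.size_z;

    std::vector<Feature10> out;
    out.reserve(pts.size());
    for (const Point4& p : pts) {
        out.push_back(Feature10{p[0], p[1], p[2], p[3],
                                p[0] - mx, p[1] - my, p[2] - mz,
                                p[0] - cx, p[1] - cy, p[2] - cz});
    }
    return out;
}
```

In a GPU implementation each pillar would typically be handled by its own group of threads, but the per-point arithmetic stays the same.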

Model && Data

The demo uses Velodyne data from the KITTI dataset. The ONNX file can be converted from a pre-trained model with the script provided under "./tool".

Prerequisites

Building the PointPillars inference demo requires CUDA and TensorRT with a PillarScatter layer. The PillarScatter layer is already implemented in the demo as a TensorRT plugin.

Environments

NVIDIA Jetson Xavier/Orin + JetPack 5.0
CUDA 11.4 + cuDNN 8.3.2 + TensorRT 8.4.0

Compile && Run

$ mkdir build && cd build
$ cmake .. && make -j$(nproc)
$ ./demo

Performance in FP16

Before benchmarking, set the Jetson power mode and lock the clocks with "sudo nvpmodel -m 0 && sudo jetson_clocks".

| Function (ms)     | Xavier | Orin   |
| ----------------- | ------ | ------ |
| GenerateVoxels    | 0.29   | 0.14   |
| GenerateFeatures  | 0.31   | 0.15   |
| Inference         | 20.21  | 9.12   |
| Postprocessing    | 3.38   | 1.77   |
| Overall           | 24.19  | 11.18  |

3D detection performance (AP at 11 recall positions, R11) at moderate difficulty on the val set of the KITTI dataset.

|                   | Car@R11 | Pedestrian@R11 | Cyclist@R11  |
| ----------------- | ------- | -------------- | ------------ |
| CUDA-PointPillars | 77.02   | 51.65          | 62.24        |
| OpenPCDet         | 77.28   | 52.29          | 62.68        |
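
For readers unfamiliar with the metric, the sketch below shows the standard 11-point interpolated average precision, which is what "@R11" is assumed to denote here. It is not the evaluation code used to produce the table; PrPoint and averagePrecisionR11 are hypothetical names, and the input is assumed to be a precision/recall curve already computed for one class and difficulty level.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// One sample of a precision/recall curve, both values in [0, 1].
struct PrPoint {
    float recall;
    float precision;
};

// 11-point interpolated AP: for each recall threshold r in {0.0, 0.1, ..., 1.0},
// take the maximum precision among samples with recall >= r, then average
// the 11 values. Thresholds that no sample reaches contribute 0.
float averagePrecisionR11(const std::vector<PrPoint>& curve) {
    float sum = 0.f;
    for (int i = 0; i <= 10; ++i) {
        const float r = static_cast<float>(i) / 10.0f;
        float best = 0.f;
        for (const PrPoint& p : curve) {
            assert(p.precision >= 0.f && p.precision <= 1.f);
            // Small tolerance so that recall == r is not lost to rounding.
            if (p.recall >= r - 1e-6f) best = std::max(best, p.precision);
        }
        sum += best;
    }
    return sum / 11.0f;
}
```

The function returns a value in [0, 1]; the table reports the same quantity as a percentage.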

Note

The output of GenerateVoxels is nondeterministic: the GPU processes all points in parallel, so which points are selected for a voxel can differ from run to run.
The demo caches the ONNX file to improve performance. To use a new ONNX file, first remove the cache file in "./model".
MAX_VOXELS in params.h sizes the cache allocated during inference. Decrease the value to save memory.
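
To get a feel for how memory scales with MAX_VOXELS, the sketch below estimates the size of a single float buffer of shape voxels x points per voxel x channels. This is an assumption about how such a buffer might be laid out, not a description of the demo's actual allocations; VoxelBufferParams, voxelBufferBytes and the numbers used in the note after the code are hypothetical, and the real values are defined in params.h.

```cpp
#include <cassert>
#include <cstddef>

// All three values are hypothetical; the real ones are defined in params.h.
struct VoxelBufferParams {
    std::size_t max_voxels;            // upper bound on non-empty voxels
    std::size_t max_points_per_voxel;  // points kept per voxel
    std::size_t channels;              // floats stored per point
};

// Bytes needed for one dense float buffer of the assumed shape.
// The size grows linearly with max_voxels.
std::size_t voxelBufferBytes(const VoxelBufferParams& p) {
    assert(p.max_voxels > 0 && p.max_points_per_voxel > 0 && p.channels > 0);
    return p.max_voxels * p.max_points_per_voxel * p.channels * sizeof(float);
}
```

With assumed values of 40000 voxels, 32 points per voxel and 10 channels of 4-byte floats, the buffer takes 51,200,000 bytes; halving the voxel limit halves that figure.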

References

PointPillars: Fast Encoders for Object Detection from Point Clouds
