
Commit 3f05f74

Nvidia TensorRT detector (blakeblackshear#4718)
* Initial WIP dockerfile and scripts to add tensorrt support
* Add tensorRT detector
* WIP attempt to install TensorRT 8.5
* Updates to detector for cuda python library
* TensorRT Cuda library rework WIP Does not run
* Fixes from rebase to detector factory
* Fix parsing output memory pointer
* Handle TensorRT logs with the python logger
* Use non-async interface and convert input data to float32. Detection runs without error.
* Make TensorRT a separate build from the base Frigate image.
* Add script and documentation for generating TRT Models
* Add support for TensorRT devcontainer
* Add labelmap to trt model script and docs. Cleanup of old scripts.
* Update detect to normalize input tensor using model input type
* Add config for selecting GPU. Fix Async inference. Update documentation.
* Update some CUDA libraries to clean up version warning
* Add CI stage to build TensorRT tag
* Add note in docs for image tag and model support
1 parent e3ec292 commit 3f05f74

9 files changed: +515 / -16 lines

.github/workflows/ci.yml (+12)

@@ -36,7 +36,19 @@ jobs:
           context: .
           push: true
           platforms: linux/amd64,linux/arm64,linux/arm/v7
+          target: frigate
           tags: |
             ghcr.io/blakeblackshear/frigate:${{ github.ref_name }}-${{ env.SHORT_SHA }}
           cache-from: type=gha
           cache-to: type=gha,mode=max
+      - name: Build and push TensorRT
+        uses: docker/build-push-action@v3
+        with:
+          context: .
+          push: true
+          platforms: linux/amd64
+          target: frigate-tensorrt
+          tags: |
+            ghcr.io/blakeblackshear/frigate:${{ github.ref_name }}-${{ env.SHORT_SHA }}-tensorrt
+          cache-from: type=gha
+          cache-to: type=gha,mode=max
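With this stage in place, CI publishes a second image alongside the standard one for each push. As a hedged illustration of how the resulting tag is addressed (the `<ref_name>` and `<short_sha>` placeholders below come from the workflow run and are not literal values):

```bash
# Illustrative only: substitute the branch/ref name and short commit SHA from the CI run
docker pull ghcr.io/blakeblackshear/frigate:<ref_name>-<short_sha>-tensorrt
```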

Dockerfile (+27 -4)

@@ -71,6 +71,15 @@ WORKDIR /rootfs/usr/local/go2rtc/bin
 RUN wget -qO go2rtc "https://github.com/AlexxIT/go2rtc/releases/download/v0.1-rc.5/go2rtc_linux_${TARGETARCH}" \
     && chmod +x go2rtc
 
+
+####
+#
+# OpenVino Support
+#
+# 1. Download and convert a model from Intel's Public Open Model Zoo
+# 2. Build libUSB without udev to handle NCS2 enumeration
+#
+####
 # Download and Convert OpenVino model
 FROM base_amd64 AS ov-converter
 ARG DEBIAN_FRONTEND

@@ -115,8 +124,6 @@ RUN /bin/mkdir -p '/usr/local/lib' && \
     /usr/bin/install -c -m 644 libusb-1.0.pc '/usr/local/lib/pkgconfig' && \
     ldconfig
 
-
-
 FROM wget AS models
 
 # Get model and labels

@@ -160,7 +167,8 @@ RUN apt-get -qq update \
     libtbb2 libtbb-dev libdc1394-22-dev libopenexr-dev \
     libgstreamer-plugins-base1.0-dev libgstreamer1.0-dev \
     # scipy dependencies
-    gcc gfortran libopenblas-dev liblapack-dev
+    gcc gfortran libopenblas-dev liblapack-dev && \
+    rm -rf /var/lib/apt/lists/*
 
 RUN wget -q https://bootstrap.pypa.io/get-pip.py -O get-pip.py \
     && python3 get-pip.py "pip"

@@ -176,6 +184,10 @@ RUN pip3 install -r requirements.txt
 COPY requirements-wheels.txt /requirements-wheels.txt
 RUN pip3 wheel --wheel-dir=/wheels -r requirements-wheels.txt
 
+# Add TensorRT wheels to another folder
+COPY requirements-tensorrt.txt /requirements-tensorrt.txt
+RUN mkdir -p /trt-wheels && pip3 wheel --wheel-dir=/trt-wheels -r requirements-tensorrt.txt
+
 
 # Collect deps in a single layer
 FROM scratch AS deps-rootfs

@@ -283,7 +295,18 @@ COPY migrations migrations/
 COPY --from=web-build /work/dist/ web/
 
 # Frigate final container
-FROM deps
+FROM deps AS frigate
 
 WORKDIR /opt/frigate/
 COPY --from=rootfs / /
+
+# Frigate w/ TensorRT Support as separate image
+FROM frigate AS frigate-tensorrt
+RUN --mount=type=bind,from=wheels,source=/trt-wheels,target=/deps/trt-wheels \
+    pip3 install -U /deps/trt-wheels/*.whl
+
+# Dev Container w/ TRT
+FROM devcontainer AS devcontainer-trt
+
+RUN --mount=type=bind,from=wheels,source=/trt-wheels,target=/deps/trt-wheels \
+    pip3 install -U /deps/trt-wheels/*.whl

Makefile (+11 -6)

@@ -10,22 +10,27 @@ version:
 	echo 'VERSION = "$(VERSION)-$(COMMIT_HASH)"' > frigate/version.py
 
 local: version
-	docker buildx build --tag frigate:latest --load .
+	docker buildx build --target=frigate --tag frigate:latest --load .
+
+local-trt: version
+	docker buildx build --target=frigate-tensorrt --tag frigate:latest-tensorrt --load .
 
 amd64:
-	docker buildx build --platform linux/amd64 --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
+	docker buildx build --platform linux/amd64 --target=frigate --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
+	docker buildx build --platform linux/amd64 --target=frigate-tensorrt --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH)-tensorrt .
 
 arm64:
-	docker buildx build --platform linux/arm64 --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
+	docker buildx build --platform linux/arm64 --target=frigate --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
 
 armv7:
-	docker buildx build --platform linux/arm/v7 --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
+	docker buildx build --platform linux/arm/v7 --target=frigate --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
 
 build: version amd64 arm64 armv7
-	docker buildx build --platform linux/arm/v7,linux/arm64/v8,linux/amd64 --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
+	docker buildx build --platform linux/arm/v7,linux/arm64/v8,linux/amd64 --target=frigate --tag $(IMAGE_REPO):$(VERSION)-$(COMMIT_HASH) .
 
 push: build
-	docker buildx build --push --platform linux/arm/v7,linux/arm64/v8,linux/amd64 --tag $(IMAGE_REPO):${GITHUB_REF_NAME}-$(COMMIT_HASH) .
+	docker buildx build --push --platform linux/arm/v7,linux/arm64/v8,linux/amd64 --target=frigate --tag $(IMAGE_REPO):${GITHUB_REF_NAME}-$(COMMIT_HASH) .
+	docker buildx build --push --platform linux/amd64 --target=frigate-tensorrt --tag $(IMAGE_REPO):${GITHUB_REF_NAME}-$(COMMIT_HASH)-tensorrt .
 
 run: local
 	docker run --rm --publish=5000:5000 --volume=${PWD}/config/config.yml:/config/config.yml frigate:latest
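The new `local-trt` target mirrors the existing `local` target but builds the `frigate-tensorrt` stage. A minimal sketch of building and running it locally (the run flags are illustrative and assume a generated `trt-models` folder as described in the docs below):

```bash
# Build the TensorRT variant locally; produces the frigate:latest-tensorrt tag
make local-trt

# Run it with GPU access (config and model paths are illustrative)
docker run --rm --gpus=all \
  --publish=5000:5000 \
  --volume=${PWD}/config/config.yml:/config/config.yml \
  --volume=${PWD}/trt-models:/trt-models \
  frigate:latest-tensorrt
```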

docker-compose.yml (+10)

@@ -11,7 +11,15 @@ services:
     shm_size: "256mb"
     build:
       context: .
+      # Use target devcontainer-trt for TensorRT dev
       target: devcontainer
+    deploy:
+      resources:
+        reservations:
+          devices:
+            - driver: nvidia
+              count: 1
+              capabilities: [gpu]
     devices:
       - /dev/bus/usb:/dev/bus/usb
       # - /dev/dri:/dev/dri # for intel hwaccel, needs to be updated for your hardware

@@ -21,6 +29,8 @@ services:
       - /etc/localtime:/etc/localtime:ro
       - ./config/config.yml:/config/config.yml:ro
       - ./debug:/media/frigate
+      # Create the trt-models folder using the documented method of generating TRT models
+      # - ./debug/trt-models:/trt-models
       - /dev/bus/usb:/dev/bus/usb
   mqtt:
     container_name: mqtt
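The `deploy` block added above follows the standard Compose syntax for NVIDIA GPU reservations, so the same pattern can be reused when running a published `-tensorrt` image outside the devcontainer. A hedged sketch (the image tag and volume paths are illustrative, not part of this commit):

```yaml
services:
  frigate:
    # Illustrative tag; use a published -tensorrt image
    image: ghcr.io/blakeblackshear/frigate:<version>-tensorrt
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: 1
              capabilities: [gpu]
    volumes:
      # Generated TRT models, see docs/docs/configuration/detectors.md below
      - ./trt-models:/trt-models
      - ./config/config.yml:/config/config.yml:ro
```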

docker/tensorrt_models.sh (+37)

@@ -0,0 +1,37 @@
+#!/bin/bash
+
+set -euxo pipefail
+
+CUDA_HOME=/usr/local/cuda
+LD_LIBRARY_PATH=${LD_LIBRARY_PATH}:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64
+OUTPUT_FOLDER=/tensorrt_models
+echo "Generating the following TRT Models: ${YOLO_MODELS:="yolov4-tiny-288,yolov4-tiny-416,yolov7-tiny-416"}"
+
+# Create output folder
+mkdir -p ${OUTPUT_FOLDER}
+
+# Install packages
+pip install --upgrade pip && pip install onnx==1.9.0 protobuf==3.20.3
+
+# Clone tensorrt_demos repo
+git clone --depth 1 https://github.com/yeahme49/tensorrt_demos.git /tensorrt_demos
+
+# Build libyolo
+cd /tensorrt_demos/plugins && make all
+cp libyolo_layer.so ${OUTPUT_FOLDER}/libyolo_layer.so
+
+# Download yolo weights
+cd /tensorrt_demos/yolo && ./download_yolo.sh
+
+# Build trt engine
+cd /tensorrt_demos/yolo
+
+for model in ${YOLO_MODELS//,/ }
+do
+    python3 yolo_to_onnx.py -m ${model}
+    python3 onnx_to_tensorrt.py -m ${model}
+    cp /tensorrt_demos/yolo/${model}.trt ${OUTPUT_FOLDER}/${model}.trt;
+done
+
+# Download Labelmap
+wget -q https://github.com/openvinotoolkit/open_model_zoo/raw/master/data/dataset_classes/coco_91cl.txt -O ${OUTPUT_FOLDER}/coco_91cl.txt
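When the script finishes, the mounted output folder should contain the yolo plugin library, one `.trt` engine per requested model, and the labelmap. A quick check against the defaults above (the host folder name matches the volume mount used in the docs below):

```bash
ls -l trt-models/
# expected for the default YOLO_MODELS list:
#   libyolo_layer.so
#   yolov4-tiny-288.trt
#   yolov4-tiny-416.trt
#   yolov7-tiny-416.trt
#   coco_91cl.txt
```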

docs/docs/configuration/detectors.md (+91 -6)

@@ -3,11 +3,10 @@ id: detectors
 title: Detectors
 ---
 
-Frigate provides the following builtin detector types: `cpu`, `edgetpu`, and `openvino`. By default, Frigate will use a single CPU detector. Other detectors may require additional configuration as described below. When using multiple detectors they will run in dedicated processes, but pull from a common queue of detection requests from across all cameras.
-
-**Note**: There is not yet support for Nvidia GPUs to perform object detection with tensorflow. It can be used for ffmpeg decoding, but not object detection.
+Frigate provides the following builtin detector types: `cpu`, `edgetpu`, `openvino`, and `tensorrt`. By default, Frigate will use a single CPU detector. Other detectors may require additional configuration as described below. When using multiple detectors they will run in dedicated processes, but pull from a common queue of detection requests from across all cameras.
 
 ## CPU Detector (not recommended)
+
 The CPU detector type runs a TensorFlow Lite model utilizing the CPU without hardware acceleration. It is recommended to use a hardware accelerated detector type instead for better performance. To configure a CPU based detector, set the `"type"` attribute to `"cpu"`.
 
 The number of threads used by the interpreter can be specified using the `"num_threads"` attribute, and defaults to `3.`

@@ -60,6 +59,7 @@
 ```
 
 ### Native Coral (Dev Board)
+
 _warning: may have [compatibility issues](https://github.com/blakeblackshear/frigate/issues/1706) after `v0.9.x`_
 
 ```yaml

@@ -99,7 +99,7 @@ The OpenVINO detector type runs an OpenVINO IR model on Intel CPU, GPU and VPU h
 
 The OpenVINO device to be used is specified using the `"device"` attribute according to the naming conventions in the [Device Documentation](https://docs.openvino.ai/latest/openvino_docs_OV_UG_Working_with_devices.html). Other supported devices could be `AUTO`, `CPU`, `GPU`, `MYRIAD`, etc. If not specified, the default OpenVINO device will be selected by the `AUTO` plugin.
 
-OpenVINO is supported on 6th Gen Intel platforms (Skylake) and newer. A supported Intel platform is required to use the `GPU` device with OpenVINO. The `MYRIAD` device may be run on any platform, including Arm devices. For detailed system requirements, see [OpenVINO System Requirements](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/system-requirements.html)
+OpenVINO is supported on 6th Gen Intel platforms (Skylake) and newer. A supported Intel platform is required to use the `GPU` device with OpenVINO. The `MYRIAD` device may be run on any platform, including Arm devices. For detailed system requirements, see [OpenVINO System Requirements](https://www.intel.com/content/www/us/en/developer/tools/openvino-toolkit/system-requirements.html)
 
 An OpenVINO model is provided in the container at `/openvino-model/ssdlite_mobilenet_v2.xml` and is used by this detector type by default. The model comes from Intel's Open Model Zoo [SSDLite MobileNet V2](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/public/ssdlite_mobilenet_v2) and is converted to an FP16 precision IR model. Use the model configuration shown below when using the OpenVINO detector.
 

@@ -121,7 +121,7 @@ model:
 
 ### Intel NCS2 VPU and Myriad X Setup
 
-Intel produces a neural net inference accelleration chip called Myriad X. This chip was sold in their Neural Compute Stick 2 (NCS2) which has been discontinued. If intending to use the MYRIAD device for accelleration, additional setup is required to pass through the USB device. The host needs a udev rule installed to handle the NCS2 device.
+Intel produces a neural net inference accelleration chip called Myriad X. This chip was sold in their Neural Compute Stick 2 (NCS2) which has been discontinued. If intending to use the MYRIAD device for accelleration, additional setup is required to pass through the USB device. The host needs a udev rule installed to handle the NCS2 device.
 
 ```bash
 sudo usermod -a -G users "$(whoami)"

@@ -139,11 +139,96 @@ Additionally, the Frigate docker container needs to run with the following confi
 ```bash
 --device-cgroup-rule='c 189:\* rmw' -v /dev/bus/usb:/dev/bus/usb
 ```
+
 or in your compose file:
 
 ```yml
 device_cgroup_rules:
-  - 'c 189:* rmw'
+  - "c 189:* rmw"
 volumes:
   - /dev/bus/usb:/dev/bus/usb
 ```
+
+## NVidia TensorRT Detector
+
+NVidia GPUs may be used for object detection using the TensorRT libraries. Due to the size of the additional libraries, this detector is only provided in images with the `-tensorrt` tag suffix. This detector is designed to work with Yolo models for object detection.
+
+### Minimum Hardware Support
+
+The TensorRT detector uses the 11.x series of CUDA libraries which have minor version compatibility. The minimum driver version on the host system must be `>=450.80.02`. Also the GPU must support a Compute Capability of `5.0` or greater. This generally correlates to a Maxwell-era GPU or newer, check the NVIDIA GPU Compute Capability table linked below.
+
+> **TODO:** NVidia claims support on compute 3.5 and 3.7, but marks it as deprecated. This would have some, but not all, Kepler GPUs as possibly working. This needs testing before making any claims of support.
+
+There are improved capabilities in newer GPU architectures that TensorRT can benefit from, such as INT8 operations and Tensor cores. The features compatible with your hardware will be optimized when the model is converted to a trt file. Currently the script provided for generating the model provides a switch to enable/disable FP16 operations. If you wish to use newer features such as INT8 optimization, more work is required.
+
+#### Compatibility References:
+
+[NVIDIA TensorRT Support Matrix](https://docs.nvidia.com/deeplearning/tensorrt/archives/tensorrt-841/support-matrix/index.html)
+
+[NVIDIA CUDA Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/index.html)
+
+[NVIDIA GPU Compute Capability](https://developer.nvidia.com/cuda-gpus)
+
+### Generate Models
+
+The models used for TensorRT must be preprocessed on the same hardware platform that they will run on. This means that each user must run additional setup to generate these model files for the TensorRT library. A script is provided that will build several common models.
+
+To generate the model files, create a new folder to save the models, download the script, and launch a docker container that will run the script.
+
+```bash
+mkdir trt-models
+wget https://raw.githubusercontent.com/blakeblackshear/frigate/nvidia-detector/docker/tensorrt_models.sh
+chmod +x tensorrt_models.sh
+docker run --gpus=all --rm -it -v `pwd`/trt-models:/tensorrt_models -v `pwd`/tensorrt_models.sh:/tensorrt_models.sh nvcr.io/nvidia/tensorrt:22.07-py3 /tensorrt_models.sh
+```
+
+The `trt-models` folder can then be mapped into your frigate container as `trt-models` and the models referenced from the config.
+
+If your GPU does not support FP16 operations, you can pass the environment variable `-e USE_FP16=False` to the `docker run` command to disable it.
+
+Specific models can be selected by passing an environment variable to the `docker run` command. Use the form `-e YOLO_MODELS=yolov4-416,yolov4-tiny-416` to select one or more model names. The models available are shown below.
+
+```
+yolov3-288
+yolov3-416
+yolov3-608
+yolov3-spp-288
+yolov3-spp-416
+yolov3-spp-608
+yolov3-tiny-288
+yolov3-tiny-416
+yolov4-288
+yolov4-416
+yolov4-608
+yolov4-csp-256
+yolov4-csp-512
+yolov4-p5-448
+yolov4-p5-896
+yolov4-tiny-288
+yolov4-tiny-416
+yolov4x-mish-320
+yolov4x-mish-640
+yolov7-tiny-288
+yolov7-tiny-416
+```
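Combining the two documented environment variables above, a single-model build with FP16 disabled could look like this (a hedged variation of the documented command, not an additional interface of the script):

```bash
docker run --gpus=all --rm -it \
  -e YOLO_MODELS=yolov7-tiny-416 -e USE_FP16=False \
  -v `pwd`/trt-models:/tensorrt_models \
  -v `pwd`/tensorrt_models.sh:/tensorrt_models.sh \
  nvcr.io/nvidia/tensorrt:22.07-py3 /tensorrt_models.sh
```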
+
+### Configuration Parameters
+
+The TensorRT detector can be selected by specifying `tensorrt` as the model type. The GPU will need to be passed through to the docker container using the same methods described in the [Hardware Acceleration](hardware_acceleration.md#nvidia-gpu) section. If you pass through multiple GPUs, you can select which GPU is used for a detector with the `device` configuration parameter. The `device` parameter is an integer value of the GPU index, as shown by `nvidia-smi` within the container.
+
+The TensorRT detector uses `.trt` model files that are located in `/trt-models/` by default. The model file path and dimensions used will depend on which model you have generated.
+
+```yaml
+detectors:
+  tensorrt:
+    type: tensorrt
+    device: 0 # This is the default, select the first GPU
+
+model:
+  path: /trt-models/yolov7-tiny-416.trt
+  labelmap_path: /trt-models/coco_91cl.txt
+  input_tensor: nchw
+  input_pixel_format: rgb
+  width: 416
+  height: 416
+```
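Since `device` selects a GPU by index, a host with more than one passed-through GPU could run one detector per card. A hedged sketch extending the documented config (the detector names and second index are illustrative):

```yaml
detectors:
  tensorrt_gpu0:
    type: tensorrt
    device: 0
  tensorrt_gpu1:
    type: tensorrt
    device: 1

model:
  path: /trt-models/yolov7-tiny-416.trt
  labelmap_path: /trt-models/coco_91cl.txt
  input_tensor: nchw
  input_pixel_format: rgb
  width: 416
  height: 416
```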
