Image classification example

The example script infer.py is for benchmarking and validating inference on image classification models using TF-TRT in TensorFlow 2.0.

You can enable TF-TRT integration by passing the --use_tftrt flag to the script. This causes the script to apply TensorRT inference optimization to speed up execution for portions of the model's graph where supported, and to fall back on native TensorFlow for layers and operations which are not supported. See Accelerating Inference In TensorFlow With TensorRT User Guide for more information.

When TF-TRT is enabled, the precision option (--precision) controls the precision mode. The default is float32 (--precision fp32); float16 (--precision fp16) and int8 (--precision int8) can improve performance further.

int8 mode requires a calibration step, which runs automatically, but you must also specify the directory that holds the calibration dataset with --calib_data_dir /imagenet_validation_data. You can use the same data for both calibration and validation.
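The relationship between --precision int8 and --calib_data_dir can be illustrated with a small argument check. This is a minimal sketch and not the parser used by infer.py: the function name parse_args, the upper-casing of the precision value, and the error message are assumptions made for illustration.

```python
import argparse


def parse_args(argv):
    """Parse a small subset of the flags and enforce the int8 calibration rule."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--use_tftrt", action="store_true")
    # Assumption: accept fp32/FP32 alike by upper-casing before the choices check.
    parser.add_argument("--precision", type=str.upper, default="FP32",
                        choices=["FP32", "FP16", "INT8"])
    parser.add_argument("--calib_data_dir", default=None)
    args = parser.parse_args(argv)

    # int8 needs calibration data, so refuse to continue without a directory.
    if args.use_tftrt and args.precision == "INT8" and not args.calib_data_dir:
        parser.error("--precision int8 requires --calib_data_dir")
    return args
```

With this sketch, passing --use_tftrt --precision int8 without --calib_data_dir exits with an argparse error, while adding --calib_data_dir /imagenet_validation_data parses normally.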

Data

The example script supports either a real dataset (TFRecord format for validation mode, JPEG format for benchmark mode) or autogenerated synthetic data (with the --use_synthetic_data flag). If you use TFRecord files, the script assumes that they are named according to the pattern validation-*-of-00128.
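Before launching a run, it can help to confirm that the data directory actually contains files matching the expected pattern. The helper below is a hypothetical sketch (find_validation_shards is not part of infer.py) that lists files named validation-*-of-00128 in a directory.

```python
import glob
import os


def find_validation_shards(data_dir, num_shards=128):
    """Return the sorted paths in data_dir that match validation-*-of-00128."""
    # %05d pads the shard count to five digits, giving "00128" by default.
    pattern = os.path.join(data_dir, "validation-*-of-%05d" % num_shards)
    return sorted(glob.glob(pattern))
```

An empty result suggests that --data_dir points at the wrong directory or that the files use a different naming scheme.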

To download and process the ImageNet data, you can:

  • Use the scripts provided in the nvidia-examples/build_imagenet_data directory in the NVIDIA TensorFlow Docker container workspace directory. Follow the README file in that directory for instructions on how to use these scripts.

or

  • Use the scripts provided by TF Slim in the tensorflow/models repository at research/slim. Consult the README file under research/slim for instructions on how to use these scripts. Note that these scripts download both the training and validation sets, while this example requires only the validation set.

Also see Obtaining The ImageNet Data for more information.

If the above procedure fails in TF 2.x, build the dataset with TF 1.x (or a container that comes with TF 1.x), and then use that dataset in TF 2.x.

Usage

The main Python script is infer.py. Assuming that the ImageNet validation data are located under /data/imagenet/train-val-tfrecord, you can evaluate inference with TF-TRT integration using the pre-trained ResNet V1.5 50 model as follows:

python infer.py \
    --data_dir /data/imagenet/train-val-tfrecord \
    --calib_data_dir /data/imagenet/train-val-tfrecord \
    --saved_model_dir /models/resnet_v1.5_50_saved_model/ \
    --model resnet_v1.5_50_tfv2 \
    --num_warmup_iterations 50 \
    --num_calib_batches 128 \
    --display_every 10 \
    --use_tftrt \
    --optimize_offline \
    --precision INT8 \
    --max_workspace_size $((2**32)) \
    --batch_size 128

Where:

--saved_model_dir: Input model to optimize with TF-TRT

--model: Name of the model (only used to get the right preprocessing)

--data_dir: Path to the ImageNet TFRecord validation files.

--use_tftrt: Convert the graph to a TensorRT graph.

--precision: Precision mode to use, in this case INT8.

--mode: Which mode to use (validation or benchmark). Validation mode runs inference and measures both accuracy and performance; benchmark mode measures performance only.

Run with --help to see all available options.
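As an illustration of what a performance measurement involves, the sketch below derives throughput and mean latency from per-iteration timings while discarding the warm-up iterations (compare --num_warmup_iterations and --batch_size above). The function name and the list of timings are hypothetical, and infer.py may compute its reported numbers differently.

```python
def summarize_throughput(step_seconds, batch_size, num_warmup_iterations):
    """Return (images_per_second, mean_latency_ms), ignoring warm-up steps."""
    # Drop the first iterations, which typically include one-time setup costs.
    timed = step_seconds[num_warmup_iterations:]
    if not timed:
        raise ValueError("no timed iterations left after warm-up")
    total = sum(timed)
    images_per_second = batch_size * len(timed) / total
    mean_latency_ms = 1000.0 * total / len(timed)
    return images_per_second, mean_latency_ms
```

For example, with timings of 1.0 s, 0.5 s and 0.5 s, a batch size of 128 and one warm-up iteration, the two timed steps process 256 images in 1.0 s.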

Ready-to-use scripts:

1. resnet_v1_50

# Tensorflow - FP32
./models/resnet_v1_50/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/resnet_v1_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/resnet_v1_50/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/resnet_v1_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/resnet_v1_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/resnet_v1_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

2. resnet_v1.5_50_tfv2

# Tensorflow - FP32
./models/resnet_v1.5_50_tfv2/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/resnet_v1.5_50_tfv2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/resnet_v1.5_50_tfv2/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/resnet_v1.5_50_tfv2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/resnet_v1.5_50_tfv2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/resnet_v1.5_50_tfv2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

3. resnet_v2_50

# Tensorflow - FP32
./models/resnet_v2_50/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/resnet_v2_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/resnet_v2_50/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/resnet_v2_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/resnet_v2_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/resnet_v2_50/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

4. inception_v3

# Tensorflow - FP32
./models/inception_v3/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/inception_v3/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/inception_v3/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/inception_v3/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/inception_v3/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/inception_v3/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

5. inception_v4

# Tensorflow - FP32
./models/inception_v4/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/inception_v4/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/inception_v4/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/inception_v4/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/inception_v4/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/inception_v4/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

6. mobilenet_v1

# Tensorflow - FP32
./models/mobilenet_v1/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/mobilenet_v1/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/mobilenet_v1/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/mobilenet_v1/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/mobilenet_v1/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/mobilenet_v1/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

7. mobilenet_v2

# Tensorflow - FP32
./models/mobilenet_v2/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/mobilenet_v2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/mobilenet_v2/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/mobilenet_v2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/mobilenet_v2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/mobilenet_v2/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

8. nasnet_large

# Tensorflow - FP32
./models/nasnet_large/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/nasnet_large/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/nasnet_large/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/nasnet_large/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/nasnet_large/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/nasnet_large/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

9. nasnet_mobile

# Tensorflow - FP32
./models/nasnet_mobile/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/nasnet_mobile/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/nasnet_mobile/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/nasnet_mobile/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/nasnet_mobile/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/nasnet_mobile/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

10. vgg_16

# Tensorflow - FP32
./models/vgg_16/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/vgg_16/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/vgg_16/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/vgg_16/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/vgg_16/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/vgg_16/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"

11. vgg_19

# Tensorflow - FP32
./models/vgg_19/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# Tensorflow - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/vgg_19/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models

# TF-TRT - FP32
./models/vgg_19/run_inference.sh \
    --use_xla --no_tf32 \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - TF32 (identical to FP32 on an NVIDIA Turing GPU or older)
./models/vgg_19/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP32"

# TF-TRT - FP16
./models/vgg_19/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="FP16"

# TF-TRT - INT8
./models/vgg_19/run_inference.sh \
    --use_xla \
    --data_dir=/data/imagenet/train-val-tfrecord --input_saved_model_dir=/models \
    --use_tftrt --precision="INT8"
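The eleven sections above repeat the same six configurations for each model, so the full set of commands can also be generated programmatically. The following is a minimal sketch that only builds the argument lists shown above; the names MODELS, CONFIGS and build_commands are hypothetical, and the data and model paths default to the ones used in this README.

```python
import shlex

MODELS = [
    "resnet_v1_50", "resnet_v1.5_50_tfv2", "resnet_v2_50",
    "inception_v3", "inception_v4", "mobilenet_v1", "mobilenet_v2",
    "nasnet_large", "nasnet_mobile", "vgg_16", "vgg_19",
]

# (label, flags before the data flags, flags after the data flags),
# mirroring the six configurations listed for each model.
CONFIGS = [
    ("Tensorflow - FP32", ["--use_xla", "--no_tf32"], []),
    ("Tensorflow - TF32", ["--use_xla"], []),
    ("TF-TRT - FP32", ["--use_xla", "--no_tf32"], ["--use_tftrt", "--precision=FP32"]),
    ("TF-TRT - TF32", ["--use_xla"], ["--use_tftrt", "--precision=FP32"]),
    ("TF-TRT - FP16", ["--use_xla"], ["--use_tftrt", "--precision=FP16"]),
    ("TF-TRT - INT8", ["--use_xla"], ["--use_tftrt", "--precision=INT8"]),
]


def build_commands(data_dir="/data/imagenet/train-val-tfrecord", model_dir="/models"):
    """Return a list of (model, label, argv) covering every model and configuration."""
    commands = []
    for model in MODELS:
        for label, head, tail in CONFIGS:
            argv = ["./models/%s/run_inference.sh" % model]
            argv += head
            argv += ["--data_dir=%s" % data_dir,
                     "--input_saved_model_dir=%s" % model_dir]
            argv += tail
            commands.append((model, label, argv))
    return commands


if __name__ == "__main__":
    # Print each command as a shell-quoted line instead of executing it.
    for model, label, argv in build_commands():
        print("# %s: %s" % (model, label))
        print(shlex.join(argv))
```

Each argv list can be passed to subprocess.run(argv, check=True) to execute the corresponding script; printing first makes it easy to review the commands before running a long sweep.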