Incorrect output shape: engine has 1 OUTPUT kFLOAT output 25200x6 instead of separate outputs boxes 25200x4 (kFLOAT), classes 25200x1 (kINT32) and scores 25200x1 (kFLOAT) #595

Open
Sanelembuli98 opened this issue Nov 26, 2024 · 13 comments

Comments

@Sanelembuli98

The issue described involves TensorRT and YOLOv5 ONNX model export. Specifically:

Incorrect TensorRT Output Shape:
The expected output for a YOLOv5 ONNX model is:

3x640x640 for input.
25200x4, 25200x1, 25200x1 for boxes, scores, and classes respectively.
Instead, the output is concatenated as a single tensor: 25200x6.
This indicates the exported ONNX model is not properly handling YOLOv5's expected separate outputs (boxes, scores, classes).
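
For reference, the 25200 in these shapes follows from the 640x640 input size. A minimal sketch, assuming the standard YOLOv5 P5 head (strides 8, 16 and 32, with 3 anchors per grid cell) and an 80-class model; the helper name num_predictions is hypothetical and not part of the repo:

```python
def num_predictions(height, width, strides=(8, 16, 32), anchors_per_cell=3):
    """Count candidate boxes produced by a YOLOv5-style head (assumed layout)."""
    return sum(anchors_per_cell * (height // s) * (width // s) for s in strides)


rows = num_predictions(640, 640)
print(rows)  # 25200

# Columns per row (assuming 80 classes):
raw_cols = 4 + 1 + 80     # box (4) + objectness (1) + class scores (80)
concat_cols = 4 + 1 + 1   # box (4) + best score (1) + best class id (1)
print(raw_cols, concat_cols)  # 85 6
```

So the single 25200x6 tensor and the three 25200x4, 25200x1, 25200x1 tensors carry the same information; only the packing differs.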

@marcoslucianops
Owner

There's no issue. This repo now expects a single output. The export_yoloV5.py file will generate the correct ONNX file for this repo.

@Sanelembuli98
Author

It doesn't work with my DeepStream pipelines (it is not compatible with them). Is it possible to point me to the commit that still outputs 3x640x640, 25200x4, 25200x1, 25200x1?

@Sanelembuli98
Author

I did modify export_yoloV5 to produce 3x640x640 for the input and 25200x4, 25200x1, 25200x1 for the outputs:

import os
import onnx
import torch
import torch.nn as nn
from models.experimental import attempt_load

class DeepStreamOutput(nn.Module):
    def __init__(self, split_outputs=False):
        """
        :param split_outputs: If True, the forward method will return separate tensors
                              (boxes, scores, classes) instead of a concatenated tensor.
        """
        super().__init__()
        self.split_outputs = split_outputs

    # def forward(self, x):
    #     x = x[0]  # Assuming the first tensor in the list is the output
    #     boxes = x[:, :, :4]
    #     convert_matrix = torch.tensor(
    #         [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]],
    #         dtype=boxes.dtype, device=boxes.device
    #     )
    #     boxes @= convert_matrix
    #     objectness = x[:, :, 4:5]
    #     scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)
    #     scores *= objectness

    #     if self.split_outputs:
    #         return boxes, scores, labels  # Return separate tensors
    #         #return boxes, scores, labels.to(boxes.dtype)  # Return separate tensors
    #         #return torch.stack([boxes, scores, labels.to(boxes.dtype)], dim=0)
    #     else:
    #         return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)  # Return concatenated tensor

    def forward(self, x):
        x = x[0]  # Assuming the first tensor in the list is the output
        boxes = x[:, :, :4]

        # Define the conversion matrix for box adjustments
        # NOTE: this is the transpose of the matrix in the commented-out version above
        convert_matrix = torch.tensor(
            [[1, 0, -0.5, 0],
             [0, 1, 0, -0.5],
             [1, 0, 0.5, 0],
             [0, 1, 0, 0.5]],
            dtype=boxes.dtype, device=boxes.device
        )
        boxes @= convert_matrix
        # Extract objectness scores and class scores
        objectness = x[:, :, 4:5]  # Shape: [batch_size, num_boxes, 1]
        scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)  # Shape: [batch_size, num_boxes, 1]

        # Adjust scores using objectness
        scores = scores * objectness

        print(f"Boxes shape: {boxes.shape}, Scores shape: {scores.shape}, Labels shape: {labels.shape}")

        if self.split_outputs:
            # Directly return boxes, scores, and labels for separate outputs
            return boxes, scores, labels
        else:
            # Convert labels to the dtype of `boxes` so they can be concatenated
            labels = labels.to(dtype=boxes.dtype)
            return torch.cat([boxes, scores, labels], dim=-1)  # Concatenate along the last dimension

def parse_tensorrt_output(output):
    """
    Parses TensorRT output, assuming the shape is [25200, 6].
    :param output: Tensor with shape [25200, 6].
    :return: boxes, confidence scores, and class predictions as separate tensors.
    """
    boxes = output[:, :4]
    confidence = output[:, 4:5]
    classes = output[:, 5:]
    return boxes, confidence, classes


def yolov5_export(weights, device, inplace=True, fuse=True):
    model = attempt_load(weights, device=device, inplace=inplace, fuse=fuse)
    model.eval()
    for k, m in model.named_modules():
        if m.__class__.__name__ == 'Detect':
            m.inplace = False
            m.dynamic = False
            m.export = True
    return model


def suppress_warnings():
    import warnings
    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
    warnings.filterwarnings('ignore', category=UserWarning)
    warnings.filterwarnings('ignore', category=DeprecationWarning)
    warnings.filterwarnings('ignore', category=FutureWarning)
    warnings.filterwarnings('ignore', category=ResourceWarning)


def validate_onnx(onnx_file):
    import onnx
    try:
        print('Validating the ONNX model...')
        if isinstance(onnx_file, str):  # If it's a file path, load it
            model = onnx.load(onnx_file)
        else:  # Assume it's already a ModelProto object
            model = onnx_file
        onnx.checker.check_model(model)  # Validate the loaded model
        print('ONNX model validation: Success!')
    except Exception as e:  # Catch all validation-related errors
        print(f'ONNX model validation failed: {e}')

def main(args):
    suppress_warnings()

    print(f'\nStarting: {args.weights}')

    print('Opening YOLOv5 model')
    device = torch.device('cpu')
    model = yolov5_export(args.weights, device)

    if len(model.names.keys()) > 0:
        print('Creating labels.txt file')
        with open('labels.txt', 'w', encoding='utf-8') as f:
            for name in model.names.values():
                f.write(f'{name}\n')

    # Set split_outputs to True if you need separate outputs
    model = nn.Sequential(model, DeepStreamOutput(split_outputs=args.split_outputs))

    img_size = args.size * 2 if len(args.size) == 1 else args.size
    if img_size == [640, 640] and args.p6:
        img_size = [1280] * 2

    onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
    onnx_output_file = f'{args.weights}.onnx'

    dynamic_axes = {'input': {0: 'batch'}}
    if args.split_outputs:
        dynamic_axes.update({'boxes': {0: 'batch'}, 'scores': {0: 'batch'}, 'classes': {0: 'batch'}})
        output_names = ['boxes', 'scores', 'classes']
    else:
        dynamic_axes.update({'output': {0: 'batch'}})
        output_names = ['output']

    print('Exporting the model to ONNX')
    torch.onnx.export(
        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
        input_names=['input'], output_names=output_names,
        dynamic_axes=dynamic_axes if args.dynamic else None
    )

    if args.simplify:
        print('Simplifying the ONNX model')
        import onnxslim
        model_onnx = onnx.load(onnx_output_file)
        model_onnx = onnxslim.slim(model_onnx)
        validate_onnx(model_onnx)
        onnx.save(model_onnx, onnx_output_file)

    print(f'Done: {onnx_output_file}\n')

def parse_args():
    import argparse
    parser = argparse.ArgumentParser(description='DeepStream YOLOv5 conversion')
    parser.add_argument('-w', '--weights', required=True, type=str, help='Input weights (.pt) file path (required)')
    parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
    parser.add_argument('--p6', action='store_true', help='P6 model')
    parser.add_argument('--opset', type=int, default=17, help='ONNX opset version')
    parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
    parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
    parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
    parser.add_argument('--split_outputs', action='store_true', help='Split outputs into boxes, scores, and classes')
    args = parser.parse_args()

    if not os.path.isfile(args.weights):
        raise SystemExit('Invalid weights file')
    if args.dynamic and args.batch > 1:
        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
    return args


if __name__ == '__main__':
    args = parse_args()
    main(args)

But it doesn't run inference.

@marcoslucianops
Copy link
Owner

Use https://github.com/marcoslucianops/DeepStream-Yolo/blob/master/utils/export_yoloV5.py with the following changes.

Remove

        convert_matrix = torch.tensor(
            [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]], dtype=boxes.dtype, device=boxes.device
        )
        boxes @= convert_matrix

Change

return torch.cat([boxes, scores, labels.to(boxes.dtype)], dim=-1)

to

return boxes, scores, labels.to(boxes.dtype)

Change

    dynamic_axes = {
        'input': {
            0: 'batch'
        },
        'output': {
            0: 'batch'
        }
    }

to

    dynamic_axes = {
        'input': {
            0: 'batch'
        },
        'boxes': {
            0: 'batch'
        },
        'scores': {
            0: 'batch'
        },
        'labels': {
            0: 'batch'
        }
    }

Change

    torch.onnx.export(
        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
        input_names=['input'], output_names=['output'], dynamic_axes=dynamic_axes if args.dynamic else None
    )

to

    torch.onnx.export(
        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
        input_names=['input'], output_names=['boxes', 'scores', 'labels'],
        dynamic_axes=dynamic_axes if args.dynamic else None
    )
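
The first change above drops the center-to-corner conversion, so the exported boxes stay in the (cx, cy, w, h) layout that the YOLOv5 head produces. As an illustration of what the removed matrix computes, here is a plain-Python sketch (no torch; the helper name row_times_matrix and the sample box are made up for the example). It also shows that the transposed matrix in the modified script posted earlier in this thread gives a different result:

```python
def row_times_matrix(row, matrix):
    """Compute row @ matrix for a 1x4 row vector and a 4x4 matrix."""
    return [sum(row[i] * matrix[i][j] for i in range(4)) for j in range(4)]


# Matrix from the upstream export_yoloV5.py, as quoted above.
upstream = [[1, 0, 1, 0], [0, 1, 0, 1], [-0.5, 0, 0.5, 0], [0, -0.5, 0, 0.5]]
# Matrix from the modified script (the transpose of the upstream one).
modified = [[1, 0, -0.5, 0], [0, 1, 0, -0.5], [1, 0, 0.5, 0], [0, 1, 0, 0.5]]

box = [100.0, 50.0, 20.0, 10.0]  # hypothetical box: cx, cy, w, h

corners = row_times_matrix(box, upstream)
transposed_result = row_times_matrix(box, modified)
print(corners)            # [90.0, 45.0, 110.0, 55.0] -> x1, y1, x2, y2
print(transposed_result)  # [120.0, 60.0, -40.0, -20.0] -> not box corners
```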

@valentin-phoenix

valentin-phoenix commented Nov 26, 2024

I’m encountering a similar behavior with the export of YOLOv8 models. Could you provide a more detailed explanation of why these changes have been implemented in the export process?

@Sanelembuli98
Author

https://github.com/marcoslucianops/DeepStream-Yolo/blob/master/utils/export_yoloV5.py still results in:

WARNING: [TRT]: The getMaxBatchSize() function should not be used with an engine built from a network created with NetworkDefinitionCreationFlag::kEXPLICIT_BATCH flag. This function will always return 1.
INFO: [Implicit Engine Info]: layers num: 2
0 INPUT kFLOAT input 3x640x640
1 OUTPUT kFLOAT output 25200x6

I need:

layers num: 4
0 INPUT kFLOAT input 3x640x640
1 OUTPUT kFLOAT boxes 25200x4
2 OUTPUT kINT32 classes 25200x1
3 OUTPUT kFLOAT scores 25200x1

@marcoslucianops
Owner

@Sanelembuli98 Please apply to the file the changes I described in #595 (comment)

@valentin-phoenix TensorRT sometimes doesn't preserve the order of the output layers, which causes a bug when reading the outputs (this mostly affects Paddle models).
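
One way for code that consumes several outputs to be robust to this reordering is to look the layers up by name rather than by position. A minimal sketch, assuming the layer listing format printed in the engine info log earlier in this thread; the function name parse_layer_info is hypothetical, and this is not how nvdsinfer_custom_impl_Yolo is implemented:

```python
import re

# One layer per line: "<index> <INPUT|OUTPUT> <dtype> <name> <dims>"
LAYER_RE = re.compile(r"^(\d+)\s+(INPUT|OUTPUT)\s+(\w+)\s+(\w+)\s+([\dx]+)$")


def parse_layer_info(log_text):
    """Map each layer name to its kind, dtype and shape, ignoring the order."""
    layers = {}
    for line in log_text.splitlines():
        match = LAYER_RE.match(line.strip())
        if match:
            _, kind, dtype, name, dims = match.groups()
            layers[name] = {
                "kind": kind,
                "dtype": dtype,
                "shape": tuple(int(d) for d in dims.split("x")),
            }
    return layers


log = """
0 INPUT kFLOAT input 3x640x640
1 OUTPUT kFLOAT boxes 25200x4
2 OUTPUT kINT32 classes 25200x1
3 OUTPUT kFLOAT scores 25200x1
"""
layers = parse_layer_info(log)
print(layers["scores"]["shape"])   # (25200, 1)
print(layers["classes"]["dtype"])  # kINT32
```

With a single concatenated output there is only one layer to find, which sidesteps the ordering question entirely.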

@Sanelembuli98
Author

import os
import onnx
import torch
import torch.nn as nn
from models.experimental import attempt_load

class DeepStreamOutput(nn.Module):
    def __init__(self, split_outputs=False):
        """
        :param split_outputs: If True, the forward method will return separate tensors
                              (boxes, scores, classes) instead of a concatenated tensor.
        """
        super().__init__()
        self.split_outputs = split_outputs

    def forward(self, x):
        x = x[0]  # Assuming the first tensor in the list is the output
        boxes = x[:, :, :4]  # Conversion matrix removed; boxes are passed through unchanged

        # Extract objectness scores and class scores
        objectness = x[:, :, 4:5]  # Shape: [batch_size, num_boxes, 1]
        scores, labels = torch.max(x[:, :, 5:], dim=-1, keepdim=True)  # Shape: [batch_size, num_boxes, 1]

        # Adjust scores using objectness
        scores = scores * objectness

        print(f"Boxes shape: {boxes.shape}, Scores shape: {scores.shape}, Labels shape: {labels.shape}")

        if self.split_outputs:
            # Directly return boxes, scores, and labels for separate outputs
            return boxes, scores, labels
        else:
            # Cast labels to the dtype of `boxes`; three separate tensors are still returned
            return boxes, scores, labels.to(boxes.dtype)

def parse_tensorrt_output(output):
    """
    Parses TensorRT output, assuming the shape is [25200, 6].
    :param output: Tensor with shape [25200, 6].
    :return: boxes, confidence scores, and class predictions as separate tensors.
    """
    boxes = output[:, :4]
    confidence = output[:, 4:5]
    classes = output[:, 5:]
    return boxes, confidence, classes


def yolov5_export(weights, device, inplace=True, fuse=True):
    model = attempt_load(weights, device=device, inplace=inplace, fuse=fuse)
    model.eval()
    for k, m in model.named_modules():
        if m.__class__.__name__ == 'Detect':
            m.inplace = False
            m.dynamic = False
            m.export = True
    return model


def suppress_warnings():
    import warnings
    warnings.filterwarnings('ignore', category=torch.jit.TracerWarning)
    warnings.filterwarnings('ignore', category=UserWarning)
    warnings.filterwarnings('ignore', category=DeprecationWarning)
    warnings.filterwarnings('ignore', category=FutureWarning)
    warnings.filterwarnings('ignore', category=ResourceWarning)


def validate_onnx(onnx_file):
    import onnx
    try:
        print('Validating the ONNX model...')
        if isinstance(onnx_file, str):  # If it's a file path, load it
            model = onnx.load(onnx_file)
        else:  # Assume it's already a ModelProto object
            model = onnx_file
        onnx.checker.check_model(model)  # Validate the loaded model
        print('ONNX model validation: Success!')
    except Exception as e:  # Catch all validation-related errors
        print(f'ONNX model validation failed: {e}')

def main(args):
    suppress_warnings()

    print(f'\nStarting: {args.weights}')

    print('Opening YOLOv5 model')
    device = torch.device('cpu')
    model = yolov5_export(args.weights, device)

    if len(model.names.keys()) > 0:
        print('Creating labels.txt file')
        with open('labels.txt', 'w', encoding='utf-8') as f:
            for name in model.names.values():
                f.write(f'{name}\n')

    # Set split_outputs to True if you need separate outputs
    model = nn.Sequential(model, DeepStreamOutput(split_outputs=args.split_outputs))

    img_size = args.size * 2 if len(args.size) == 1 else args.size
    if img_size == [640, 640] and args.p6:
        img_size = [1280] * 2

    onnx_input_im = torch.zeros(args.batch, 3, *img_size).to(device)
    onnx_output_file = f'{args.weights}.onnx'

    dynamic_axes = {
        'input': {
            0: 'batch'
        },
        'boxes': {
            0: 'batch'
        },
        'scores': {
            0: 'batch'
        },
        'labels': {
            0: 'batch'
        }
    }
    if args.split_outputs:
        dynamic_axes.update({'boxes': {0: 'batch'}, 'scores': {0: 'batch'}, 'classes': {0: 'batch'}})
        output_names = ['boxes', 'scores', 'classes']
    else:
        dynamic_axes.update({'output': {0: 'batch'}})
        output_names = ['output']

    print('Exporting the model to ONNX')
    torch.onnx.export(
        model, onnx_input_im, onnx_output_file, verbose=False, opset_version=args.opset, do_constant_folding=True,
        input_names=['input'], output_names=['boxes', 'scores', 'labels'],
        dynamic_axes=dynamic_axes if args.dynamic else None
    )

    if args.simplify:
        print('Simplifying the ONNX model')
        import onnxslim
        model_onnx = onnx.load(onnx_output_file)
        model_onnx = onnxslim.slim(model_onnx)
        validate_onnx(model_onnx)
        onnx.save(model_onnx, onnx_output_file)

    print(f'Done: {onnx_output_file}\n')

def parse_args():
    import argparse
    parser = argparse.ArgumentParser(description='DeepStream YOLOv5 conversion')
    parser.add_argument('-w', '--weights', required=True, type=str, help='Input weights (.pt) file path (required)')
    parser.add_argument('-s', '--size', nargs='+', type=int, default=[640], help='Inference size [H,W] (default [640])')
    parser.add_argument('--p6', action='store_true', help='P6 model')
    parser.add_argument('--opset', type=int, default=17, help='ONNX opset version')
    parser.add_argument('--simplify', action='store_true', help='ONNX simplify model')
    parser.add_argument('--dynamic', action='store_true', help='Dynamic batch-size')
    parser.add_argument('--batch', type=int, default=1, help='Static batch-size')
    parser.add_argument('--split_outputs', action='store_true', help='Split outputs into boxes, scores, and classes')
    args = parser.parse_args()

    if not os.path.isfile(args.weights):
        raise SystemExit('Invalid weights file')
    if args.dynamic and args.batch > 1:
        raise SystemExit('Cannot set dynamic batch-size and static batch-size at same time')
    return args


if __name__ == '__main__':
    args = parse_args()
    main(args)

I implemented the changes in the script above and, as before, I am able to generate the .engine file without any errors. The issue is that when the resulting .engine file is run in deepstream-app and in custom test apps, it does not detect the classes it is supposed to.

@marcoslucianops
Owner

Are you using the updated nvdsinfer_custom_impl_Yolo or the old plugin?

@Sanelembuli98
Author

Most likely the old one. I will update it and give feedback. Thank you.

@Sanelembuli98
Author

So I updated to the current nvdsinfer_custom_impl_Yolo. I am able to generate the engine file and run deepstream-app -c deepstream_app_config.txt, but it does not detect anything on the video/stream. For my weights I used the default yolov5s.pt.

@Sanelembuli98
Author

Additional info: I am able to run detection/inference using yolov5s.pt, but after I export it to ONNX I can't. This is the command I am running to export:

python3 export_yoloV5.py --weights yolov5s.pt --size 640 --simplify --dynamic --opset 17

@Sanelembuli98
Author

Maybe I am failing to describe my issue properly. Please try testing with deepstream-app and let me know whether you are able to run detection/inference, as I am not. Hopefully you can reproduce my issue or point me in the right direction.
