[Bug] Mask2former tensorrt inference time is abnormal, very slow compared to pytorch inference #2816

Open
2 of 3 tasks
huangshihong opened this issue Aug 21, 2024 · 0 comments

Checklist

  • 1. I have searched related issues but cannot get the expected help.
  • 2. I have read the FAQ documentation but cannot get the expected help.
  • 3. The bug has not been fixed in the latest version.

Describe the bug

I converted the model to TensorRT on both Linux and Windows. In both cases TensorRT inference is much slower than PyTorch inference: PyTorch takes about 0.09 s per image, while TensorRT takes about 0.3 s.

Reproduction

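pytorch_infer:
import time

import cv2
import numpy as np
import torch
from mmseg.apis import inference_model, init_model

# config and checkpoint: paths to the model config file and the weights file
model_mmseg = init_model(config, checkpoint, device=torch.device('cuda:0'))
img = cv2.imread("F:/AL_model/p4_20240806152513412_r1_c2.png")
for i in range(0, 10):
    start = time.time()
    outputtensor = inference_model(model_mmseg, img)
    # Copying the mask to the CPU waits for the GPU work to finish
    pred_mask = outputtensor.pred_sem_seg.data.squeeze(
        0).detach().cpu().numpy().astype(np.uint8)
    end = time.time()
    print(f"The time is {end - start}")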
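When comparing the two loops, the first iterations usually include one-time costs (CUDA context creation, kernel selection, memory allocation), so they are commonly excluded from the reported time. A minimal sketch of such a timing helper follows. `benchmark` and its parameters are hypothetical names, not part of mmseg or TensorRT, and the callable passed in is assumed to block until the GPU work has finished (as the `.cpu()` call above does).

```python
import statistics
import time


def benchmark(fn, warmup=3, runs=10):
    """Time fn() after discarding warm-up calls; returns per-run seconds.

    Hypothetical helper: fn must block until its work (including any GPU
    work) is complete, otherwise only the launch overhead is measured.
    """
    for _ in range(warmup):
        fn()  # warm-up runs are executed but not timed
    timings = []
    for _ in range(runs):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return timings


if __name__ == "__main__":
    # Stand-in workload; replace with e.g.
    # lambda: inference_model(model_mmseg, img)
    timings = benchmark(lambda: sum(range(100_000)))
    print(f"median {statistics.median(timings):.6f}s, min {min(timings):.6f}s")
```

Reporting the median rather than a single run makes the PyTorch and TensorRT numbers less sensitive to an occasional slow iteration.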

TRT infer:
import numpy as np
import cv2
import pycuda.driver as cuda
import pycuda.autoinit
import tensorrt as trt
import time
import ctypes

# Define the input and output dimensions

# Load the mmdeploy custom TensorRT ops; replace with your own path
mmdeployop_dll_path = r"E:\workshop\mmdeploy-1.3.1\mmdeploy\lib/mmdeploy_tensorrt_ops.dll"
ctypes.CDLL(mmdeployop_dll_path)
engine_file = r'./end2end.engine'
logger = trt.Logger(trt.Logger.VERBOSE)
with open(engine_file, 'rb') as f, trt.Runtime(logger) as runtime:
    trt.init_libnvinfer_plugins(logger, "mmdeploy")
    engine = runtime.deserialize_cuda_engine(f.read())
imgsz = 1024
context = engine.create_execution_context()
input_shape = (1, 3, imgsz, imgsz)
output_shape = (1, 1, imgsz, imgsz)
input_size = np.prod(input_shape) * np.dtype(np.float32).itemsize
output_size = np.prod(output_shape) * np.dtype(np.int32).itemsize
print("Execution succeeded")

# Preprocess: resize, convert to float32 CHW, normalize per channel
image = cv2.imread(image_file)  # image_file: path to the test image
image = cv2.resize(image, (imgsz, imgsz))
image = image.astype(np.float32)
# Note: cv2.imread returns BGR; these are the usual RGB-order ImageNet statistics
mean = [123.675, 116.28, 103.53]
std = [58.395, 57.12, 57.375]
image = np.transpose(image, (2, 0, 1))
for i in range(image.shape[0]):  # Iterate over channels
    image[i] = (image[i] - mean[i]) / std[i]
input_tensor = np.expand_dims(image, axis=0)
print('img_tensor.shape', input_tensor.shape)
input_tensor = np.ascontiguousarray(input_tensor)

# Allocate the device buffers and the stream once
d_input = cuda.mem_alloc(input_tensor.nbytes)
output_data = np.zeros(output_shape, dtype=np.int32)
stream = cuda.Stream()
d_output = cuda.mem_alloc(output_data.nbytes)

context.set_input_shape("input", input_shape)
for i in range(0, 10):
    cuda.memcpy_htod_async(d_input, input_tensor, stream)
    stream.synchronize()
    stream_handle = stream.handle
    # Only the engine execution is timed; the copies are outside the timer
    start = time.time()
    context.execute_async_v2(bindings=[int(d_input), int(d_output)], stream_handle=stream_handle)
    stream.synchronize()
    end = time.time()
    cuda.memcpy_dtoh_async(output_data, d_output, stream)
    stream.synchronize()
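The loop above times only the engine execution. To see where the time goes, it can help to time the host-to-device copy, the execution, and the device-to-host copy separately. The following is a minimal sketch under that assumption. `StageTimer` is a hypothetical helper, not part of TensorRT or PyCUDA, and the stage bodies are placeholders for the calls shown in the comments.

```python
import time
from collections import defaultdict
from contextlib import contextmanager


class StageTimer:
    """Accumulates wall-clock time per named stage (hypothetical helper)."""

    def __init__(self):
        self.totals = defaultdict(float)
        self.counts = defaultdict(int)

    @contextmanager
    def stage(self, name):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the elapsed time even if the stage body raises
            self.totals[name] += time.perf_counter() - start
            self.counts[name] += 1

    def mean(self, name):
        return self.totals[name] / self.counts[name]


if __name__ == "__main__":
    timer = StageTimer()
    for _ in range(10):
        with timer.stage("h2d"):
            pass  # e.g. cuda.memcpy_htod_async(...); stream.synchronize()
        with timer.stage("execute"):
            pass  # e.g. context.execute_async_v2(...); stream.synchronize()
        with timer.stage("d2h"):
            pass  # e.g. cuda.memcpy_dtoh_async(...); stream.synchronize()
    for name in ("h2d", "execute", "d2h"):
        print(f"{name}: {timer.mean(name) * 1e3:.3f} ms per iteration")
```

Each stage has to end with a synchronization so that the asynchronous work is actually finished before the timer stops.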

Environment

GPU: RTX3060
Package                       Version
----------------------------- ------------
absl-py                       2.1.0
accelerate                    0.20.3
addict                        2.4.0
aenum                         3.1.15
aiohttp                       3.8.6
aiosignal                     1.3.1
albumentations                1.3.1
aliyun-python-sdk-core        2.15.0
aliyun-python-sdk-kms         2.16.2
altgraph                      0.17.4
appdirs                       1.4.4
async-timeout                 4.0.3
asynctest                     0.13.0
attrs                         24.2.0
backcall                      0.2.0
certifi                       2022.12.7
cffi                          1.15.1
charset-normalizer            3.3.2
click                         8.1.7
colorama                      0.4.6
coloredlogs                   15.0.1
comm                          0.1.4
crcmod                        1.7
cryptography                  42.0.5
cycler                        0.11.0
Cython                        3.0.10
datasets                      2.13.2
decorator                     5.1.1
diffusers                     0.21.4
dill                          0.3.6
einops                        0.6.1
filelock                      3.12.2
flatbuffers                   24.3.25
fonttools                     4.38.0
frozenlist                    1.3.3
fsspec                        2023.1.0
ftfy                          6.1.1
grpcio                        1.62.2
grpcio-tools                  1.62.2
huggingface-hub               0.16.4
humanfriendly                 10.0
idna                          3.7
imageio                       2.31.2
importlib-metadata            6.7.0
instaboostfast                0.1.2
ipython                       7.34.0
ipywidgets                    8.1.3
jedi                          0.19.1
jmespath                      0.10.0
joblib                        1.3.2
jupyterlab_widgets            3.0.11
kiwisolver                    1.4.5
Mako                          1.2.4
Markdown                      3.4.4
markdown-it-py                2.2.0
MarkupSafe                    2.1.5
matplotlib                    3.5.3
matplotlib-inline             0.1.6
mdurl                         0.1.2
mmcv-full                     1.5.0
mmdeploy                      0.14.0
mmdet                         2.22.0
mmengine                      0.10.3
mmsegmentation                0.20.2
model-index                   0.1.11
modelscope                    1.17.1
mpmath                        1.3.0
multidict                     6.0.5
multiprocess                  0.70.14
MultiScaleDeformableAttention 1.0
networkx                      2.6.3
numpy                         1.21.6
onnx                          1.14.1
onnx-graphsurgeon             0.5.2
onnxruntime                   1.14.1
onnxruntime-gpu               1.14.1
opencv-python                 4.10.0.84
opencv-python-headless        4.10.0.84
opendatalab                   0.0.10
openmim                       0.3.9
openxlab                      0.0.10
ordered-set                   4.1.0
oss2                          2.17.0
packaging                     24.0
pandas                        1.3.5
parso                         0.8.4
pefile                        2023.2.7
pickleshare                   0.7.5
Pillow                        9.5.0
pip                           24.0
platformdirs                  4.0.0
prettytable                   3.7.0
prometheus-client             0.17.1
prompt_toolkit                3.0.47
protobuf                      3.20.2
psutil                        6.0.0
psycopg2                      2.9.9
pyarrow                       12.0.1
pycocotools                   2.0.7
pycparser                     2.21
pycryptodome                  3.20.0
pycuda                        2022.1
Pygments                      2.17.2
pyinstall                     0.1.4
pyinstaller                   5.13.2
pyinstaller-hooks-contrib     2024.7
pyparsing                     3.1.2
pyreadline                    2.1
python-dateutil               2.9.0.post0
pytools                       2022.1.12
pytz                          2023.4
PyWavelets                    1.3.0
pywin32                       306
pywin32-ctypes                0.2.2
PyYAML                        6.0.1
qudida                        0.0.4
regex                         2023.12.25
requests                      2.28.2
rich                          13.7.1
safetensors                   0.4.4
scikit-image                  0.19.3
scikit-learn                  1.0.2
scipy                         1.7.3
setuptools                    60.2.0
six                           1.16.0
some-package                  0.1
sympy                         1.10.1
tabulate                      0.9.0
tensorrt                      8.6.1
termcolor                     2.3.0
terminaltables                3.1.10
threadpoolctl                 3.1.0
tifffile                      2021.11.2
timm                          0.4.12
tokenizers                    0.13.3
toml                          0.10.2
tomli                         2.0.1
torch                         1.11.0+cu113
torchaudio                    0.11.0+cu113
torchsummary                  1.5.1
torchvision                   0.12.0+cu113
tqdm                          4.65.2
traitlets                     5.9.0
transformers                  4.30.2
typing_extensions             4.7.1
urllib3                       1.26.18
wcwidth                       0.2.13
wheel                         0.38.4
widgetsnbextension            4.0.11
wincertstore                  0.2
xxhash                        3.4.1
yapf                          0.40.2
yarl                          1.9.4
zipp                          3.15.0
PATH=D:\soft\anaconda_setup\envs\mmseg-vit;D:\soft\anaconda_setup\envs\mmseg-vit\Library\mingw-w64\bin;D:\soft\anaconda_setup\envs\mmseg-vit\Library\usr\bin;D:\soft\anaconda_setup\envs\mmseg-vit\Library\bin;D:\soft\anaconda_setup\envs\mmseg-vit\Scripts;D:\soft\anaconda_setup\envs\mmseg-vit\bin;D:\soft\anaconda_setup\condabin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\libnvvp;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\lib;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\include;D:\soft\anaconda_setup;D:\soft\anaconda_setup\Library\mingw-w64\bin;D:\soft\anaconda_setup\Library\usr\bin;D:\soft\anaconda_setup\Library\bin;D:\soft\anaconda_setup\Scripts;C:\Program Files (x86)\Common Files\MVS\Runtime\Win32_i86;C:\Program Files (x86)\Common Files\MVS\Runtime\Win64_x64;C:\Windows\system32;C:\Windows;C:\Windows\System32\Wbem;C:\Windows\System32\WindowsPowerShell\v1.0;C:\Windows\System32\OpenSSH;C:\Users\Admin\anaconda3;C:\Users\Admin\anaconda3\Scripts;C:\Users\Admin\anaconda3\Library\bin;C:\Users\Admin\anaconda3\Library\mingw-w64\bin;C:\Program Files\Git\cmd;C:\Program Files (x86)\NVIDIA Corporation\PhysX\Common;D:\soft\dvc\DVC (Data Version Control);C:\Program Files\nodejs;C:\Program Files\NVIDIA Corporation\NVIDIA NvDLISR;C:\Program Files\Git\cmd;C:\Program Files\Git\usr\bin;C:\Program Files\Git\mingw64\bin;D:\soft\opencv-4.8.0\opencv\build\x64\vc16\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\bin;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.7\libnvvp;D:\soft\graphviz\bin;C:\Program Files\NVIDIA Corporation\Nsight Compute 2021.2.2;C:\Program Files\CMake\bin;D:\soft\onnxruntime-win-x64-gpu-1.15.1\lib;D:\soft\TensorRT-8.6.1.6.Windows10.x86_64.cuda-11.8\TensorRT-8.6.1.6\lib;C:\Users\Admin\AppData\Local\Microsoft\WindowsApps;D:\soft\pycharm\PyCharm Community Edition 2021.1.1\bin;D:\soft\vscode\Microsoft VS Code\bin;C:\Users\Admin\AppData\Roaming\npm;D:\soft\clion\CLion 2022.3.2\bin;D:\soft\pycharm2023\PyCharm 2023.2.3\bin

Error traceback

No response
