Problem description:
When converting a multi-input model, I added the --optimizePrefer 2 option ("Convert MatMul Convolution use shared const B inputs"). This improves inference speed but enlarges the model, so I wanted to use quantization to reduce the model size and lower memory usage.
Conversion command:
./MNNConvert -f ONNX --modelFile small.onnx --MNNModel small_opt2.mnn --bizCode biz --optimizePrefer 2 --forTraining
Conversion log:
The device support i8sdot:0, support fp16:0, support i8mm: 0
Start to Convert Other Model Format To MNN Model..., target version: 2.9
[14:33:55] /home/lixw/MNN/tools/converter/source/onnx/onnxConverter.cpp:46: ONNX Model ir version: 4
[14:33:55] /home/lixw/MNN/tools/converter/source/onnx/onnxConverter.cpp:47: ONNX Model opset version: 9
Start to Optimize the MNN Net...
[14:33:55] /home/lixw/MNN/tools/converter/source/optimizer/PostConverter.cpp:225: convert model for training, reserve BatchNorm and Dropout
Convert MatMul Convolution use shared const B inputs, may increase the model size
[14:33:55] /home/lixw/MNN/tools/converter/source/optimizer/PostConverter.cpp:225: convert model for training, reserve BatchNorm and Dropout
inputTensors : [ cahce_c0, cahce_spec, feat_spec, cahce_erb, feat_erb, h0, h2, h1, spec, ]
outputTensors: [ cahce_c0o, cahce_erbo, cahce_speco, df_coefs, ho0, ho1, ho2, speco, ]
Converted Success!
Quantizing the multi-input model produced by this conversion step gives a quantized model whose inference results do not align between the x86 CPU and ARM CPU backends, and some output values are NaN.
The ARM CPU backend seems to mishandle the convolutions that use shared constant B inputs and/or the dequantization step (the x86 CPU results appear to be correct).
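The quantization command itself is not shown here; assuming MNN's offline quantization tool (quantized.out) was used, a minimal sketch of the invocation would be (small_opt2_quant.mnn and quant_config.json are placeholder names, and the calibration config contents are omitted):
./quantized.out small_opt2.mnn small_opt2_quant.mnn quant_config.json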
Tested separately:
(1) Conversion with only the --optimizePrefer 2 option (converting MatMul to Convolution with shared constant B inputs), without quantization: x86 CPU and ARM CPU results are essentially aligned.
(2) Quantization without the --optimizePrefer 2 option: x86 CPU and ARM CPU results are also essentially aligned.
So the incorrect ARM CPU inference only appears when the two are combined, i.e. --optimizePrefer 2 plus quantization. Is this a bug?
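For clarity, the two control tests above correspond roughly to the following commands (a sketch derived from the conversion command given earlier; small.mnn, small_quant.mnn and quant_config.json are placeholder names):
# (1) --optimizePrefer 2 only, no quantization
./MNNConvert -f ONNX --modelFile small.onnx --MNNModel small_opt2.mnn --bizCode biz --optimizePrefer 2 --forTraining
# (2) no --optimizePrefer 2, followed by quantization
./MNNConvert -f ONNX --modelFile small.onnx --MNNModel small.mnn --bizCode biz --forTraining
./quantized.out small.mnn small_quant.mnn quant_config.json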
I have provided a simplified model and a record that reproduces the issue:
Google Drive: https://drive.google.com/file/d/1Z1Fy9ClfhhZcrRJXsrPbRkwmK22wtnAE/view?usp=sharing
Baidu: https://pan.baidu.com/s/1X4h32GqvkY1NgkiZuPxZ5g?pwd=umxf (extraction code: umxf)
Platform:
x86 CPU and ARMv8 CPU
GitHub version:
2.9.0
Build configuration:
ARM CPU MNN 2.9.0 build:
set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR aarch64)
set(CMAKE_TRY_COMPILE_TARGET_TYPE "STATIC_LIBRARY")
set(CMAKE_BUILD_TYPE Release CACHE STRING "build release" FORCE)
set(CMAKE_INSTALL_PREFIX package CACHE STRING "install path" FORCE)
set(MNN_ARM82 OFF CACHE STRING "build arm82" FORCE)
set(MNN_FORBID_MULTI_THREAD OFF CACHE STRING "build single thread" FORCE)
set(MNN_USE_THREAD_POOL ON CACHE STRING "build thread pool" FORCE)
set(MNN_SUPPORT_BF16 OFF CACHE STRING "build bf16" FORCE)
set(MNN_BUILD_SHARED_LIBS OFF CACHE STRING "build static" FORCE)
set(MNN_SUPPORT_TFLITE_QUAN OFF CACHE STRING "" FORCE)
set(MNN_SEP_BUILD OFF CACHE STRING "build sep" FORCE)
set(MNN_USE_SSE OFF CACHE STRING "use sse" FORCE)
set(MNN_USE_LOGCAT OFF CACHE STRING "use logcat" FORCE)
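Assuming the set(...) lines above are saved in a CMake toolchain file (the name aarch64.toolchain.cmake below is a placeholder), the ARM build would be configured roughly like this:
cmake .. -DCMAKE_TOOLCHAIN_FILE=../aarch64.toolchain.cmake
make -j4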
x86 CPU MNN 2.9.0 build:
cmake .. -DMNN_BUILD_SHARED_LIBS=OFF -DMNN_BUILD_TOOLS=OFF && make -j4
Build logs:
ARM CPU MNN 2.9.0 build log (build succeeded):
-- The C compiler identification is GNU 7.5.0
-- The CXX compiler identification is GNU 7.5.0
-- The ASM compiler identification is GNU
-- Found assembler: /usr/bin/cc
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - done
-- Check for working CXX compiler: /usr/bin/c++ - skipped
-- Detecting CXX compile features
-- Detecting CXX compile features - done
-- Use Threadpool, forbid openmp
-- >>>>>>>>>>>>>
-- MNN BUILD INFO:
-- System: Linux
-- Processor: aarch64
-- Version: 2.9.0
-- Metal: OFF
-- OpenCL: OFF
-- OpenGL: OFF
-- Vulkan: OFF
-- ARM82: OFF
-- oneDNN: OFF
-- TensorRT: OFF
-- CoreML: OFF
-- NNAPI: OFF
-- CUDA: OFF
-- OpenMP: OFF
-- BF16: OFF
-- ThreadPool: ON
-- Hidden: TRUE
-- Build Path: /sdcard/lixiangwei/MNN/build_static
-- CUDA PROFILE: OFF
-- Enabling AArch64 Assemblies
-- Looking for pthread.h
-- Looking for pthread.h - found
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD
-- Performing Test CMAKE_HAVE_LIBC_PTHREAD - Success
-- Found Threads: TRUE
-- Configuring done
-- Generating done
-- Build files have been written to: /sdcard/lixiangwei/MNN/build_static
x86 CPU MNN 2.9.0 build log (build succeeded):
-- Use Threadpool, forbid openmp
-- >>>>>>>>>>>>>
-- MNN BUILD INFO:
-- System: Linux
-- Processor: x86_64
-- Version: 2.9.0
-- Metal: OFF
-- OpenCL: OFF
-- OpenGL: OFF
-- Vulkan: OFF
-- ARM82: OFF
-- oneDNN: OFF
-- TensorRT: OFF
-- CoreML: OFF
-- NNAPI: OFF
-- CUDA: OFF
-- OpenMP: OFF
-- BF16: OFF
-- ThreadPool: ON
-- Hidden: TRUE
-- Build Path: /home/lixw/MNN/build_static
-- CUDA PROFILE: OFF
-- WIN_USE_ASM:
-- x86_64: Open SSE
-- MNN_AVX512:OFF
-- Configuring done
-- Generating done
-- Build files have been written to: /home/lixw/MNN/build_static