This repository has been archived by the owner on Nov 17, 2023. It is now read-only.

retinaface model to onnx #15892

Open
Zheweiqiu opened this issue Aug 14, 2019 · 26 comments

Comments

@Zheweiqiu

Note: Providing complete information in the most concise form is the best way to get help. This issue template serves as the checklist for essential information to most of the technical issues and bug reports. For non-technical issues and feature requests, feel free to present the information in what you believe is the best form.

For Q & A and discussion, please start a discussion thread at https://discuss.mxnet.io

Description

I get an error when exporting the RetinaFace model to ONNX, although the export worked when I tried it with the InsightFace model.

Environment info (Required)

----------Python Info----------
Version : 3.7.3
Compiler : GCC 7.3.0
Build : ('default', 'Mar 27 2019 22:11:17')
Arch : ('64bit', '')
------------Pip Info-----------
Version : 19.1.1
Directory : /home/qiuzhewei/anaconda3/lib/python3.7/site-packages/pip
----------MXNet Info-----------
Version : 1.5.0
Directory : /home/qiuzhewei/anaconda3/lib/python3.7/site-packages/mxnet
Commit Hash : 75a9e18
Library : ['/home/qiuzhewei/anaconda3/lib/python3.7/site-packages/mxnet/libmxnet.so']
Build features:
✖ CUDA
✖ CUDNN
✖ NCCL
✖ CUDA_RTC
✖ TENSORRT
✔ CPU_SSE
✔ CPU_SSE2
✔ CPU_SSE3
✔ CPU_SSE4_1
✔ CPU_SSE4_2
✖ CPU_SSE4A
✔ CPU_AVX
✖ CPU_AVX2
✖ OPENMP
✖ SSE
✔ F16C
✖ JEMALLOC
✖ BLAS_OPEN
✖ BLAS_ATLAS
✖ BLAS_MKL
✖ BLAS_APPLE
✔ LAPACK
✖ MKLDNN
✔ OPENCV
✖ CAFFE
✖ PROFILER
✔ DIST_KVSTORE
✖ CXX14
✖ INT64_TENSOR_SIZE
✔ SIGNAL_HANDLER
✖ DEBUG
----------System Info----------
Platform : Linux-4.15.0-55-generic-x86_64-with-debian-stretch-sid
system : Linux
node : chaowei-SYS-7048GR-TR
release : 4.15.0-55-generic
version : #60~16.04.2-Ubuntu SMP Thu Jul 4 09:03:09 UTC 2019
----------Hardware Info----------
machine : x86_64
processor : x86_64
Architecture: x86_64
CPU op-mode(s): 32-bit, 64-bit
Byte Order: Little Endian
CPU(s): 12
On-line CPU(s) list: 0-11
Thread(s) per core: 1
Core(s) per socket: 6
Socket(s): 2
NUMA node(s): 2
Vendor ID: GenuineIntel
CPU family: 6
Model: 79
Model name: Intel(R) Xeon(R) CPU E5-2603 v4 @ 1.70GHz
Stepping: 1
CPU MHz: 1316.511
CPU max MHz: 1700.0000
CPU min MHz: 1200.0000
BogoMIPS: 3403.17
Virtualization: VT-x
L1d cache: 32K
L1i cache: 32K
L2 cache: 256K
L3 cache: 15360K
NUMA node0 CPU(s): 0-5
NUMA node1 CPU(s): 6-11
Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx pdpe1gb rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx smx est tm2 ssse3 sdbg fma cx16 xtpr pdcm pcid dca sse4_1 sse4_2 x2apic movbe popcnt tsc_deadline_timer aes xsave avx f16c rdrand lahf_lm abm 3dnowprefetch cpuid_fault epb cat_l3 cdp_l3 invpcid_single pti intel_ppin ssbd ibrs ibpb stibp tpr_shadow vnmi flexpriority ept vpid fsgsbase tsc_adjust bmi1 hle avx2 smep bmi2 erms invpcid rtm cqm rdt_a rdseed adx smap intel_pt xsaveopt cqm_llc cqm_occup_llc cqm_mbm_total cqm_mbm_local dtherm arat pln pts md_clear flush_l1d
----------Network Test----------
Setting timeout: 10
Timing for MXNet: https://github.com/apache/incubator-mxnet, DNS: 0.0016 sec, LOAD: 1.2854 sec.
Timing for Gluon Tutorial(en): http://gluon.mxnet.io, DNS: 0.3885 sec, LOAD: 1.5392 sec.
Timing for Gluon Tutorial(cn): https://zh.gluon.ai, DNS: 0.0077 sec, LOAD: 2.0701 sec.
Timing for FashionMNIST: https://apache-mxnet.s3-accelerate.dualstack.amazonaws.com/gluon/dataset/fashion-mnist/train-labels-idx1-ubyte.gz, DNS: 9.3760 sec, LOAD: 0.8177 sec.
Timing for PYPI: https://pypi.python.org/pypi/pip, DNS: 0.0007 sec, LOAD: 0.9090 sec.
Timing for Conda: https://repo.continuum.io/pkgs/free/, DNS: 0.1989 sec, LOAD: 0.4561 sec.

Package used (Python/R/Scala/Julia):
I'm using Python 3.7

For Scala user, please provide:

  1. Java version: (java -version)
  2. Maven version: (mvn -version)
  3. Scala runtime if applicable: (scala -version)

For R user, please provide R sessionInfo():

Build info (Required if built from source)

Compiler (gcc/clang/mingw/visual studio):

MXNet commit hash:
fatal: Not a git repository (or any of the parent directories): .git

Build config:
(Paste the content of config.mk, or the build command.)

Error Message:

[19:34:40] src/nnvm/legacy_json_util.cc:209: Loading symbol saved by previous version v1.3.0. Attempting to upgrade...
Traceback (most recent call last):
File "mxnet2onnx.py", line 10, in <module>
converted_model_path = onnx_mxnet.export_model(sym, params, [input_shape], np.float32, onnx_file)
File "/home/qiuzhewei/anaconda3/lib/python3.7/site-packages/mxnet/contrib/onnx/mx2onnx/export_model.py", line 80, in export_model
sym_obj, params_obj = load_module(sym, params)
File "/home/qiuzhewei/anaconda3/lib/python3.7/site-packages/mxnet/contrib/onnx/mx2onnx/_export_helper.py", line 58, in load_module
sym, arg_params, aux_params = mx.model.load_checkpoint(model_name, num_epochs)
File "/home/qiuzhewei/anaconda3/lib/python3.7/site-packages/mxnet/model.py", line 450, in load_checkpoint
symbol = sym.load('%s-symbol.json' % prefix)
File "/home/qiuzhewei/anaconda3/lib/python3.7/site-packages/mxnet/symbol/symbol.py", line 2728, in load
check_call(_LIB.MXSymbolCreateFromFile(c_str(fname), ctypes.byref(handle)))
File "/home/qiuzhewei/anaconda3/lib/python3.7/site-packages/mxnet/base.py", line 253, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Cannot find argument 'mode', Possible Arguments:

axis : int, optional, default='-1'
The axis along which to compute softmax.
temperature : double or None, optional, default=None
Temperature parameter in softmax
dtype : {None, 'float16', 'float32', 'float64'},optional, default='None'
DType of the output in case this can't be inferred. Defaults to the same as input's dtype if not defined (dtype=None).
, in operator softmax(name="face_rpn_cls_prob_stride32", mode="channel")

Minimum reproducible example

(If you are using your own code, please provide a short script that reproduces the error. Otherwise, please provide link to the existing example.)
I am using the code from http://mxnet.incubator.apache.org/versions/master/tutorials/onnx/export_mxnet_to_onnx.html

Steps to reproduce

(Paste the commands you ran that produced the error.)

1. Just run "python mxnet2onnx.py", where mxnet2onnx.py comes from the website above.

What have you tried to solve it?

  1. Tried to use another mxnet model(Resnet 50) to see if it works
@mxnet-label-bot
Contributor

Hey, this is the MXNet Label Bot.
Thank you for submitting the issue! I will try and suggest some labels so that the appropriate MXNet community members can help resolve it.
Here are my recommended labels: ONNX, Bug

@vandanavk
Contributor

vandanavk commented Aug 15, 2019

I don't see this issue using MXNet master and ONNX v1.3.0. Which ONNX version are you using?
Could you share the symbol and params file of your model?

@Zheweiqiu
Author

onnx version: 1.21.
mxnet version: 1.5.0
The model is downloaded from https://github.com/deepinsight/insightface/tree/master/RetinaFace under "RetinaFace Pretrained Models" section. There is a dropbox link with which you can directly download the model.
Thanks!

@vandanavk
Contributor

Thanks @Zheweiqiu. With the model you shared, on the latest MXNet and ONNX v1.3.0, I hit this error:
AttributeError: No conversion function registered for op type SoftmaxActivation yet.
There is no export support for this operator yet. Please open a feature request.

@Zheweiqiu
Author

Thanks @vandanavk for trying it out. I got this error at first. After some googling, following this answer, I replaced every "SoftmaxActivation" with "softmax" in the *.json file. Then I got the error stated in the question. Do I need to retrain the model using the "softmax" op instead of "SoftmaxActivation"?
Thanks!

@vandanavk
Contributor

SoftmaxActivation has an attribute, mode, that softmax does not accept, so replacing SoftmaxActivation with softmax alone may not solve the issue; equivalent attributes for softmax may have to be specified. To start with, along with replacing SoftmaxActivation with softmax in the JSON, try removing the mode attribute. The softmax operator will then use default values for its attributes. ONNX export of the softmax operator is available.
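The edit described above could be scripted roughly as follows. This is only a sketch: the function name is hypothetical, and it assumes the symbol file stores each node's attributes under "attrs" (or "attr"/"param" in older files). Mapping mode="channel" to axis=1 is also an assumption based on the operator documentation, not something confirmed in this thread; note that the softmax default is axis=-1, so dropping mode without setting axis may change the result.

```python
import json


def softmax_activation_to_softmax(symbol_json: str) -> str:
    """Return symbol JSON with SoftmaxActivation nodes rewritten as softmax."""
    graph = json.loads(symbol_json)
    for node in graph["nodes"]:
        if node.get("op") != "SoftmaxActivation":
            continue
        node["op"] = "softmax"
        # Assumption: attributes live under one of these keys, depending on
        # the MXNet version that saved the file.
        for key in ("attrs", "attr", "param"):
            attrs = node.get(key)
            if not attrs:
                continue
            mode = attrs.pop("mode", None)  # softmax has no 'mode' argument
            if mode == "channel":
                # Assumption: channel mode means softmax over axis 1.
                attrs["axis"] = "1"
            if not attrs:
                del node[key]  # do not leave an empty attribute dict behind
    return json.dumps(graph, indent=2)
```

To use it, read the original *-symbol.json as text, pass it through the function, and write the result to a new file so the original stays untouched.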

@vandanavk
Contributor

Retraining with softmax would be the best solution, since SoftmaxActivation has been deprecated.

@Zheweiqiu
Author

I tried removing the mode attribute from the softmax operator but got the following error:
AttributeError: No conversion function registered for op type UpSampling yet.
I believe I would hit the same problem even if I retrained the model, and I see this issue is still a work in progress.
Thanks for your reply!
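Since the exporter reports only one unsupported operator per run, it may help to list every operator the symbol file uses before attempting the export. A minimal sketch, assuming the usual symbol JSON layout with a top-level "nodes" list (the function name is hypothetical):

```python
import json
from collections import Counter


def count_ops(symbol_json: str) -> Counter:
    """Count how often each operator appears in an MXNet symbol JSON string."""
    graph = json.loads(symbol_json)
    # Nodes with op "null" are inputs and parameters, not operators.
    return Counter(n["op"] for n in graph["nodes"] if n["op"] != "null")
```

The resulting operator names still have to be checked by hand against what the installed exporter supports.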

@yumaofan

I have the same problem. I hope it can be solved soon. Thanks a lot.

@vandanavk
Contributor

Support for the UpSampling operator is currently in review: operator changes in #15811 and ONNX support in #15994. You could pull in these changes and build locally to try them immediately, or watch for these two PRs to be merged.

@Zheweiqiu
Author

MXNet was installed with pip. Do I need to uninstall it and rebuild from source to pick up those changes?

@luan1412167

@vandanavk I made the changes described in your comment #15892 (comment). However, I got this error:
File "mxnet_to_onnx_converter.py", line 36, in <module>
converted_model_path = onnx_mxnet.export_model(sym, params, [input_shape], np.float32, onnx_file)
File "/home/luandd/miniconda3/envs/luandao/lib/python3.7/site-packages/mxnet/contrib/onnx/mx2onnx/export_model.py", line 83, in export_model
verbose=verbose)
File "/home/luandd/miniconda3/envs/luandao/lib/python3.7/site-packages/mxnet/contrib/onnx/mx2onnx/export_onnx.py", line 211, in create_onnx_graph_proto
graph_outputs = MXNetGraph.get_outputs(sym, params, in_shape, output_label)
File "/home/luandd/miniconda3/envs/luandao/lib/python3.7/site-packages/mxnet/contrib/onnx/mx2onnx/export_onnx.py", line 142, in get_outputs
_, out_shapes, _ = sym.infer_shape(**inputs)
File "/home/luandd/miniconda3/envs/luandao/lib/python3.7/site-packages/mxnet/symbol/symbol.py", line 1076, in infer_shape
res = self._infer_shape_impl(False, *args, **kwargs)
File "/home/luandd/miniconda3/envs/luandao/lib/python3.7/site-packages/mxnet/symbol/symbol.py", line 1210, in _infer_shape_impl
ctypes.byref(complete)))
File "/home/luandd/miniconda3/envs/luandao/lib/python3.7/site-packages/mxnet/base.py", line 253, in check_call
raise MXNetError(py_str(_LIB.MXGetLastError()))
mxnet.base.MXNetError: Error in operator face_rpn_cls_prob_stride32: [12:34:58] src/operator/softmax_output.cc:86: Check failed: in_shape->size() == 2U (1 vs. 2) : Input:[data, label]

@Zheweiqiu
Author

@luan1412167 Did you replace SoftmaxActivation with softmax and remove every mode attribute in your .json file?

@luan1412167

luan1412167 commented Oct 15, 2019

@Zheweiqiu
[Screenshot from 2019-10-15 09-52-46]
The conversion succeeded, but I don't know which output to read the result from. The bounding-box values I get are very small:
[[[[5.3017639e-04 5.4040336e-04 5.1597238e-04 ... 4.6603641e-04 5.1257212e-04 4.5140341e-04]
[5.4795144e-04 5.1839353e-04 4.8552474e-04 ... 4.8594383e-04 5.6435599e-04 5.0048833e-04]
[5.1728956e-04 5.0016592e-04 4.6242378e-04 ... 5.1690591e-04 5.4673775e-04 5.2698085e-04]

Can you help me find the right output?

@yumaofan

@luan1412167 You should read the RetinaFace paper carefully, haha.

@Zheweiqiu
Author

@luan1412167 Your output seems to be a feature vector or some intermediate output rather than a bounding box. The name of the output layer is "output" in MXNet, and I don't think it changes during model conversion.
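One way to see which outputs the MXNet graph actually declares is to read the "heads" entry of the symbol file, which lists the graph's output nodes. A minimal sketch (the function name is hypothetical; whether the exported ONNX graph keeps the same names and order is not verified here):

```python
import json


def list_heads(symbol_json: str):
    """Return (name, op) for every output node of an MXNet symbol JSON string."""
    graph = json.loads(symbol_json)
    nodes = graph["nodes"]
    # Each head entry starts with the index of the node that produces it.
    return [(nodes[h[0]]["name"], nodes[h[0]]["op"]) for h in graph["heads"]]
```

Comparing this list with the outputs of the converted model should show whether the values being read are a final output or an intermediate tensor.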

@luan1412167

luan1412167 commented Oct 15, 2019

@Zheweiqiu, @AaronFan1992

```python
import numpy as np
from mxnet.contrib import onnx as onnx_mxnet

sym = '/home/luandd/Downloads/R50-symbol.json'
params = '/home/luandd/Downloads/R50-0000.params'
input_shape = (1, 3, 1920, 1080)  # NCHW: batch, channels, height, width
onnx_file = '/home/luandd/CLionProjects/untitled/retinaface.onnx'

converted_model_path = onnx_mxnet.export_model(sym, params, [input_shape], np.float32, onnx_file, verbose=False)
```
My conversion code is above. What is wrong? Sorry, this is my first time doing this.

@Zheweiqiu
Author

@luan1412167 I am using the same script as yours for the conversion, but I have been busy with other things and have not had time to verify the correctness of the converted model.

@luan1412167

@Zheweiqiu can you share your model with me?

@Zheweiqiu
Author

@luan1412167 I am afraid not. The company is pretty strict about this, and I am not allowed to upload anything to the cloud or send any files via email. Sorry about that.

@luan1412167

@Zheweiqiu thank you. Can you tell me whether the output architecture is right?

@luan1412167

@Zheweiqiu where did you download the RetinaFace MXNet model?

@Chenyangzh

Hello everyone

I am on mxnet==1.5.0 and onnx==1.6.0 and want to convert the RetinaFace MobileNet model to ONNX.

I also ran into the issues mentioned above with the SoftmaxActivation and UpSampling ops. Thanks to the suggestions above, I solved those two problems, but I hit a new one in the Crop op.

The Crop node has no 'out_shape' keyword. I found a similar issue in #14881, but it is not easy to get the specific height and width for that Crop node, because the MXNet Crop op allows a previous node's output to be used as an input of the current node.

There are also some discussions in #9885, but no conclusion.

Does anyone have suggestions? Thanks a lot.
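To see which earlier outputs each Crop node depends on, the symbol file can be inspected directly. This is a rough sketch, assuming each node lists its producers in an "inputs" field whose entries start with the producing node's index; the function name is hypothetical, and it only reports the references, it does not compute the crop size.

```python
import json


def crop_inputs(symbol_json: str) -> dict:
    """Map each Crop node name to the names of the nodes feeding it."""
    graph = json.loads(symbol_json)
    nodes = graph["nodes"]
    result = {}
    for node in nodes:
        if node["op"] == "Crop":
            # The first element of every input entry is the producer's index.
            result[node["name"]] = [nodes[i[0]]["name"] for i in node["inputs"]]
    return result
```

Knowing the referenced nodes, their shapes for a fixed input size could then be looked up separately to fill in the missing height and width.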

@Chenyangzh

I finally solved the issue with the help of cholihao/Retinaface-caffe#4.

@yangshuailc

@Chenyangzh, I also ran into the SoftmaxActivation and UpSampling op issues. Can you share your solution? Thank you.

@mohamadHN93

Hi, can you help me convert this model to ONNX with a dynamic batch size and input shape?
https://drive.google.com/file/d/13_fmDpkD7IyZAP5HWbwAdmAsonFQ34vm/view?usp=sharing
