[Feature Request] Support fp16 for C Predict API #14159

Closed · PapaMadeleine2022 opened this issue on Feb 14, 2019 · 26 comments · Fixed by #15245
@PapaMadeleine2022

Hello, there are some materials about how to train an MXNet model with fp16, but I cannot find how to run batch inference using fp16 through the C++ API. Can you give some advice?

Looking forward to your reply.

@lanking520 (Member)

Hi, are you planning to train in C++? Currently, Python does support float16 if you initialize the dtype accordingly.
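For reference, a minimal sketch of what "initializing the dtype" to float16 looks like for inference from the Python frontend; the checkpoint files, input shape, and GPU context below are hypothetical, so adapt them to your model.

import mxnet as mx

# Load an fp32 checkpoint (hypothetical prefix/epoch) and cast its parameters to fp16.
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)
arg_params = {k: v.astype('float16') for k, v in arg_params.items()}
aux_params = {k: v.astype('float16') for k, v in aux_params.items()}

# Declare the input itself as float16 so type inference keeps the graph in fp16.
mod = mx.mod.Module(symbol=sym, context=mx.gpu(0), label_names=None)
data_desc = mx.io.DataDesc('data', (1, 3, 224, 224), dtype='float16')
mod.bind(for_training=False, data_shapes=[data_desc])
mod.set_params(arg_params, aux_params, allow_missing=True)

# Run one forward pass on a dummy fp16 batch.
batch = mx.io.DataBatch([mx.nd.ones((1, 3, 224, 224), dtype='float16')])
mod.forward(batch, is_train=False)
print(mod.get_outputs()[0].dtype)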

lanking520 added the Question and C++ labels on Feb 14, 2019
@simonmaurer

Interested as well.
@lanking520 so what you're saying is that C++ inference using float16 is not yet possible? Is that also true for the C Predict API?

@lanking520 (Member)

Looping in someone who may know about the C++ side: @leleamol

@PapaMadeleine2022 (Author)

@lanking520 I use Python to train an MXNet model (such as resnet-50) with --dtype=float16, but I want to use the C++ API to run inference with this trained fp16 model. Are there any documents on how to do that?

@eric-haibin-lin (Member)

@leleamol

@PapaMadeleine2022 (Author)

I use example/image-classification/predict-cpp/image-classification-predict.cc to run inference with this trained fp16 model, but it does not work and shows this error:

Segmentation fault: 11

Stack trace returned 8 entries:
[bt] (0) /xxx/libmxnet.so(+0x431b7a) [0x7f19e23f6b7a]
[bt] (1) /xxx/libmxnet.so(+0x3503096) [0x7f19e54c8096]
[bt] (2) /lib64/libc.so.6(+0x35270) [0x7f19e1181270]
[bt] (3) /xxx/libmxnet.so(MXPredSetInput+0x74) [0x7f19e4c08d64]
[bt] (4) ./image-classification-predict() [0x403fb3]
[bt] (5) ./image-classification-predict() [0x402b8a]
[bt] (6) /lib64/libc.so.6(__libc_start_main+0xf5) [0x7f19e116dc05]
[bt] (7) ./image-classification-predict() [0x402e98]

@kwanking

have you solved it?

> @lanking520 I use Python to train an MXNet model (such as resnet-50) with --dtype=float16, but I want to use the C++ API to run inference with this trained fp16 model. Are there any documents on how to do that?

@leleamol (Contributor)

@IvyGongoogle
Currently the cpp-package APIs don't support the fp16 datatype. We can mark this question as a feature request.

@simonmaurer commented Feb 18, 2019

> @IvyGongoogle
> Currently the cpp-package APIs don't support the fp16 datatype. We can mark this question as a feature request.

@leleamol so @IvyGongoogle used an example based on <mxnet/c_predict_api.h>, not the cpp-package API with <mxnet-cpp/MxNetCpp.h>.
Those are two different APIs, correct?

ThomasDelteil changed the title from "how to infer using fp16 for c++ api" to "[Feature Request] Support fp16 for c++ api" on Apr 12, 2019
@dmidge8 commented Apr 12, 2019

Indeed, I concur: there is a need for fp16 computation, as shown in this forum thread: https://discuss.mxnet.io/t/network-in-float16/3710

@anirudh2290 (Member)

Hi, once the conversion pass is added in the backend, this should be easy to do. I am currently working on it: #14584. Stay tuned.

@simonmaurer

@anirudh2290 thanks a lot for the effort. Very interested as well in float16 inference via the C++ API.

@PapaMadeleine2022 (Author) commented Apr 16, 2019

I modified src/c_api/c_predict_api.cc around line 211 to:

std::vector<NDArray> arg_arrays, aux_arrays;
for (size_t i = 0; i < arg_shapes.size(); ++i) {
  if (arg_params.count(arg_names[i]) != 0) {
    // Allocate the argument array with the dtype of the loaded parameter
    // (e.g. float16) instead of the default float32, then copy the values in.
    NDArray nd = NDArray(arg_shapes[i], ctx, false, arg_params[arg_names[i]].dtype());
    CopyFromTo(arg_params[arg_names[i]], &nd);
    arg_arrays.push_back(nd);
  } else {
    NDArray nd = NDArray(arg_shapes[i], ctx);
    arg_arrays.push_back(nd);
  }
}
for (size_t i = 0; i < aux_shapes.size(); ++i) {
  if (aux_params.count(aux_names[i]) != 0) {
    // Same for auxiliary states (e.g. BatchNorm running mean/var).
    NDArray nd = NDArray(aux_shapes[i], ctx, false, aux_params[aux_names[i]].dtype());
    CopyFromTo(aux_params[aux_names[i]], &nd);
    aux_arrays.push_back(nd);
  } else {
    NDArray nd = NDArray(aux_shapes[i], ctx);
    aux_arrays.push_back(nd);
  }
}

Then I can successfully run inference with this trained fp16 model using the C++ API, and it is about twice as fast as fp32 when I run a resnetv1-50 CNN model. But when I run an OCR recognition model, the speed is the same as fp32. What causes this?

@anirudh2290 (Member)

@IvyGongoogle Are your inputs and weights in fp16? Your change should work to run fp16 inference. What batch size are you using, and what is the model? For smaller batch sizes you may not see a big speedup. Also, what hardware are you running it on?

@simonmaurer commented Apr 16, 2019

@IvyGongoogle @anirudh2290 just to make sure, are we talking about the real C++ API as documented in https://github.com/apache/incubator-mxnet/tree/master/cpp-package or about https://github.com/apache/incubator-mxnet/tree/master/src/c_api? Could you maybe elaborate a bit on the backend process? I'm missing why this API does not support FP16, given that the whole backend of MXNet (which Python is bound to) is based on C/C++... Much appreciated.

@anirudh2290 (Member)

The backend of MXNet is exposed via C APIs for different modules like ndarray, executor, symbol, the dependency engine, etc. The frontend bindings implement wrappers which internally call the C API, to support the different frontends. cpp-package is a frontend binding just like Python. Thus, once the support is added in the backend you will have the C API for frontends to call, but the frontend interface, for example the amp.convert_model mentioned in #14584, still has to be implemented.
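For context, here is a rough sketch of what that conversion interface looks like from Python (based on the amp.convert_model API that later landed in mxnet.contrib.amp; the checkpoint names are hypothetical and the exact arguments may differ by release):

import mxnet as mx
from mxnet.contrib import amp

# Load an existing fp32 checkpoint (hypothetical prefix/epoch).
sym, arg_params, aux_params = mx.model.load_checkpoint('resnet-50', 0)

# Let AMP insert the cast layers and decide per operator whether to run fp16 or fp32.
fp16_sym, fp16_args, fp16_aux = amp.convert_model(
    sym, arg_params, aux_params, target_dtype='float16')

# The converted model can be saved and then loaded from any frontend,
# including the C Predict API, like a normal checkpoint.
mx.model.save_checkpoint('resnet-50-fp16', 0, fp16_sym, fp16_args, fp16_aux)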

@simonmaurer

@anirudh2290 why does float16 inference still work with Python, if that is exactly a wrapper around the C API? As seen on https://mxnet.incubator.apache.org/versions/master/faq/float16.html

@anirudh2290 (Member)

If you look at the mixed-precision doc, it asks you to modify the symbol code and add cast layers at the start of the network and before softmax. This can probably be done in other frontends too if you are writing the model yourself. But in most cases you are loading a pre-trained model, and then there is no easy way to do this. It also gets more complicated when you want to introduce cast layers not just at the start and before softmax but elsewhere in the computation graph, and you want to customize that to test accuracy. (See the PR on AMP here: #14173; it has customizable lists of ops to run in fp16 or fp32: https://github.com/apache/incubator-mxnet/pull/14173/files#diff-b79bfa3e02355c43ca5b195ef67172a5)
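For illustration, this is roughly what the manual cast-layer approach from the float16 FAQ looks like on a small hypothetical network (layer names and sizes are made up):

import mxnet as mx

data = mx.sym.Variable('data')                       # fp32 input
data = mx.sym.cast(data, dtype='float16')            # cast at the start of the graph
net = mx.sym.Convolution(data, num_filter=64, kernel=(3, 3), name='conv1')
net = mx.sym.Activation(net, act_type='relu', name='relu1')
net = mx.sym.FullyConnected(net, num_hidden=10, name='fc1')
net = mx.sym.cast(net, dtype='float32')              # cast back before the softmax
out = mx.sym.SoftmaxOutput(net, name='softmax')      # numerically sensitive op stays fp32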

@KellenSunderland (Contributor)

I can give a quick summary of my experience with fp16 and C++ so far:

I believe the best way to do this is as in other front-ends such as Python. As @anirudh2290 mentions, add casts around numerically sensitive and insensitive sections (batchnorms and softmaxes, for example, should be fp32; convs and FC should be fp16) and then make sure your inputs are in fp16. You should then be able to run inference as normal (and you should see that fp16 operations are properly running).

One caveat is that depending on your graph the time spent casting inputs may be more than the time you save using fp16. That's where AMP and TensorRT integration can help. They'll fuse many operators that are numerically sensitive, removing them from the computation graph, which means you'll get larger sections of the graph that you can run in fp16 mode. They'll also fuse casting operations into numerical operations, which saves you from doing two full memory copies on your tensors when casting. These methods should be a much more practical way of running fp16 inference (with C++).

@PapaMadeleine2022 (Author) commented Apr 23, 2019

> @IvyGongoogle Are your inputs and weights in fp16? Your change should work to run fp16 inference. What batch size are you using, and what is the model? For smaller batch sizes you may not see a big speedup. Also, what hardware are you running it on?

@anirudh2290 @KellenSunderland Sorry, my test results show that when using fp16 the speed is about twice as fast as fp32 when running inference on a resnetv1-50 CNN model, but not with the OCR recognition model. I have updated the comment above. If you have experience with OCR inference using MXNet fp16, please give me some advice.

@PapaMadeleine2022 (Author)

> @IvyGongoogle
> Currently the cpp-package APIs don't support the fp16 datatype. We can mark this question as a feature request.
>
> @leleamol so @IvyGongoogle used an example based on <mxnet/c_predict_api.h>, not the cpp-package API with <mxnet-cpp/MxNetCpp.h>.
> Those are two different APIs, correct?

@leleamol those are two different APIs, correct?

@ziyuang commented May 19, 2019

> I can give a quick summary of my experience with fp16 and C++ so far:
>
> I believe the best way to do this is as in other front-ends such as Python. As @anirudh2290 mentions, add casts around numerically sensitive and insensitive sections (batchnorms and softmaxes, for example, should be fp32; convs and FC should be fp16) and then make sure your inputs are in fp16. You should then be able to run inference as normal (and you should see that fp16 operations are properly running).
>
> One caveat is that depending on your graph the time spent casting inputs may be more than the time you save using fp16. That's where AMP and TensorRT integration can help. They'll fuse many operators that are numerically sensitive, removing them from the computation graph, which means you'll get larger sections of the graph that you can run in fp16 mode. They'll also fuse casting operations into numerical operations, which saves you from doing two full memory copies on your tensors when casting. These methods should be a much more practical way of running fp16 inference (with C++).

But TensorRT doesn't support dynamic/variable input shape, right?

anirudh2290 changed the title from "[Feature Request] Support fp16 for c++ api" to "[Feature Request] Support fp16 for C Predict API" on May 21, 2019
@xizi commented Jun 13, 2019

@IvyGongoogle when I change c_predict_api.cc as you mentioned, I find the inference result is different from Python's. What is the reason?

@PapaMadeleine2022 (Author)

@xizi I can get the same result. Please try it again.

@xizi commented Jun 13, 2019

@IvyGongoogle, this is the Python inference result:
[ 1.7029e-01 9.7559e-01 -4.3884e-02 8.8672e-01 -2.3398e+00 -2.2109e+00
-9.5850e-01 4.2109e+00 -1.3535e+00 -1.3477e+00 -6.9199e-03 -4.2148e+00
-2.0488e+00 -2.4146e-01 -1.0615e+00 -6.4844e-01 -1.1787e+00 1.8887e+00
-4.1094e+00 1.7754e+00 2.9844e+00 -2.9346e-01 -8.5254e-01 -1.0674e+00
1.8730e+00 1.3301e+00 1.8535e+00 -1.5693e+00 -8.5938e-01 -2.1992e+00
1.6904e+00 7.6611e-01 2.9102e-01 7.2461e-01 3.1680e+00 5.1367e-01
1.3994e+00 2.3828e+00 -4.4922e+00 4.1875e+00 -8.0371e-01 -2.0410e+00
4.7925e-01 -4.1504e-01 8.9746e-01 1.9219e+00 -5.3027e-01 2.9062e+00
-2.5391e+00 8.5254e-01 2.2930e+00 5.0293e-01 1.8984e+00 3.5767e-01
-2.0557e-01 -9.0381e-01 -3.0156e+00 -3.7168e+00 -8.3862e-02 1.0410e+00
-6.7139e-01 -9.5508e-01 -9.5508e-01 -3.1172e+00 5.2305e+00 1.7939e+00
-3.7646e-01 -3.0945e-02 -4.2310e-01 1.5698e-01 -9.9609e-01 -1.2090e+00
-1.5508e+00 3.2969e+00 1.8896e+00 1.4746e+00 1.9453e+00 2.9102e+00
1.7041e+00 -2.8662e-01 3.2861e-01 1.7734e+00 -2.3359e+00 -1.7051e+00
6.5527e-01 -5.4199e-01 2.0520e-01 -1.8311e+00 -2.3789e+00 -1.2129e+00
-1.0625e+00 -5.2031e+00 6.1816e-01 -7.5635e-01 -1.0088e+00 2.8271e-01
-1.3457e+00 -4.0273e+00 2.2031e+00 4.5752e-01 -2.6758e+00 -2.2207e+00
-1.6758e+00 5.0488e-01 -1.5732e+00 2.0078e+00 2.9766e+00 -1.6338e+00
-1.4238e+00 1.8584e+00 -7.9041e-02 -2.2461e+00 -1.0068e+00 -9.0967e-01
2.8979e-01 -7.6367e-01 1.3994e+00 3.1172e+00 1.8242e+00 -9.0771e-01
4.2822e-01 -1.9395e+00 1.9092e-01 -3.3374e-01 1.9961e+00 2.4844e+00
2.8867e+00 1.9180e+00 1.6406e+00 1.8689e-01 1.4629e+00 -7.2949e-01
2.2695e+00 -1.2832e+00 3.3438e+00 3.1758e+00 5.9844e+00 2.8691e+00
-1.5684e+00 -5.6885e-01 -1.7871e-01 2.1191e+00 -2.3105e+00 1.6543e+00
-6.2207e-01 -3.3438e+00 2.6719e+00 2.8242e+00 -1.1992e+00 -1.9121e+00
2.2832e+00 1.8574e+00 4.0308e-01 4.9170e-01 -7.9102e-01 1.1768e-01
-3.5176e+00 2.0664e+00 1.5918e+00 1.5449e+00 -1.7061e+00 -4.3262e-01
-2.4258e+00 -1.7998e+00 -3.1738e-01 -1.5869e+00 -8.6084e-01 2.5742e+00
2.5391e+00 -4.1284e-01 -6.3086e-01 1.2822e+00 2.8555e+00 -5.5811e-01
-6.5527e-01 7.6709e-01 2.0410e+00 1.3887e+00 -2.3652e+00 -3.1372e-01
-1.4316e+00 -1.5352e+00 3.3047e+00 -1.4980e+00 -3.5938e-01 1.7910e+00
5.0244e-01 2.1602e+00 -1.7871e-01 -2.6699e+00 2.9834e-01 3.0957e+00
-3.0737e-01 -8.9062e-01 -3.5664e+00 2.0840e+00 -3.7246e+00 -1.9258e+00
3.9336e+00 1.5107e+00 9.7885e-03 -2.5625e+00 -2.0234e+00 -2.2363e+00
7.6074e-01 -6.7871e-01 -9.3604e-01 2.6562e+00 -1.3643e+00 5.7471e-01
5.3828e+00 2.0586e+00 2.6709e-01 -2.4863e+00 -3.1934e-01 1.1367e+00
4.5020e-01 1.5410e+00 1.4395e+00 -6.0156e-01 2.7930e+00 9.3262e-01
1.3457e+00 2.5664e+00 -2.3499e-01 -1.7168e+00 1.7285e+00 -4.0156e+00
-1.1162e+00 1.8203e+00 -3.5586e+00 5.2656e+00 -4.1797e+00 1.9111e+00
1.8477e+00 -3.0625e+00 1.2461e+00 4.2285e-01 4.2432e-01 -2.7148e+00
2.0664e+00 3.5820e+00 -7.6050e-02 -2.0547e+00 2.4487e-01 -1.5881e-01
7.6367e-01 1.6250e+00 -2.1367e+00 -2.0508e+00 -2.6172e-01 1.6333e-01
1.5371e+00 -7.9883e-01 -2.2500e+00 -8.2520e-02 -3.0781e+00 -1.3652e+00
-1.3496e+00 -3.6367e+00 -9.4775e-01 2.8633e+00 -3.2397e-01 6.5430e-01
1.7695e+00 -3.3457e+00 2.2480e+00 2.5664e+00 -2.1406e+00 -1.2676e+00
2.9297e-01 4.2617e+00 -3.1465e+00 -3.1309e+00 3.7256e-01 4.8901e-01
1.0391e+00 -4.6953e+00 -2.0879e+00 2.0742e+00 2.5879e-02 3.3740e-01
2.9541e-01 -2.6147e-01 -1.3047e+00 -2.8926e+00 1.1211e+00 -6.2988e-01
1.1755e-01 1.0284e-01 -2.5273e+00 1.0703e+00 -3.4336e+00 -1.6836e+00
-2.7305e+00 -4.8867e+00 -1.9150e+00 -3.2109e+00 1.2529e+00 1.6484e+00
2.0898e+00 1.4492e+00 -2.0820e+00 1.8047e+00 -1.3301e+00 -9.7217e-01
2.1914e+00 -1.5195e+00 1.4443e+00 5.2344e+00 1.3447e+00 8.6121e-02
1.4121e+00 -1.1270e+00 -2.2305e+00 1.1367e+00 1.2520e+00 -3.7539e+00
3.5571e-01 -1.1211e+00 -1.1633e-01 8.1592e-01 3.2080e-01 7.0703e-01
1.6338e+00 1.8274e-01 -4.5156e+00 2.6123e-02 9.7266e-01 3.7891e-01
5.2393e-01 -2.7422e+00 -2.7793e+00 -1.0425e-01 1.1172e+00 -3.5488e+00
2.9375e+00 3.8354e-01 -2.0156e+00 3.7659e-02 -1.3184e+00 9.4336e-01
4.0405e-01 -2.7285e+00 -4.2734e+00 -1.5781e+00 2.7417e-01 1.1328e+00
6.7871e-01 2.0449e+00 8.9258e-01 4.1650e-01 -1.2622e-01 -1.8574e+00
1.2441e+00 1.1992e+00 -1.9395e+00 -4.2236e-01 -1.9854e+00 -1.7285e+00
1.6982e+00 2.7891e+00 2.8870e-02 -1.4785e+00 2.6367e+00 -2.8442e-01
-8.6035e-01 6.4062e-01 -1.2695e+00 1.0880e-02 2.4590e+00 4.9780e-01
1.3613e+00 -6.0449e-01 -1.5918e+00 9.9023e-01 1.6626e-01 1.1602e+00
-8.4326e-01 2.1270e+00 1.1172e+00 1.7549e+00 -3.7231e-03 5.4297e-01
1.0879e+00 7.9590e-01 -2.2695e+00 -1.2480e+00 3.0410e+00 1.6284e-01
1.3408e+00 4.6143e-01 -9.9365e-02 4.7656e-01 3.9023e+00 -3.7930e+00
-9.7656e-01 1.7422e+00 1.4951e+00 -4.0469e+00 -5.8899e-02 1.7480e-01
2.1875e+00 -3.0859e+00 -1.4150e+00 1.1406e+00 2.5879e+00 -3.1519e-01
-9.0967e-01 7.0605e-01 4.0117e+00 7.4805e-01 3.0176e-01 1.8408e-01
1.0706e-01 1.1816e+00 -1.5371e+00 6.3965e-01 3.5498e-01 7.5146e-01
-3.2891e+00 -3.1445e+00 -2.3086e+00 2.0820e+00 2.4648e+00 -3.5312e+00
4.3828e+00 5.8594e-01 -5.5420e-01 -4.1797e+00 -3.3398e+00 -4.0210e-01
-7.5391e-01 3.7891e-01 1.6387e+00 -6.4453e-01 -2.6641e+00 -2.4648e+00
2.2598e+00 3.3374e-01 -1.0547e+00 -1.9121e+00 -3.7598e+00 1.1729e+00
-3.2500e+00 -5.5000e+00 -9.2529e-02 3.5059e-01 4.5068e-01 4.3457e-01
1.1602e+00 -1.9775e+00 -3.7383e+00 -6.8115e-01 3.1982e-02 1.5879e+00
-2.2402e+00 -1.3926e+00 1.7734e+00 -5.0751e-02 -7.3486e-01 -2.8438e+00
-7.7197e-01 8.0957e-01 -1.9395e+00 6.3770e-01 1.2637e+00 -1.0869e+00
4.7534e-01 -2.3691e+00 -3.8086e-01 3.4375e+00 -2.8398e+00 -1.8770e+00
1.1016e+00 1.0273e+00 1.1582e+00 -1.1660e+00 -2.6816e+00 2.0176e+00
-3.0957e-01 -2.9590e-01 2.4023e+00 -9.6045e-01 -3.3477e+00 1.4424e+00
4.2227e+00 2.7161e-02 -4.1992e-02 -8.7988e-01 3.2617e+00 1.1426e+00
-4.6602e+00 5.7520e-01 -2.5508e+00 4.5664e+00 -3.2495e-01 -7.4365e-01
-3.9023e+00 -2.2949e+00 9.9561e-01 2.1252e-01 -2.1367e+00 -1.7480e+00
-1.5410e+00 3.0078e+00 3.4668e+00 -2.6055e+00 -4.8975e-01 1.3730e+00
3.2446e-01 2.7148e+00]
dtype = float16
and this is the image-classification-predict inference result:
0.000013,0.000044,-0.009322,1.615004,-0.000059,-3.119395,-0.000000,-0.000005,0.000937,0.002605,-0.000000,-0.000003,0.000271,-0.000141,-0.049004,0.000001,0.000000,-0.000000,0.026151,3.339894,-0.012799,-0.000000,0.015894,0.327544,0.000193,0.000000,0.000000,-0.000350,-0.427234,0.000004,-0.000001,-0.194065,0.005348,-0.000000,0.000000,-0.000128,1.294744,0.000653,0.059628,-0.000000,0.000055,-0.001318,-0.000030,-0.001524,-0.000453,-12.919677,-0.000000,0.000000,-2.808351,0.000000,-0.005180,0.000001,0.013350,-0.000512,0.001676,-0.048019,-0.000305,-0.000004,0.111437,-0.000053,-0.014478,-0.000002,0.014663,0.008560,-0.000000,-0.000009,-0.000886,0.856413,0.027620,-0.000001,0.003833,0.001806,-0.178410,0.097775,-0.010481,0.049373,0.000000,-0.000000,0.004326,0.000489,-0.000000,-0.000105,-0.000016,0.028895,-0.000001,0.000077,-0.000037,0.000001,0.000203,-0.000000,-0.001013,-0.000266,0.004721,0.010571,-0.057050,0.063094,-0.000000,0.006828,-0.007858,0.000098,-0.055212,-0.035336,-0.000021,0.060235,0.000002,0.008377,-0.024314,0.000084,0.000115,-0.000004,0.000000,0.041070,-0.010632,-2.519089,0.004996,11.171617,0.002041,-0.053767,0.000002,-0.077247,0.498505,-0.010843,-0.000001,0.001779,-0.010665,0.000000,-0.000000,0.000000,-0.000655,-0.862129,0.311977,0.000000,-2.222350,0.039121,-0.000157,2.533988,-0.180412,0.000028,-6.163296,0.017058,0.000000,-0.000000,-0.057794,-0.000001,0.000000,0.000031,-0.006553,-7.179364,-0.175522,0.001100,0.000057,0.002086,-0.000053,-0.000152,14.577077,-0.000000,-0.000059,0.000045,-1.345385,-0.000113,0.000000,0.000002,0.000000,0.000007,0.000000,-0.112623,-0.000000,-0.404709,0.000000,-0.000000,0.000079,-0.107033,-0.000211,0.000328,0.001497,0.000000,-0.001711,0.000002,-0.000000,-0.000865,0.060604,-0.000057,-0.000000,0.000051,-0.000000,0.000000,-0.000001,0.000050,0.000018,0.048512,0.003864,0.000003,0.000000,-0.000777,-0.000000,0.000104,0.000053,-0.314945,0.000808,-0.985275,0.000000,-0.250461,0.000394,-0.000003,0.000000,0.000012,0.000007,0.000013,0.000004,0.000046,-0.522430,0.012618,-0.957957,0.000000,-2.964441,-0.000001,0.000000,-0.000001,-0.013290,0.000000,-0.004478,0.001547,-15.546555,0.000104,0.000000,-0.012282,-0.000001,0.000287,-0.000115,-0.000000,-0.029748,0.000000,0.000016,-0.000179,-0.083103,1.357006,-0.005668,0.000001,-0.000056,0.002910,-0.000000,-0.000051,0.000164,0.000000,-0.000062,0.000042,0.000026,2.527154,-0.000000,-0.032165,-0.000000,-0.000808,0.639562,-0.045714,0.003566,0.075770,0.000040,0.000001,-0.023527,4.209774,-0.000823,-2.626975,-0.000000,-0.000000,0.012862,0.002148,-0.000000,-0.000217,0.000142,-0.003849,-0.002162,0.000006,0.000016,0.000059,0.021756,2.183603,-0.005485,-0.000000,0.000821,0.030237,0.000000,0.000002,0.000000,-0.000000,-2.027202,0.000149,-0.000121,-0.172568,0.001307,0.000000,0.000000,-0.000080,0.068226,0.000313,0.172101,-0.000000,0.027611,-0.001913,-0.000000,-0.006781,-0.000016,-9.732667,-0.000010,0.000000,-1.427412,0.000001,-0.034848,0.000000,0.003399,-0.002216,0.007857,-0.003665,-0.000000,-0.000001,0.295366,-0.000002,-0.001321,0.000000,0.063103,0.002903,0.000005,-0.000000,-0.000024,0.053527,0.297375,-0.000000,0.023157,0.000808,-0.659049,0.064080,-0.001875,0.000048,0.000007,0.000000,0.015609,0.000852,-0.000000,-0.024016,-0.015178,0.054863,-0.000000,0.000312,-0.000000,0.000005,0.000382,-0.000000,-0.000375,-0.000939,0.001401,0.015210,-0.048261,0.438854,-0.000095,0.014206,-0.003643,0.003056,-0.027177,-0.006187,-0.000000,0.041188,0.000000,0.010880,-0.041551,0.000055,0.003452,-0.000000,0.000158,0.037897,-0.000213,-1.626848,0.001798,13.546704,0.009995,-0.451626,0.
000000,-0.046434,0.836856,-0.008158,0.000000,0.000562,-0.008224,-0.000000,-0.000032,-0.000000,-0.000091,-0.686467,0.025835,0.000001,-0.048640,0.040096,-0.000122,3.596862,-0.203852,-0.000000,-5.006491,0.004051,0.000000,-0.000000,-0.161840,-0.000000,-0.000000,0.000067,-0.000314,-7.054393,-0.323693,0.001207,0.002643,0.003551,-0.000006,-0.002109,8.951099,0.000000,-0.000070,0.000108,-0.844654,-0.000033,0.000168,0.000001,0.000000,-0.000004,0.000000,-0.035942,0.000000,-0.883696,0.000000,0.000000,0.000002,-0.034715,-0.002956,0.000006,0.024929,0.000000,-0.007712,0.001337,-0.000000,-0.003750,0.085072,-0.003406,-0.000000,0.000000,0.000000,0.000001,-0.000000,0.000011,0.000244,0.000953,0.001062,0.000000,0.000071,-0.000014,0.000000,-0.000000,-0.000000,-3.081950,0.004874,-3.409726,0.000000,-0.103631,0.000004,0.000000,0.000010,0.000000,-0.000000,0.000380,0.000000,0.000000,-0.069701,0.008285,-0.340310,0.000013,-2.502913,-0.000000,0.000000,-0.000000,-0.063350,0.000000,-0.005820,0.000001,-17.092924,-0.000000,0.000000,-0.002872,-0.000001,0.002414,-0.000682,0.000000,-0.234103,0.000092,0.000000,-0.000007,-0.004859,0.144233,-0.003063,0.000275,-0.000109,0.014938,-0.000000,-0.000005,0.000838,-0.000000,-0.000001,0.000119,0.000000,8.108716,-0.000015,-0.010483,0.000001,-0.005027,0.028043,-0.043517,0.000009,0.047657
the feature dim is 512.

@xizi commented Jun 13, 2019

@IvyGongoogle I have solved it by adding a cast layer after the final output layer.
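A minimal sketch of that fix, assuming the fp16 model was exported as a normal symbol file (file names below are hypothetical): load the symbol, cast its final output back to float32, and save it again before using it from the C Predict API.

import mxnet as mx

# Load the trained fp16 symbol (hypothetical file name).
sym = mx.sym.load('model-symbol.json')

# Cast the network output back to float32, so the values read out through the
# predict API match the Python results (the fix described above).
sym = mx.sym.cast(sym, dtype='float32', name='output_fp32_cast')

sym.save('model-fp32-output-symbol.json')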
