Mask R-CNN C++ Deployment Not Working in GPU Mode and CPU Mode #15207
Comments
@zheshipinyinMc For CPU, you can try disabling MKLDNN in your build to see if that works. For GPU, your model may work in Python imperative mode because the network can run inference section by section, but in C++ all the memory is allocated at once before execution, and you only have 6 GB of GPU memory.
@zheshipinyinMc This issue in the MKLDNN backend should be fixed by #15038.
@zhreshold I tested on a server with a GPU: the 1000w x 591h image needs about 10 GB of memory, and the 500w x 295h image needs about 6 GB. Everything is OK in CPU mode. But even when I resize the image to 150x150, it still does not work on my computer.
@zhreshold I just rebuilt my incubator-mxnet with the command 'make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CPP_PACKAGE=1 USE_CUDA=1 USE_MKLDNN=0 USE_GPERFTOOLS=1 USE_CUDNN=1 USE_CUDA_PATH=/usr/local/cuda'. The demo now works normally in CPU mode, but it takes 107343 ms for a 600w x 655h image (105212 ms for 137w x 150h).
@zhreshold The scores output is 1x1x1000, so we can get a score with scores.At(0, 0, i).
Please build with USE_MKLDNN=1 USE_GPERFTOOLS=0
@pengzhao-intel I will try this. Another question: the scores output is 1x1x1000, so we can get a score with scores.At(0, 0, i).
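As a minimal sketch of what indexing a 1x1x1000 scores tensor amounts to, the helpers below compute a row-major offset for a (c, h, w) index and scan a flat buffer for the highest score. They use only the standard library. `Offset3` and `ArgMaxScore` are hypothetical names, not mxnet-cpp functions, and the code assumes the scores have already been copied into a dense, row-major float buffer on the CPU.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper: row-major offset of element (c, h, w) in a dense
// tensor whose last two dimensions are H and W.
inline std::size_t Offset3(std::size_t H, std::size_t W,
                           std::size_t c, std::size_t h, std::size_t w) {
  assert(h < H && w < W);
  return (c * H + h) * W + w;
}

// Hypothetical helper: index of the largest value in a 1 x 1 x n buffer.
inline std::size_t ArgMaxScore(const float* scores, std::size_t n) {
  assert(scores != nullptr && n > 0);
  std::size_t best = 0;
  for (std::size_t i = 1; i < n; ++i) {
    // For a 1 x 1 x n tensor, Offset3(1, n, 0, 0, i) is simply i.
    if (scores[Offset3(1, n, 0, 0, i)] > scores[best]) {
      best = i;
    }
  }
  return best;
}
```

For a 1x1x1000 output the offset of (0, 0, i) collapses to i, which is why a single loop index is enough for the scores.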
@xinyu-intel will help you with this question :)
const mx_float *mask_ptr = exec->outputs[3].GetData();
// calculate offset and access the elements
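The "calculate offset" step can be sketched roughly as follows, assuming the output is a dense, row-major float buffer that is readable from the host. The mask output's shape is not stated in this thread, so the shape is passed in as a parameter. `FlatOffset` and `ReadAt` are hypothetical helpers, not part of mxnet-cpp.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: row-major flat offset for an N-dimensional index.
// `shape` and `index` must have the same number of dimensions.
inline std::size_t FlatOffset(const std::vector<std::size_t>& shape,
                              const std::vector<std::size_t>& index) {
  assert(shape.size() == index.size());
  std::size_t offset = 0;
  for (std::size_t d = 0; d < shape.size(); ++d) {
    assert(index[d] < shape[d]);
    // Horner-style accumulation: the last dimension varies fastest.
    offset = offset * shape[d] + index[d];
  }
  return offset;
}

// Hypothetical helper: read one element from a flat float buffer.
inline float ReadAt(const float* data,
                    const std::vector<std::size_t>& shape,
                    const std::vector<std::size_t>& index) {
  assert(data != nullptr);
  return data[FlatOffset(shape, index)];
}
```

If the mask output were, say, a 4-D tensor of shape {1, 1000, 14, 14} (an assumption for illustration only), `ReadAt(mask_ptr, shape, {0, i, y, x})` would return the value at row y, column x of the i-th mask.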
@zhreshold Thanks, but I found that the mask values from the Python deployment and the C++ deployment are different, and the detected bboxes also deviate slightly.
This might be due to different input values.
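One way to check this is to dump the same tensor from both deployments into flat float arrays and compare them element by element. The sketch below uses only the standard library. `MaxAbsDiff` is a hypothetical helper, and how the Python-side values reach the C++ program (for example through a file) is left out.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: largest absolute element-wise difference between
// two buffers of equal length.
inline float MaxAbsDiff(const std::vector<float>& a,
                        const std::vector<float>& b) {
  assert(a.size() == b.size());
  float worst = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i) {
    const float d = std::fabs(a[i] - b[i]);
    if (d > worst) {
      worst = d;
    }
  }
  return worst;
}
```

Running the comparison on the preprocessed input tensor first would show whether the two pipelines already diverge before the network runs, for instance because of different resizing or normalization.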
Maybe. And how can I get an intermediate layer's output from a GluonCV model? With an MXNet model we can get an intermediate layer's output like this, just by changing all_layers[]:
@pengzhao-intel I get the same error with USE_MKLDNN=1 USE_GPERFTOOLS=0.
@zheshipinyinMc Which version of MXNet are you using, and can you please provide steps to reproduce the problem?
@xinyu-intel
please try
@xinyu-intel Thanks. How does "NDArray::At(size_t c, size_t h, size_t w)" compare with "NDArray(index1,index2,index3,index4)"?
1. Train Mask R-CNN on the COCO dataset.
2. Testing the saved model in Python works.
3. Deploy Mask R-CNN with the GluonCV C++ deployment; the model does not work in GPU mode or CPU mode.
MXNet: 1.4
System: Ubuntu 16.04
Gluon CV: 0.4.0
errors:
In GPU mode, the error is "incubator-mxnet/cpp-package/include/mxnet-cpp/ndarray.hpp:242: Check failed: MXNDArrayWaitAll() == 0 (-1 vs. 0) : [08:43:52] src/storage/./pooled_storage_manager.h:157: cudaMalloc failed: out of memory".
In CPU mode, the error is "incubator-mxnet/cpp-package/include/mxnet-cpp/ndarray.hpp:242: Check failed: MXNDArrayWaitAll() == 0 (-1 vs. 0) : [08:46:48] src/ndarray/ndarray.cc:752: Check failed: !IsMKLDNNData(): We can't generate TBlob for MKLDNN data. Please use Reorder2Default() to generate a new NDArray first".
My GPU has 6 GB of memory, and the CPU has 32 GB of memory.
incubator-mxnet make command:
"make -j $(nproc) USE_OPENCV=1 USE_BLAS=openblas USE_CPP_PACKAGE=1 USE_CUDA=1 USE_MKLDNN=1 USE_CUDNN=1 USE_CUDA_PATH=/usr/local/cuda"
But GluonCV YOLOv3 works in both GPU mode and CPU mode.
@zhreshold