Problems while running the demo Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED #1527

Closed
yuhengWaaaaaaaaang opened this issue Mar 10, 2020 · 29 comments

Comments

@yuhengWaaaaaaaaang

I have successfully compiled OpenPose with CMake, but an error occurs in Netcaffe.cpp at this line:

upImpl->upCaffeNet.reset(new caffe::Net{upImpl->mCaffeProto, caffe::TEST});

The error reads: Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0) CUDNN_STATUS_NOT_INITIALIZED

CUDA v10.2
CUDNN 7.6.5
CMake 3.17
Visual Studio 2017

@jmguerreroh

I have the same problem running the example:

./build/examples/openpose/openpose.bin -hand -face
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
Auto-detecting camera index... Detected and opened camera 0.
Auto-detecting all available GPUs... Detected 1 GPU(s), 
using 1 of them starting at GPU 0.
F0319 16:51:17.117803 27983 cudnn_relu_layer.cpp:13] 
Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0)  CUDNN_STATUS_NOT_INITIALIZED
*** Check failure stack trace: ***
    @     0x7f2d018910cd  google::LogMessage::Fail()
    @     0x7f2d01892f33  google::LogMessage::SendToLog()
    @     0x7f2d01890c28  google::LogMessage::Flush()
    @     0x7f2d01893999  google::LogMessageFatal::~LogMessageFatal()
    @     0x7f2d012756d0  caffe::CuDNNReLULayer<>::LayerSetUp()
    @     0x7f2d01367365  caffe::Net<>::Init()
    @     0x7f2d01369487  caffe::Net<>::Net()
    @     0x7f2d02c9213a  op::NetCaffe::initializationOnThread()
    @     0x7f2d02c25a20  op::FaceExtractorCaffe::netInitializationOnThread()
    @     0x7f2d02c27653  op::FaceExtractorNet::initializationOnThread()
    @     0x7f2d02cf14e1  op::Worker<>::initializationOnThreadNoException()
    @     0x7f2d02cf1610  op::SubThread<>::initializationOnThread()
    @     0x7f2d02cf3968  op::Thread<>::initializationOnThread()
    @     0x7f2d02cf3b37  op::Thread<>::threadFunction()
    @     0x7f2d025886ef  (unknown)
    @     0x7f2d00ea66db  start_thread
    @     0x7f2d01fe388f  clone
Aborted

Computer information:

  • Graphics Card: NVIDIA RTX2060
  • NVIDIA driver 440.59
  • Cuda 10.2
  • cuDNN 7.6.5
  • Tests for cuDNN run successfully.

Any suggestions? If I compile with CUDA only and without cuDNN, I can run it without the hand and face options; with those options enabled, OpenPose consumes more than 6 GB (the maximum for the RTX 2060) and aborts (out of memory).

Thanks in advance.

@yuhengWaaaaaaaaang
Author

I ran the demo successfully using only the CPU, but it is really slow.

@psds075

psds075 commented Mar 22, 2020

Please use the option, --net_resolution 160x80 (or 320x176)

(ex : bin\OpenPoseDemo.exe --face --hand --net_resolution 160x80)
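
A small sketch for picking such values, assuming (as the OpenPose flag description states) that --net_resolution expects multiples of 16. The helper name snap_to_16 is made up for illustration and is not part of OpenPose:

```python
def snap_to_16(width, height):
    """Round a requested net resolution down to multiples of 16 (minimum 16)."""
    snapped = [max(16, (value // 16) * 16) for value in (width, height)]
    return f"{snapped[0]}x{snapped[1]}"


# Value suggested above: already a multiple of 16 in both dimensions.
print(snap_to_16(160, 80))
# An arbitrary size gets rounded down to the nearest valid resolution.
print(snap_to_16(330, 180))
```

Both resolutions suggested above already satisfy this; the second call rounds an arbitrary size down to 320x176.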

@jmguerreroh

I run the demo successfully by only using the CPU, but it is really slow.

Thank you, but I need it to run in real time, so the CPU is not an option.

Please use the option, --net_resolution 160x80 (or 320x176)

(ex : bin\OpenPoseDemo.exe --face --hand --net_resolution 160x80)

Thank you. I tried these net resolutions and I get a cuDNN internal error when faces or hands are enabled.

./build/examples/openpose/openpose.bin --net_resolution 160x80 --hand --face
Starting OpenPose demo...
Configuring OpenPose...
Starting thread(s)...
VIDEOIO ERROR: V4L: index 0 is not correct!
Auto-detecting camera index... Detected and opened camera 1.
Auto-detecting all available GPUs... Detected 1 GPU(s), 
using 1 of them starting at GPU 0.
F0322 11:41:25.616219  5789 cudnn_relu_layer.cpp:13] 
Check failed: status == CUDNN_STATUS_SUCCESS (4 vs. 0)  CUDNN_STATUS_INTERNAL_ERROR
*** Check failure stack trace: ***
    @     0x7fd6af9ce0cd  google::LogMessage::Fail()
    @     0x7fd6af9cff33  google::LogMessage::SendToLog()
    @     0x7fd6af9cdc28  google::LogMessage::Flush()
    @     0x7fd6af9d0999  google::LogMessageFatal::~LogMessageFatal()
    @     0x7fd6af3b26d0  caffe::CuDNNReLULayer<>::LayerSetUp()
    @     0x7fd6af4a4365  caffe::Net<>::Init()
    @     0x7fd6af4a6487  caffe::Net<>::Net()
    @     0x7fd6b0dcf13a  op::NetCaffe::initializationOnThread()
    @     0x7fd6b0d62a20  op::FaceExtractorCaffe::netInitializationOnThread()
    @     0x7fd6b0d64653  op::FaceExtractorNet::initializationOnThread()
    @     0x7fd6b0e2e4e1  op::Worker<>::initializationOnThreadNoException()
    @     0x7fd6b0e2e610  op::SubThread<>::initializationOnThread()
    @     0x7fd6b0e30968  op::Thread<>::initializationOnThread()
    @     0x7fd6b0e30b37  op::Thread<>::threadFunction()
    @     0x7fd6b06c56ef  (unknown)
    @     0x7fd6aefe36db  start_thread
    @     0x7fd6b012088f  clone
Aborted

I forgot to say that I am running OpenPose on Ubuntu 18.04 LTS.

@vagi8

vagi8 commented Apr 2, 2020

similar issue found here #1508

@jmguerreroh

similar issue found here #1508

Thank you.

I have reinstalled CUDA and cuDNN more than a couple of times, updated all symbolic links, checked all the OpenPose information, and even updated to the latest NVIDIA driver, and nothing works for me.

I think there is an issue with the Turing architecture in RTX graphics cards, probably related to Caffe (I have tried compiling it after changing the Makefiles to add the sm_75 and compute_75 architectures used by RTX graphics cards, but that does not work either).

@vagi8

vagi8 commented Apr 2, 2020

similar issue found here #1508

Thank you.

I have reinstalled Cuda and cuDNN more than a couple of times, updated all symbolic links, check all OpenPose information, even update to the latest NVIDIA Driver, and nothing works for me.

I think there is an issue with Turing architecture in RTX graphics cards, probably related to Caffe (I have tried to compile it changing Makefiles to add sm_75 and compute_75 architecture used by RTX graphics cards, but it does not work either).

In your first reply I see you used CUDA 10.2 and cuDNN 7.6.5.
But have you tried it specifically with CUDA 10.0 and cuDNN 7.5.0?

@jmguerreroh

similar issue found here #1508

Thank you.
I have reinstalled Cuda and cuDNN more than a couple of times, updated all symbolic links, check all OpenPose information, even update to the latest NVIDIA Driver, and nothing works for me.
I think there is an issue with Turing architecture in RTX graphics cards, probably related to Caffe (I have tried to compile it changing Makefiles to add sm_75 and compute_75 architecture used by RTX graphics cards, but it does not work either).

In your first reply i see you have used CUDA 10.2 and cuDNN 7.6.5
But have you tried it specifically with CUDA 10.0 and cuDNN 7.5.0 ?

I've tried this setup and it works fine (pretty slow at 6 fps with hands and faces), although nvidia-smi still says CUDA 10.2 is installed. But I can at least start working.

Thank you!

@yuhengWaaaaaaaaang
Author

I have solved the problem. It turns out that my GPU only supports CUDA versions lower than v10.2.141; I finally ran tutorial demo 01 successfully using CUDA v10.1.

But I ran into a new error, Check failed: status == CUDNN_STATUS_SUCCESS (2 vs. 0), while running tutorial demo 02. Maybe my GPU does not have enough memory for it. I decided to use the CPU version to start working, although it is REALLY slow.
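
For reference, the number before "vs. 0" in these check failures is the raw cudnnStatus_t value. A minimal sketch for decoding it, assuming the enum numbering used by the cuDNN 7.x and 8.x headers (later cuDNN releases number these differently); the function name decode_cudnn_check is made up for illustration:

```python
import re

# cudnnStatus_t values as numbered in cuDNN 7.x / 8.x headers.
# Assumption: other cuDNN releases may use a different numbering.
CUDNN_STATUS = {
    0: "CUDNN_STATUS_SUCCESS",
    1: "CUDNN_STATUS_NOT_INITIALIZED",
    2: "CUDNN_STATUS_ALLOC_FAILED",
    3: "CUDNN_STATUS_BAD_PARAM",
    4: "CUDNN_STATUS_INTERNAL_ERROR",
    5: "CUDNN_STATUS_INVALID_VALUE",
    6: "CUDNN_STATUS_ARCH_MISMATCH",
    7: "CUDNN_STATUS_MAPPING_ERROR",
    8: "CUDNN_STATUS_EXECUTION_FAILED",
    9: "CUDNN_STATUS_NOT_SUPPORTED",
}


def decode_cudnn_check(log_line):
    """Return the status name for a glog 'Check failed' line, or None."""
    match = re.search(r"CUDNN_STATUS_SUCCESS \((\d+) vs\. 0\)", log_line)
    if match is None:
        return None
    return CUDNN_STATUS.get(int(match.group(1)), "unknown status")


print(decode_cudnn_check("Check failed: status == CUDNN_STATUS_SUCCESS (2 vs. 0)"))
```

Under that numbering, 2 corresponds to CUDNN_STATUS_ALLOC_FAILED, which would be consistent with the GPU running out of memory.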

@jmguerreroh

I have solved the problem, it turns out that my GPU only support CUDA version lower than v10.2.141, I finally successfully run the tutorial demo 01 by using CUDA v10.1.

But I come up with a new error Check failed: status == CUDNN_STATUS_SUCCESS (2 vs. 0) while running tutorial demo 02, maybe it is because the memory of my GPU can not support the code. I decide to use the CPU vesrion to start working, although it is REALLY slow

Hello,

You can solve the problem using Cuda 10.0 and cuDNN 7.5.0, try it, it worked for me.

Also, you can install both Cuda versions (I have 10.0 and 10.2, the first with cuDNN 7.5.0 and the second with cuDNN 7.6.5). Just ensure that you include the path for Cuda 10.0 before using OpenPose, otherwise, it will find Cuda 10.2 and will fail.

export PATH="/usr/local/cuda-10.0/bin:$PATH"
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}
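
The same idea can be applied per process instead of per shell. This is only a sketch, assuming the /usr/local/cuda-10.0 install location from the exports above; cuda_env is a hypothetical helper name:

```python
import os


def cuda_env(cuda_root, base_env=None):
    """Return a copy of the environment with cuda_root's bin and lib64 first."""
    env = dict(os.environ if base_env is None else base_env)
    bin_dir = os.path.join(cuda_root, "bin")
    lib_dir = os.path.join(cuda_root, "lib64")
    # Prepend, skipping empty entries so no stray separator is left behind.
    env["PATH"] = os.pathsep.join(
        p for p in (bin_dir, env.get("PATH", "")) if p
    )
    env["LD_LIBRARY_PATH"] = os.pathsep.join(
        p for p in (lib_dir, env.get("LD_LIBRARY_PATH", "")) if p
    )
    return env


env = cuda_env("/usr/local/cuda-10.0")
print(env["PATH"].split(os.pathsep)[0])
print(env["LD_LIBRARY_PATH"].split(os.pathsep)[0])
# To launch the demo with this environment (binary path taken from the thread):
#   import subprocess
#   subprocess.run(["./build/examples/openpose/openpose.bin"], env=env, check=True)
```

This leaves the login shell untouched, so other programs keep finding CUDA 10.2.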

@yuhengWaaaaaaaaang
Author

I have solved the problem, it turns out that my GPU only support CUDA version lower than v10.2.141, I finally successfully run the tutorial demo 01 by using CUDA v10.1.
But I come up with a new error Check failed: status == CUDNN_STATUS_SUCCESS (2 vs. 0) while running tutorial demo 02, maybe it is because the memory of my GPU can not support the code. I decide to use the CPU vesrion to start working, although it is REALLY slow

Hello,

You can solve the problem using Cuda 10.0 and cuDNN 7.5.0, try it, it worked for me.

Also, you can install both Cuda versions (I have 10.0 and 10.2, the first with cuDNN 7.5.0 and the second with cuDNN 7.6.5). Just ensure that you include the path for Cuda 10.0 before using OpenPose, otherwise, it will find Cuda 10.2 and will fail.

export PATH="/usr/local/cuda-10.0/bin:$PATH"
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

Well... I still get this error after switching to version 10.0; maybe my GPU cannot handle the face or hand computation.
But thanks for your advice anyway!

@gineshidalgo99
Member

gineshidalgo99 commented Apr 4, 2020

How big is your GPU memory? Hand/face might need >4 GB of GPU memory, so that error might be an out-of-memory issue. (Closing on the assumption that this is the issue, but feel free to post otherwise.)

Reducing net_resolution and hand/face_net_resolution is a way to test this, but reducing hand/face resolution will result in very bad results for those parts (reducing body resolution can work depending on the videos).
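
To see how much memory is actually free before launching, something like the following may help. It is a sketch that assumes nvidia-smi is on the PATH and reports plain integers for both fields; the function names are made up for illustration:

```python
import shutil
import subprocess


def parse_gpu_memory(csv_text):
    """Parse 'total, used' MiB rows as printed by nvidia-smi CSV output."""
    gpus = []
    for line in csv_text.strip().splitlines():
        total, used = (int(field) for field in line.split(","))
        gpus.append({"total_mib": total, "free_mib": total - used})
    return gpus


def query_gpu_memory():
    """Return per-GPU memory info, or an empty list if nvidia-smi is unavailable."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=memory.total,memory.used",
         "--format=csv,noheader,nounits"],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        return []
    return parse_gpu_memory(result.stdout)


for index, gpu in enumerate(query_gpu_memory()):
    print(f"GPU {index}: {gpu['free_mib']} of {gpu['total_mib']} MiB free")
```

Running it once before and once while the demo is loading gives a rough idea of whether the hand/face networks are what exhausts the card.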

EDITED (Nov 2020): Please try the latest OpenPose, I modified the CMake and Caffe versions to be compatible with CUDA 10 and 11.

@AmazingRachel

I ran into this problem before, but now I've solved it. You can try this:

  1. When compiling with CMake, disable the USE_CUDNN option.
  2. Delete the D:\openpose-master\build\bin\cudnn64_7.dll file.

My computer information is as follows:

Graphics Card: NVIDIA RTX2070 Ultra
Cuda 10.0
cuDNN 7.5.0
Cmake 3.15.0

Hope this advice helps you guys!

@SaugatBhattarai

SaugatBhattarai commented Jun 21, 2020

I ran into the same issue with cuDNN 7.6.* and CUDA 10.2.* on Ubuntu 18.04 with an RTX 2070. After I downgraded to cuDNN 7.5.* and CUDA 10.0.*, the issue was resolved. If you need to install CUDA 10 and cuDNN 7.5, follow this link

@leoluopy

similar issue found here #1508

Thank you.

I have reinstalled Cuda and cuDNN more than a couple of times, updated all symbolic links, check all OpenPose information, even update to the latest NVIDIA Driver, and nothing works for me.

I think there is an issue with Turing architecture in RTX graphics cards, probably related to Caffe (I have tried to compile it changing Makefiles to add sm_75 and compute_75 architecture used by RTX graphics cards, but it does not work either).

I added " -gencode arch=compute_75,code=sm_75 " to my Makefile.config and the problem is solved, thanks. My environment: CUDA 10.2, cuDNN 7.5.

@Ctiger96

@leoluopy Hi, mine is Ubuntu 18.04 with CUDA 10.2 and cuDNN 7.5, and I have the same problem. But I can't find Makefile.config. In /openpose/3rdparty/caffe, there are many similar files like Makefile.config.Ubuntu16_cuda8.example. Would you please give some details about the following operation?

i add " -gencode arch=compute_75,code=sm_75 " in my Makefile.config , the problem is solved , thanks . my environment ( cuda 10.2 cudnn 7.5)

@leoluopy

@Ctiger96 there's an example CMake file in the Caffe src directory.

@Pinocchioo
Contributor

similar issue found here #1508

Thank you.
I have reinstalled Cuda and cuDNN more than a couple of times, updated all symbolic links, check all OpenPose information, even update to the latest NVIDIA Driver, and nothing works for me.
I think there is an issue with Turing architecture in RTX graphics cards, probably related to Caffe (I have tried to compile it changing Makefiles to add sm_75 and compute_75 architecture used by RTX graphics cards, but it does not work either).

i add " -gencode arch=compute_75,code=sm_75 " in my Makefile.config , the problem is solved , thanks . my environment ( cuda 10.2 cudnn 7.5)

Hi, I ran the example 02_whole_body_from_image.py and have the same problem. I tried adding:

                -gencode arch=compute_75,code=sm_75 \
                -gencode arch=compute_75,code=compute_75

to openpose/3rdparty/caffe/Makefile.config.* and then ran cmake, make -j, and make install, but it still doesn't work.
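
For other cards, the pair of flags follows directly from the compute capability. A minimal sketch, where gencode_flags is a hypothetical helper and "7.5" is the Turing value discussed in this thread:

```python
def gencode_flags(compute_capability):
    """Build nvcc -gencode flags (native code plus PTX) for e.g. '7.5'."""
    major, minor = compute_capability.split(".")
    arch = f"{int(major)}{int(minor)}"
    return [
        # Native machine code for this exact architecture.
        f"-gencode arch=compute_{arch},code=sm_{arch}",
        # PTX, so newer GPUs can still JIT-compile the kernels.
        f"-gencode arch=compute_{arch},code=compute_{arch}",
    ]


print(" \\\n".join(gencode_flags("7.5")))
```

One assumption worth checking: if Caffe is built through OpenPose's CMake rather than its Makefile, the Makefile.config.* examples may not be read at all, in which case editing them would have no effect.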

@Pinocchioo
Contributor

I have solved the problem, it turns out that my GPU only support CUDA version lower than v10.2.141, I finally successfully run the tutorial demo 01 by using CUDA v10.1.

But I come up with a new error Check failed: status == CUDNN_STATUS_SUCCESS (2 vs. 0) while running tutorial demo 02, maybe it is because the memory of my GPU can not support the code. I decide to use the CPU vesrion to start working, although it is REALLY slow

Hi, I ran the example 02_whole_body_from_image.py and have the same problem. I tried adding:

            -gencode arch=compute_75,code=sm_75 \
            -gencode arch=compute_75,code=compute_75

to openpose/3rdparty/caffe/Makefile.config.* and then ran cmake, make -j, and make install, but it still doesn't work.

@trkygt

trkygt commented Jan 23, 2021

Hi,

I have been facing the same issue when I try to run the demo file.
My configuration is as follows:
OS: Ubuntu 20.04
GPU: GTX1060 (3GB) with ARCH 6.1
CUDA 11.2 - CUDNN 8.0.5

I followed the steps for custom Caffe and OpenCV when compiling (BUILD_CAFFE flag is off, etc.) . I tried downgrading CUDA and CUDNN as someone suggested but that didn't help either. I had to reinstall my OS at some point to get a clean setup but still the issue persists.

Am I missing something, or does OpenPose not support this cuDNN version? Thanks.

@gineshidalgo99
Member

For Windows, this is my repo for compiling Caffe: https://github.com/gineshidalgo99/caffeCompilerForWindowsAndCUDA It is based on the Windows Caffe one.

I have not been able to get a build with cuDNN working on Windows; it keeps giving me this error:

[Many other logs]
'OpenPoseDemo.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\cudnn_ops_infer64_8.dll'. Module was built without symbols.
'OpenPoseDemo.exe' (Win32): Loaded 'C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.1\bin\cudnn_cnn_infer64_8.dll'. Module was built without symbols.
F0207 11:36:55.959534  5612 cudnn_conv_layer.cpp:53] Check failed: status == CUDNN_STATUS_SUCCESS (1 vs. 0)  CUDNN_STATUS_NOT_INITIALIZED
Unhandled exception at 0x00007FFFA5DD286E (ucrtbase.dll) in OpenPoseDemo.exe: Fatal program exit requested.

The program '[7240] OpenPoseDemo.exe' has exited with code 0 (0x0).

If anybody is able to get it to work without giving the CUDNN_STATUS_NOT_INITIALIZED error, I'd very highly appreciate some hints of the exact CUDA/cuDNN version and/or instructions to get it to work! :)

Please continue this discussion in #1845, to centralize messages and hopefully focus efforts on fixing the issue. Thanks!

PS: For Ubuntu users with memory issues, v1.7.0 was modified to allow cuDNN 8, which was a pain. I am not an expert on this, so I am sure there must be a better way to run the cuDNN convolutions using less memory. I am very open to suggestions about the cudnn_conv implementation to minimize memory:
https://github.com/CMU-Perceptual-Computing-Lab/caffe/blob/master/src/caffe/layers/cudnn_conv_layer.cpp

Please continue this discussion in #1864, to centralize messages and hopefully focus efforts on fixing the issue. Thanks!

@steveter

Hello,
I've been facing this problem for two weeks. The real problem is that the CUDA version doesn't match your GPU. Before installing CUDA and cuDNN, please check which versions your GPU supports.
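
One way to make that check concrete is to compare the card's compute capability against the first CUDA release able to target it. This is a sketch only: the table is limited to the architectures mentioned in this thread, and toolkit_supports is a made-up helper name:

```python
# First CUDA toolkit release that can generate code for each compute
# capability. Assumption: transcribed from NVIDIA's release notes and
# restricted to the GPUs that appear in this thread.
MIN_CUDA = {
    (6, 1): (8, 0),   # Pascal: GTX 1060 / GTX 1070
    (7, 5): (10, 0),  # Turing: RTX 2060 / RTX 2070 / RTX 2080 Ti
    (8, 6): (11, 1),  # Ampere: RTX 3090
}


def toolkit_supports(compute_capability, cuda_version):
    """True if cuda_version (major, minor) can target compute_capability."""
    return cuda_version >= MIN_CUDA[compute_capability]


print(toolkit_supports((7, 5), (10, 2)))  # Turing card with CUDA 10.2
print(toolkit_supports((8, 6), (10, 0)))  # Ampere card with CUDA 10.0
```

This only checks the lower bound. It says nothing about the driver version or about which cuDNN build matches the toolkit, both of which also have to line up.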

@redyuan43

redyuan43 commented Mar 4, 2021

same issue too

Computer information:

* Graphics Card: NVIDIA GTX 1070
* NVIDIA driver 460.56
* Cuda 11.2
* cuDNN 8.1.1
* Tests for cuDNN run successfully.

@dreamboatcap

I have solved the problem, it turns out that my GPU only support CUDA version lower than v10.2.141, I finally successfully run the tutorial demo 01 by using CUDA v10.1.
But I come up with a new error Check failed: status == CUDNN_STATUS_SUCCESS (2 vs. 0) while running tutorial demo 02, maybe it is because the memory of my GPU can not support the code. I decide to use the CPU vesrion to start working, although it is REALLY slow

Hello,

You can solve the problem using Cuda 10.0 and cuDNN 7.5.0, try it, it worked for me.

Also, you can install both Cuda versions (I have 10.0 and 10.2, the first with cuDNN 7.5.0 and the second with cuDNN 7.6.5). Just ensure that you include the path for Cuda 10.0 before using OpenPose, otherwise, it will find Cuda 10.2 and will fail.

export PATH="/usr/local/cuda-10.0/bin:$PATH"
export LD_LIBRARY_PATH=/usr/local/cuda-10.0/lib64${LD_LIBRARY_PATH:+:${LD_LIBRARY_PATH}}

I had the same error. I compiled OpenPose on Ubuntu 20.04 with CUDA 11.1.1 and cuDNN 8.0.5 installed. I changed to Ubuntu 18.04 after I saw this message. I did exactly what you describe, installing CUDA 10.0 with cuDNN 7.5.0, and everything worked perfectly. Thank you.

I have GTX 1070 with Nvidia driver 440 on Ubuntu 18.04

@lifeel

lifeel commented Aug 11, 2021

  1. When using CMAKE to compile, cancel the USE_CUDNN option.

Disabling USE_CUDNN in cmake-gui works for me!

My computer information:
Graphics Card : NVIDIA RTX 2080Ti
NVIDIA driver : 465.19.01
Cuda : 11.4
cuDNN 8.1
Ubuntu : 18.04.5 LTS

@ribeiro-hugo

The same docker image works on P5000 GPU but fails on RTX4000

@alexanderhmw

alexanderhmw commented Dec 14, 2021

  1. When using CMAKE to compile, cancel the USE_CUDNN option.

It works for me to disable USE_CUDNN with cmake .. -DUSE_CUDNN=OFF

GPU: NVIDIA RTX3090
NVIDIA Driver: 470.86
CUDA: 11.4
cuDNN: 8.3
Ubuntu: 20.04

@HospitableHost

--net_resolution 160x80

This still does not work for me.

@amberhappy

  1. When using CMAKE to compile, cancel the USE_CUDNN option.

It works for me to disable USE_CUDNN with cmake .. -DUSE_CUDNN=OFF

GPU: NVIDIA RTX3090 NVIDIA Driver: 470.86 CUDA: 11.4 cuDNN: 8.3 Ubuntu: 20.04

It works for me.
