Failed to initialize NVML: Unknown Error #110
Comments
It seems the error on stdout is
When I Google the error, I find very similar issues such as:
Can you have a look at them? I don't think this is an issue with the exporter, because the exporter is just a dumb tool running
I've tried getting a number of NVIDIA tools working on Docker before, and I think I see something in your Docker info that could be the problem, @jangrewe: while you have the nvidia runtime set, it is not your default. Perhaps that is the issue?
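For anyone who wants to check this on their own host, here is a minimal Python sketch. It assumes the `docker` CLI is on the PATH and that `docker info --format '{{json .}}'` reports the `DefaultRuntime` and `Runtimes` fields. The function names are made up for illustration and are not part of the exporter.

```python
import json
import subprocess


def read_docker_info() -> dict:
    """Run `docker info` and return its JSON output as a dict.

    Assumes the docker CLI is installed and the daemon is reachable.
    """
    result = subprocess.run(
        ["docker", "info", "--format", "{{json .}}"],
        check=True,
        capture_output=True,
        text=True,
    )
    return json.loads(result.stdout)


def nvidia_runtime_registered(info: dict) -> bool:
    """True if a runtime named "nvidia" is listed under Runtimes."""
    runtimes = info.get("Runtimes") or {}
    return "nvidia" in runtimes


def default_runtime_is_nvidia(info: dict) -> bool:
    """True if the daemon's default runtime is "nvidia"."""
    return info.get("DefaultRuntime") == "nvidia"
```

Calling `default_runtime_is_nvidia(read_docker_info())` returns `False` when the runtime is registered but not the default, which is the situation described above. In that case the usual options are passing `--runtime=nvidia` per container or setting `"default-runtime": "nvidia"` in `/etc/docker/daemon.json`. Whether either one makes this particular error go away is not confirmed in this thread.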
Thanks @nicklausbrown, I'll try running with
Here is a very good explanation of this issue: NVIDIA/nvidia-docker#1730
Describe the bug
I'm running the current version of your Docker image, and it works most of the time, but sometimes it starts to fail and I need to restart the container.
It sometimes runs for a whole day, and sometimes only a couple of minutes.
To Reproduce
Steps to reproduce the behavior:
Expected behavior
I'd expect the exporter to not start throwing errors ;-)
Console output
(Disregard the mismatched timestamps: I copypasta'd the error first, and then also added the initial log from starting the container.)
(The error from the title is at the end of this very long last line.)
Model and Version
Running on Docker with NVIDIA Container Toolkit: