Add GPU stats features #526

Open
nicolargo opened this issue Aug 2, 2014 · 7 comments

@nicolargo
Contributor

GPUs are increasingly used in scientific servers. It would be nice to have GPU stats features in psutil.

For examples of existing GPU monitoring software for Intel, NVIDIA or AMD GPUs, see the post http://www.rkblog.rk.edu.pl/w/p/monitoring-amd-intel-and-nvidia-graphics-card-usage-under-linux/

Source code (in C) already exists for Intel GPU Top: http://anonscm.debian.org/cgit/pkg-xorg/app/intel-gpu-tools.git/tree/tools/intel_gpu_top.c

It would be a very nice feature, and one that Glances users have asked for.

@giampaolo
Owner

According to this http://askubuntu.com/a/5419 GPU info is not standardized and not retrievable via /proc as we currently do for the CPU stats. A tool like "Intel GPU Top" suggests that the same code probably won't work on other GPU chipsets, and that it would also probably require C headers to be installed separately. In summary, this looks like a world of pain. =)
It's probably something that would make sense to develop as a separate stand-alone Python lib, but not as part of psutil.

@giampaolo
Owner

I'm willing to reopen this to investigate whether there are viable options to implement this, at least for NVIDIA cards, as they seem to be the most used in the scientific community.

@Gerardwx
Contributor

NVIDIA already provides a library, and there is a PyPI package that provides Python 2 bindings, at least. https://developer.nvidia.com/nvidia-management-library-nvml

@ReenigneArcher

Please add this! It would be a very nice addition to psutil.

NVIDIA's official Python module is pynvml
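For illustration, here is a minimal sketch of what reading basic stats through pynvml could look like. It assumes the `pynvml` module is importable and an NVIDIA driver is present; when either is missing it simply returns an empty list. The function name `gpu_snapshot` and the dictionary keys are made up for this example and are not part of psutil or pynvml.

```python
# Sketch only: gpu_snapshot() and its dict keys are hypothetical names.
try:
    import pynvml
except ImportError:
    pynvml = None  # bindings not installed; GPU stats are unavailable


def gpu_snapshot():
    """Return a list of per-GPU dicts, or [] if NVML cannot be used."""
    if pynvml is None:
        return []
    try:
        pynvml.nvmlInit()
    except pynvml.NVMLError:
        # No driver, no device, or the NVML shared library is missing.
        return []
    stats = []
    try:
        for index in range(pynvml.nvmlDeviceGetCount()):
            handle = pynvml.nvmlDeviceGetHandleByIndex(index)
            name = pynvml.nvmlDeviceGetName(handle)
            if isinstance(name, bytes):  # older bindings return bytes
                name = name.decode()
            util = pynvml.nvmlDeviceGetUtilizationRates(handle)
            mem = pynvml.nvmlDeviceGetMemoryInfo(handle)
            stats.append({
                "index": index,
                "name": name,
                "gpu_percent": util.gpu,
                "memory_used": mem.used,    # bytes
                "memory_total": mem.total,  # bytes
            })
    except pynvml.NVMLError:
        # A query is unsupported on this device; keep what was collected.
        pass
    finally:
        pynvml.nvmlShutdown()
    return stats
```

Calling `gpu_snapshot()` on a machine without an NVIDIA GPU should just give back `[]`, which is roughly the "optional feature" behaviour one would want from a monitoring library.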

For AMD I found this module, pyamdgpuinfo, but it is currently Linux only.

Yet another library, gpu-info, can only get very basic information ... it's also on PyPI, but there is no description there.

@DanielWicz

Generally one wants some wrapper around nvidia-smi (NVIDIA) or rocm-smi (AMD). There are also Intel's GPUs, though they are less popular.
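As a rough sketch of the wrapper idea, the snippet below shells out to `nvidia-smi` with its CSV query mode and parses the result. The function names and the chosen set of fields are assumptions for this example. Values are kept as strings because `nvidia-smi` may print placeholders such as `[N/A]` for unsupported fields. A `rocm-smi` wrapper would need its own command line and parser, which is not shown here.

```python
# Sketch only: function names and the field selection are hypothetical.
import csv
import io
import shutil
import subprocess

QUERY_FIELDS = ("index", "name", "utilization.gpu", "memory.used", "memory.total")


def parse_nvidia_smi_csv(text):
    """Parse 'csv,noheader,nounits' output into a list of dicts of strings."""
    rows = []
    reader = csv.reader(io.StringIO(text), skipinitialspace=True)
    for record in reader:
        if len(record) != len(QUERY_FIELDS):
            continue  # skip blank or malformed lines
        rows.append(dict(zip(QUERY_FIELDS, (value.strip() for value in record))))
    return rows


def query_nvidia_smi():
    """Run nvidia-smi if it is on PATH; return [] when it is missing or fails."""
    exe = shutil.which("nvidia-smi")
    if exe is None:
        return []
    cmd = [
        exe,
        "--query-gpu=" + ",".join(QUERY_FIELDS),
        "--format=csv,noheader,nounits",
    ]
    try:
        done = subprocess.run(
            cmd, capture_output=True, text=True, check=True, timeout=10
        )
    except (OSError, subprocess.SubprocessError):
        return []
    return parse_nvidia_smi_csv(done.stdout)
```

Keeping the parser separate from the subprocess call makes it possible to test the parsing on captured output without a GPU.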

@giampaolo
Owner

giampaolo commented Apr 16, 2023

From your bug report:

Backward compatibility between driver and binding versions.
Since CUDA 11, the definition of nvmlProcessInfo_t adds two new fields gpuInstanceId and computeInstanceId.

[...]

Another breaking change. nvidia-ml-py 11.515.0 (Jan 12, 2022) now even introduces v3 (nvmlDeviceGetComputeRunningProcesses_v3, etc.).

This is concerning. If the C lib breaks compatibility so easily [1], psutil would probably have to use #ifdef nvidia_version_x ... #else ... clauses all over the place, and that may create problems with the binary wheels that we distribute on PyPI. The system compiling the wheel may have a certain nvidia-lib version supporting functionality "X", but the user installing the psutil wheel may not, and that usually results in an "X symbol not found" error at import time. I faced a similar problem in #1879, which led me to rewrite the prlimit() functionality from C to ctypes. The fix was a literal "check for existence of X at run time instead of compilation time".
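To make the "check at run time" idea concrete, here is a minimal ctypes sketch. It is not the code from #1879; the helper names are hypothetical, the library name passed to `find_library` is an assumption, and the `_v2` and unversioned symbol names are listed as assumed fallbacks next to the `_v3` name quoted above.

```python
# Sketch only: helper names and the candidate list are assumptions.
import ctypes
import ctypes.util

# Newest first; the first symbol the loaded library actually exports wins.
CANDIDATE_SYMBOLS = (
    "nvmlDeviceGetComputeRunningProcesses_v3",
    "nvmlDeviceGetComputeRunningProcesses_v2",
    "nvmlDeviceGetComputeRunningProcesses",
)


def load_optional_library(name):
    """Return a ctypes handle for a shared library, or None if unavailable."""
    path = ctypes.util.find_library(name)
    if path is None:
        return None
    try:
        return ctypes.CDLL(path)
    except OSError:
        return None


def first_available_symbol(lib, candidates):
    """Return the first name in candidates that lib exports, else None."""
    for symbol in candidates:
        # Attribute lookup on a CDLL raises AttributeError for a missing
        # symbol, so hasattr() works as a run-time existence check.
        if hasattr(lib, symbol):
            return symbol
    return None


def pick_process_query_symbol():
    """Decide at run time which variant to call; None means 'no GPU support'."""
    lib = load_optional_library("nvidia-ml")  # assumed library name
    if lib is None:
        return None
    return first_available_symbol(lib, CANDIDATE_SYMBOLS)
```

With this shape, a missing library or a missing symbol turns into `None` at call time instead of a "symbol not found" failure when the module is imported.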

Also, right now we only depend on apt-get install python3-dev (Debian / Ubuntu) or yum install python3-devel (RedHat). Adding support for Nvidia GPUs means we'll have to install nvidialib-dev / nvidialib-devel (or whatever they're called). But not all Linux distros will provide a pre-compiled nvidialib-dev package, so we'll probably want logic in setup.py to make GPU functionality optional (aka not crash at compile time). But as I explained above, even that may not be enough due to the wheel-related issues.

...and then there is Windows.

All of this to say that implementing this in pure C would be hard, which is probably why they decided to use ctypes in pynvml: https://github.com/gpuopenanalytics/pynvml/blob/master/pynvml/nvml.py

In summary: if we ever add GPU functionality to psutil, we'll probably want to use ctypes. :)

[1] On a personal note: as a Linux user who's been dealing with Nvidia cards/driver issues for over a decade I'm not that surprised. ;)
