Use CUDA wheels to avoid statically linking CUDA components in our wheels #35
Putting up some of my notes from poking around at this today.

On an x86_64 machine, ran the following.

    docker run \
        --rm \
        -it rapidsai/ci-wheel:cuda12.2.2-rockylinux8-py3.10 \
        bash

Looked for the default

    python -c "import sysconfig; print(sysconfig.get_path('platlib'))"
    # /pyenv/versions/3.10.14/lib/python3.10/site-packages

Installed some stuff.

    python -m pip install \
        --extra-index-url https://pypi.nvidia.com \
        nvidia-cusparse-cu12 \
        nvidia-cublas-cu12 \
        nvidia-cufft-cu12 \
        'pylibraft-cu12==24.4.*'

Ok, so where did it put all those CUDA libraries?

    find \
        /pyenv/versions/3.10.14/lib/python3.10/site-packages \
        -type f \
        -name 'libcu*.so*'
    # /pyenv/versions/3.10.14/lib/python3.10/site-packages/nvidia/cublas/lib/libcublasLt.so.12
    # /pyenv/versions/3.10.14/lib/python3.10/site-packages/nvidia/cublas/lib/libcublas.so.12
    # /pyenv/versions/3.10.14/lib/python3.10/site-packages/nvidia/cufft/lib/libcufft.so.11
    # /pyenv/versions/3.10.14/lib/python3.10/site-packages/nvidia/cufft/lib/libcufftw.so.11
    # /pyenv/versions/3.10.14/lib/python3.10/site-packages/nvidia/cusparse/lib/libcusparse.so.12

And what about

    find \
        /pyenv/versions/3.10.14/lib/python3.10/site-packages \
        -type f \
        -name 'libraft*.so*'
    # /pyenv/versions/3.10.14/lib/python3.10/site-packages/pylibraft/libraft.so

So by default, it looks like those CUDA libraries will be at this path relative to
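Given the two `find` results above, the location of each `nvidia/<component>/lib` directory relative to a consuming library can be computed rather than eyeballed. The following is a minimal sketch, not something taken from the RAPIDS build scripts: the helper name `origin_relative_rpath` is hypothetical, and taking `libraft.so` as the reference point and expressing the result as an `$ORIGIN`-relative entry are assumptions made for illustration.

```python
import posixpath


def origin_relative_rpath(lib_path: str, dep_dir: str) -> str:
    """Return an $ORIGIN-relative search path from lib_path's directory to dep_dir.

    Hypothetical helper: it only manipulates path strings and never touches the
    filesystem, so it works for layouts that are not installed locally.
    """
    lib_dir = posixpath.dirname(lib_path)
    rel = posixpath.relpath(dep_dir, start=lib_dir)
    return "$ORIGIN" if rel == "." else f"$ORIGIN/{rel}"


# Paths copied from the `find` output in the notes above.
SITE = "/pyenv/versions/3.10.14/lib/python3.10/site-packages"
LIBRAFT = f"{SITE}/pylibraft/libraft.so"

for component in ("cublas", "cufft", "cusparse"):
    print(origin_relative_rpath(LIBRAFT, f"{SITE}/nvidia/{component}/lib"))
```

This only holds when both packages are installed into the same `site-packages` directory, which is what the container above happens to do and is not guaranteed in general.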
Where I saw at least one example where two libraries that depend on each other are installed together, and one has

    SITE="/pyenv/versions/3.10.14/lib/python3.10/site-packages"
    ldd ${SITE}/nvidia/cublas/lib/libcublas.so.12
    # libcublasLt.so.12 => /pyenv/versions/3.10.14/lib/python3.10/site-packages/nvidia/cublas/lib/libcublasLt.so.12 (0x00007f1c23f31000)
    readelf -d ${SITE}/nvidia/cublas/lib/libcublas.so.12 | grep PATH
    # 0x000000000000001d (RUNPATH) Library runpath: [$ORIGIN]

Tomorrow, I'll try a wheel build of

Dumping some other links I've been consulting to get a better understanding of the difference between what happens to be true in the places I'm testing and what we can assume to be true about installation layouts generally.
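The `readelf -d ... | grep PATH` check above can be scripted when many libraries need inspecting. This is a rough sketch under stated assumptions: `parse_search_paths` is a hypothetical helper, and the regular expression assumes the dynamic-section line format shown in the output above (as printed by GNU `readelf`).

```python
import re

# Matches lines such as:
#   0x000000000000001d (RUNPATH)  Library runpath: [$ORIGIN]
#   0x000000000000000f (RPATH)    Library rpath: [/some/dir]
_ENTRY = re.compile(r"\((RPATH|RUNPATH)\)\s+Library (?:rpath|runpath): \[(.*)\]")


def parse_search_paths(readelf_output: str) -> dict:
    """Collect RPATH/RUNPATH entries from `readelf -d` text, split on ':'."""
    paths = {}
    for line in readelf_output.splitlines():
        match = _ENTRY.search(line)
        if match:
            tag, value = match.groups()
            paths.setdefault(tag, []).extend(p for p in value.split(":") if p)
    return paths


# The line reported for libcublas.so.12 in the notes above.
sample = " 0x000000000000001d (RUNPATH)            Library runpath: [$ORIGIN]"
print(parse_search_paths(sample))
```

To run it against a real file, the text could come from `subprocess.run(["readelf", "-d", path], capture_output=True, text=True).stdout`, assuming `readelf` is on the `PATH`.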
Usage of the CUDA math libraries is independent of the CUDA runtime. Make their static/shared status separately controllable. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Robert Maynard (https://github.com/robertmaynard)
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #190

Usage of the CUDA math libraries is independent of the CUDA runtime. Make their static/shared status separately controllable. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Bradley Dice (https://github.com/bdice)
- Robert Maynard (https://github.com/robertmaynard)
URL: #2376

Usage of the CUDA math libraries is independent of the CUDA runtime. Make their static/shared status separately controllable. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Robert Maynard (https://github.com/robertmaynard)
URL: #5959

Usage of the CUDA math libraries is independent of the CUDA runtime. Make their static/shared status separately controllable. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Robert Maynard (https://github.com/robertmaynard)
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #4526

Usage of the CUDA math libraries is independent of the CUDA runtime. Make their static/shared status separately controllable. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Robert Maynard (https://github.com/robertmaynard)
- Ben Frederickson (https://github.com/benfred)
URL: #216

#190 was supposed to separate static CUDA math libraries from static CUDA runtime library, but accidentally pulled the runtime along with the math libraries. The way we'd normally fix this is by creating a separate variable for the runtime. However, since this project doesn't actually use any math libraries, we can just revert the whole thing. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #192

Use CUDA math wheels to reduce wheel size by not statically linking CUDA math libraries. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Bradley Dice (https://github.com/bdice)
- Vyas Ramasubramani (https://github.com/vyasr)
URL: #5966

With packages depending on CUDA wheels by default, we want to disable that behavior in devcontainers. Add a `use_cuda_wheels=false` matrix entry. Contributes to rapidsai/build-planning#35

Use CUDA math wheels to reduce wheel size by not statically linking CUDA math libraries. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- James Lamb (https://github.com/jameslamb)
- Robert Maynard (https://github.com/robertmaynard)
- Bradley Dice (https://github.com/bdice)
URL: #298

We want to be able to control whether or not the wheel uses the CUDA wheels. Add a `use_cuda_wheels` matrix entry to control this. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- James Lamb (https://github.com/jameslamb)
URL: #6038

Use CUDA math wheels to reduce wheel size by not statically linking CUDA math libraries. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- Robert Maynard (https://github.com/robertmaynard)
- Bradley Dice (https://github.com/bdice)
- James Lamb (https://github.com/jameslamb)
URL: #2415

Use CUDA math wheels to reduce wheel size by not statically linking CUDA math libraries. Contributes to rapidsai/build-planning#35
Authors:
- Kyle Edwards (https://github.com/KyleFromNVIDIA)
Approvers:
- James Lamb (https://github.com/jameslamb)
- Robert Maynard (https://github.com/robertmaynard)
- Chuck Hastings (https://github.com/ChuckHastings)
- Bradley Dice (https://github.com/bdice)
URL: #4621
We’ve mostly explored this for CUDA math library components thus far, but we should do the same for nvcomp: https://pypi.org/project/nvidia-nvcomp-cu12/ RAPIDS should shift to using nvcomp wheels as a dependency of our own wheel builds of cudf and kvikio so we do not redistribute nvcomp libraries as part of the cudf and kvikio wheels.

Using them as runtime, and will follow up with the CUDA team as the issues occur.

In last week's meeting we decided to hold off on calling this done because we weren't sure about the state of cugraph-ops. Based on some offline discussions on that front I think that we can close this issue. @KyleFromNVIDIA please reopen if you think that I'm missing something.
In order to achieve manylinux compliance, RAPIDS wheels currently statically link all components of the CTK that they consume. This leads to heavily bloated binaries, especially when the effect is compounded across many packages. Since NVIDIA now publishes wheels containing the CUDA libraries and these libraries have been stress tested by the wheels for various deep learning frameworks (e.g. pytorch now depends on the CUDA wheels), RAPIDS should now do the same to reduce our wheel sizes.

This work is a companion to #33 that should probably be tackled afterwards since #33 will reduce the scope of these changes to just the resulting C++ wheels, a meaningful reduction since multiple RAPIDS repos produce multiple wheels.

While the goals of this are aligned with #33 and the approach is similar, there are some notable differences because of the way the CUDA wheels are structured. In particular, they are not really designed to be compiled against, only run against. They do generally seem to contain both includes and libraries, which is helpful, but they do not contain any CMake or other packaging metadata, nor do they contain the multiple symlinked copies of libraries (e.g. linker name->soname->library name). The latter is a fundamental limitation of wheels not supporting symlinks, but could cause issues for library discovery using standardized solutions like CMake's FindCUDAToolkit or pkg-config that rely on a specific version of those files existing (AFAICT only the SONAME is present).

We should stage work on this in a way that minimizes conflicts with #31 and #33, both of which should facilitate this change. I propose the following, but all of it is open for discussion:
--exclude flag. The resulting wheel should be inspected to verify that all CUDA math libraries have been excluded from the build. Note that (at least for now) we want to continue statically linking the CUDA runtime. This change will likely require some CMake work to decouple static linking of cudart from the static linking of other CUDA libraries.
LD_LIBRARY_PATH
DLFW / devcontainers adjustments
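The inspection step mentioned above (verifying that no CUDA math libraries remain in the built wheel) could be automated along these lines. This is only a sketch: `bundled_cuda_math_libs` and the `CUDA_MATH_LIBS` list are hypothetical names chosen for illustration, the list of library names is an assumption that each project would adjust, and the pattern assumes vendored copies keep a `lib<name>` prefix, possibly with extra characters before `.so`. The demo builds a fake archive in memory with made-up file names, so it runs without a real wheel.

```python
import io
import re
import zipfile

# Assumed, illustrative list of CUDA math library names; adjust per project.
CUDA_MATH_LIBS = ("cublas", "cufft", "curand", "cusolver", "cusparse")
_PATTERN = re.compile(
    r"(?:^|/)lib(?:%s)[^/]*\.so(?:\.[^/]*)?$" % "|".join(CUDA_MATH_LIBS)
)


def bundled_cuda_math_libs(wheel):
    """Return archive members that look like bundled CUDA math libraries.

    `wheel` may be a path or a file-like object; a wheel is a zip archive.
    """
    with zipfile.ZipFile(wheel) as archive:
        return sorted(name for name in archive.namelist() if _PATTERN.search(name))


# Demo with an in-memory stand-in for a wheel (file names are made up).
fake_wheel = io.BytesIO()
with zipfile.ZipFile(fake_wheel, "w") as archive:
    archive.writestr("pylibraft/libraft.so", b"")
    archive.writestr("pylibraft.libs/libcublas-1a2b3c4d.so.12", b"")
    archive.writestr("pylibraft.libs/libcudart-5e6f7a8b.so.12", b"")
fake_wheel.seek(0)

print(bundled_cuda_math_libs(fake_wheel))
```

An empty list would indicate that the exclusion worked. The runtime library is deliberately not in the list, since the proposal keeps statically linking cudart for now.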