Use cuda_compat drivers when available #267247
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,27 @@ | ||
| # shellcheck shell=bash | ||
| # Patch all dynamically linked, ELF files with the CUDA driver (libcuda.so) | ||
| # coming from the cuda_compat package by adding it to the RUNPATH. | ||
| echo "Sourcing auto-add-cuda-compat-runpath-hook" | ||
|
|
||
| elfHasDynamicSection() { | ||
| patchelf --print-rpath "$1" >& /dev/null | ||
| } | ||
|
|
||
| autoAddCudaCompatRunpathPhase() ( | ||
| local outputPaths | ||
| mapfile -t outputPaths < <(for o in $(getAllOutputNames); do echo "${!o}"; done) | ||
| find "${outputPaths[@]}" -type f -executable -print0 | while IFS= read -rd "" f; do | ||
| if isELF "$f"; then | ||
| # patchelf returns an error on statically linked ELF files | ||
| if elfHasDynamicSection "$f" ; then | ||
| echo "autoAddCudaCompatRunpathHook: patching $f" | ||
| local origRpath="$(patchelf --print-rpath "$f")" | ||
| patchelf --set-rpath "@libcudaPath@:$origRpath" "$f" | ||
| elif (( "${NIX_DEBUG:-0}" >= 1 )) ; then | ||
| echo "autoAddCudaCompatRunpathHook: skipping a statically-linked ELF file $f" | ||
| fi | ||
| fi | ||
| done | ||
| ) | ||
|
|
||
| postFixupHooks+=(autoAddCudaCompatRunpathPhase) |
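
The prepend step of the hook can be exercised without `patchelf`. The following is a minimal sketch and not part of the PR: `prependRunpath` is a hypothetical helper that only builds the new RUNPATH string. Unlike the plain `"@libcudaPath@:$origRpath"` concatenation, it avoids a trailing colon when the original RUNPATH is empty and leaves the value unchanged when the entry is already first.

```bash
# Hypothetical helper (not part of the PR): build a RUNPATH string with
# `new` placed in front of `orig`.
prependRunpath() {
    local new="$1" orig="${2:-}"
    if [[ -z "$orig" ]]; then
        # Nothing to preserve; avoid emitting "new:" with an empty entry.
        printf '%s\n' "$new"
    elif [[ "$orig" == "$new" || "$orig" == "$new":* ]]; then
        # Already first; return the value unchanged so re-running is a no-op.
        printf '%s\n' "$orig"
    else
        printf '%s\n' "$new:$orig"
    fi
}
```

Under that assumption, the hook body could call `patchelf --set-rpath "$(prependRunpath "@libcudaPath@" "$origRpath")" "$f"` in place of the direct concatenation.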
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -44,4 +44,24 @@ final: _: { | ||
| ./auto-add-opengl-runpath-hook.sh | ||
| ) | ||
| {}; | ||
|
|
||
| # autoAddCudaCompatRunpathHook hook must be added AFTER `setupCudaHook`. Both | ||
| # hooks prepend a path with `libcuda.so` to the `DT_RUNPATH` section of | ||
| # patched elf files, but `cuda_compat` path must take precedence (otherwise, | ||
| # it doesn't have any effect) and thus appear first. Meaning this hook must be | ||
| # executed last. | ||
| autoAddCudaCompatRunpathHook = | ||
| final.callPackage | ||
| ( | ||
| {makeSetupHook, cuda_compat}: | ||
| makeSetupHook | ||
| { | ||
| name = "auto-add-cuda-compat-runpath-hook"; | ||
| substitutions = { | ||
| libcudaPath = "${cuda_compat}/compat"; | ||
| }; | ||
| } | ||
| ./auto-add-cuda-compat-runpath.sh | ||
| ) | ||
| {}; | ||
|
||
| } | ||
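
The ordering requirement in the Nix comment above follows from how a colon-separated search path is consulted: entries are tried left to right, and the first directory containing the requested library wins. A rough sketch of that lookup, assuming a hypothetical helper `firstProvider` and throwaway directories standing in for `${cuda_compat}/compat` and the regular driver directory (it models only the left-to-right order, not the full dynamic loader search):

```bash
# Hypothetical helper: print the first directory in a colon-separated search
# path that contains the given file name. Models left-to-right order only.
firstProvider() {
    local searchPath="$1" lib="$2" dir
    local -a dirs
    IFS=: read -ra dirs <<< "$searchPath"
    for dir in "${dirs[@]}"; do
        if [[ -n "$dir" && -e "$dir/$lib" ]]; then
            printf '%s\n' "$dir"
            return 0
        fi
    done
    return 1
}

# Stand-ins for the two locations that both ship libcuda.so.
tmp="$(mktemp -d)"
mkdir -p "$tmp/compat" "$tmp/driver"
touch "$tmp/compat/libcuda.so" "$tmp/driver/libcuda.so"

firstProvider "$tmp/compat:$tmp/driver" libcuda.so   # prints the compat directory
firstProvider "$tmp/driver:$tmp/compat" libcuda.so   # prints the driver directory
```

Whichever hook runs last prepends last, so its entry ends up leftmost; that is why the `cuda_compat` hook has to run after `setupCudaHook`.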
I think propagating a different version of `addOpenGLRunpath` might cause conflicts in derivations that explicitly consume their own version.

Initially, I thought that all we'd need is `patchelf --add-needed` with the absolute path to `${cuda_compat}/lib/libcuda.so` in https://github.com/NixOS/nixpkgs/blob/bb142a6838c823a2cd8235e1d0dd3641e7dcfd63/pkgs/development/compilers/cudatoolkit/redist/build-cuda-redist-package.nix. This would ensure that anybody trying to load, for example, `libcudart.so` ends up loading the cuda-compat driver first.

Now I think that might be insufficient (how do we know people don't try to dlopen libcuda directly?) and you're right about overriding `addOpenGLRunpath`. Except we might want to do that on an even more global scale, i.e. we might want to change `pkgs.addOpenGLRunpath` whenever our nixpkgs instantiation happens to target an NVIDIA Jetson (e.g. `config.cudaSupport && hostPlatform.system == "aarch64-linux"`, but also maybe a special flag).

All in all, I think we need to ponder the available options a bit more. Thank you for starting this!
Wouldn't a simple test that `cudaPackages.cuda_compat` exists, as we do here, be sufficient? Basically, if there's a CUDA cuda_compat redist package available, we do that. Maybe we can add a test to see if we're not cross-compiling.

I personally think changing `pkgs.addOpenGLRunpath` would make sense, as what we do is mostly to "extend" the usual path `/run/opengl-driver` to find cuda_compat's `libcuda.so`. But we noticed that some packages are using e.g. the attribute `addOpenGLRunpath.driverLink`, so we must be careful to not change the interface of the derivation.
Yes and no, e.g. cf. #266475. I'd say we probably don't want the definition of `addOpenGLRunpath` to depend on the internals of the `cudaPackages` set, but it's OK if it depends on a top-level attribute like `config`. I haven't thought about this much, though.

Yes