Skip to content
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
44 changes: 20 additions & 24 deletions pkgs/development/libraries/science/math/cudnn/generic.nix
Original file line number Diff line number Diff line change
@@ -1,8 +1,11 @@
{ stdenv
, lib
, cudatoolkit
, cudatoolkit ? null
, libcublas ? null
, zlib ? null
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Required regardless of which cudatoolkit distribution: cudnn on master is broken

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, I thought ldd showed all was good before.

, fetchurl
, addOpenGLRunpath
, autoPatchelfHook
, autoAddOpenGLRunpathHook
, # The distributed version of CUDNN includes both dynamically liked .so files,
# as well as statically linked .a files. However, CUDNN is quite large
# (multiple gigabytes), so you can save some space in your nix store by
Expand All @@ -25,6 +28,8 @@ assert (hash != null) || (sha256 != null);
let
majorMinorPatch = version: lib.concatStringsSep "." (lib.take 3 (lib.splitVersion version));
version = majorMinorPatch fullVersion;
# Use libcublas if available
withoutCudaToolkit = libcublas != null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the reason to have a version with libcublas and another without? Is it purely for backwards compatibility with derivations still based on cudatoolkit?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah ok, so we only have redist packages for CUDA 11.4-11.6 currently. But according to https://developer.download.nvidia.com/compute/cuda/redist/ there are redist packages for all CUDA 11.x versions.

@FRidh is there a reason that we don't have redist packages for CUDA 11.0-11.3 as well as 11.4-11.6?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we move this boolean flag to the top-level? We shouldn't introduce more ? null bug-hiding traps.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@FRidh is there a reason that we don't have redist packages for CUDA 11.0-11.3 as well as 11.4-11.6?

there are unfortunately only manifest files for 11.4 to 11.6

Can we move this boolean flag to the top-level? We shouldn't introduce more ? null bug-hiding traps.

@SuperSandro2000 could you stop this nonsense. There is nothing wrong with adding ? null if you handle it. This boolean isn't an option, and is only there because this function is called for several versions, and for older versions the required attribute does not exist.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh I see. There are manifest files for all of 11.x but they are mostly empty up until 11.4.2

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is nothing wrong with adding ? null if you handle it.

It is playing with fire. It is very easy to get things wrong and don't even notice it.

in stdenv.mkDerivation {
name = "cudatoolkit-${cudatoolkit.majorVersion}-cudnn-${version}";

Expand All @@ -35,7 +40,19 @@ in stdenv.mkDerivation {
inherit url hash sha256;
};

nativeBuildInputs = [ addOpenGLRunpath ];
nativeBuildInputs = [
autoAddOpenGLRunpathHook
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't know there way such a hook. Is this new?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I created it in the redist packages PR.

autoPatchelfHook
];

buildInputs = [
stdenv.cc.cc.lib
] ++ lib.optionals withoutCudaToolkit [
libcublas
zlib
] ++ lib.optionals (!withoutCudaToolkit) [
cudatoolkit
];

# Some cuDNN libraries depend on things in cudatoolkit, eg.
# libcudnn_ops_infer.so.8 tries to load libcublas.so.11. So we need to patch
Expand All @@ -46,15 +63,6 @@ in stdenv.mkDerivation {
installPhase = ''
runHook preInstall

function fixRunPath {
p=$(patchelf --print-rpath $1)
patchelf --set-rpath "''${p:+$p:}${lib.makeLibraryPath [ stdenv.cc.cc cudatoolkit.lib ]}:${cudatoolkit}/lib:\$ORIGIN/" $1
}

for sofile in {lib,lib64}/lib*.so; do
fixRunPath $sofile
done

mkdir -p $out
cp -a include $out/include
[ -d "lib/" ] && cp -a lib $out/lib
Expand All @@ -66,18 +74,6 @@ in stdenv.mkDerivation {
runHook postInstall
'';

# Set RUNPATH so that libcuda in /run/opengl-driver(-32)/lib can be found.
# See the explanation in addOpenGLRunpath.
postFixup = ''
for lib in $out/lib/lib*.so; do
addOpenGLRunpath $lib
done
'';

propagatedBuildInputs = [
cudatoolkit
];

passthru = {
inherit cudatoolkit;
majorVersion = lib.versions.major version;
Expand Down