cudaPackages.cudnn: use libcublas if available #168755

Closed
FRidh wants to merge 1 commit into NixOS:master from FRidh:cudnn

FRidh (Member) commented Apr 15, 2022

Additionally, perform the patching using hooks.

Note that cudatoolkit is no longer propagated. This may cause some
breakage.

Description of changes
Things done
  • Built on platform(s)
    • x86_64-linux
    • aarch64-linux
    • x86_64-darwin
    • aarch64-darwin
  • For non-Linux: Is sandbox = true set in nix.conf? (See Nix manual)
  • Tested, as applicable:
  • Tested compilation of all packages that depend on this change using nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
  • Tested basic functionality of all binary files (usually in ./result/bin/)
  • 22.05 Release Notes (or backporting 21.11 Release notes)
    • (Package updates) Added a release notes entry if the change is major or breaking
    • (Module updates) Added a release notes entry if the change is significant
    • (Module addition) Added a release notes entry if adding a new NixOS module
    • (Release notes changes) Ran nixos/doc/manual/md-to-db.sh to update generated release notes
  • Fits CONTRIBUTING.md.

FRidh added the "6.topic: cuda" label Apr 15, 2022
FRidh requested review from SomeoneSerge and samuela April 15, 2022 06:55

FRidh (Member Author) commented Apr 15, 2022

Duplicate of #168748.

ofborg bot added the "10.rebuild-darwin: 1-10" and "10.rebuild-linux: 11-100" labels Apr 15, 2022

, cudatoolkit
, cudatoolkit ? null
, libcublas ? null
, zlib ? null
Contributor

Required regardless of which cudatoolkit distribution: cudnn on master is broken

Member Author

Interesting, I thought ldd showed all was good before.


nativeBuildInputs = [ addOpenGLRunpath ];
nativeBuildInputs = [
autoAddOpenGLRunpathHook
Member

I didn't know there was such a hook. Is this new?

Member Author

Yes I created it in the redist packages PR.

majorMinorPatch = version: lib.concatStringsSep "." (lib.take 3 (lib.splitVersion version));
version = majorMinorPatch fullVersion;
# Use libcublas if available
withoutCudaToolkit = libcublas != null;
Member

What's the reason to have a version with libcublas and another without? Is it purely for backwards compatibility with derivations still based on cudatoolkit?

Member

Ah ok, so we only have redist packages for CUDA 11.4-11.6 currently. But according to https://developer.download.nvidia.com/compute/cuda/redist/ there are redist packages for all CUDA 11.x versions.

@FRidh is there a reason that we don't have redist packages for CUDA 11.0-11.3 as well as 11.4-11.6?

Member

Can we move this boolean flag to the top-level? We shouldn't introduce more ? null bug-hiding traps.

Member Author

> @FRidh is there a reason that we don't have redist packages for CUDA 11.0-11.3 as well as 11.4-11.6?

Unfortunately, there are only manifest files for 11.4 to 11.6.

> Can we move this boolean flag to the top-level? We shouldn't introduce more ? null bug-hiding traps.

@SuperSandro2000 could you stop this nonsense. There is nothing wrong with adding ? null if you handle it. This boolean isn't an option, and is only there because this function is called for several versions, and for older versions the required attribute does not exist.
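
A minimal sketch of the pattern described above, under stated assumptions: `mkCudnn` is a hypothetical stand-in for the package function, and plain strings stand in for the real derivations. Only the line `withoutCudaToolkit = libcublas != null;` comes from the diff. The optional arguments default to null for CUDA versions that lack the attribute, and the assert is one possible way to handle the case where neither input is supplied.

```nix
# Illustrative only: evaluate with `nix-instantiate --eval --strict`.
let
  # Hypothetical stand-in for the cudnn package function.
  # Both inputs are optional because older CUDA versions have no libcublas attribute.
  mkCudnn = { cudatoolkit ? null, libcublas ? null }:
    let
      # Use libcublas if available (taken from the diff).
      withoutCudaToolkit = libcublas != null;
    in
    # Handle the null case explicitly: fail evaluation if neither input was passed.
    assert withoutCudaToolkit || cudatoolkit != null;
    {
      buildInputs = if withoutCudaToolkit then [ libcublas ] else [ cudatoolkit ];
    };
in
{
  # Newer CUDA version: the redist libcublas attribute exists and is preferred.
  redist = (mkCudnn { cudatoolkit = "cudatoolkit"; libcublas = "libcublas"; }).buildInputs;
  # Older CUDA version: only the monolithic cudatoolkit is available.
  legacy = (mkCudnn { cudatoolkit = "cudatoolkit"; }).buildInputs;
}
```

Because the flag is derived from the arguments rather than exposed as an option, callers for different CUDA versions do not need to set anything extra.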

Member

Oh I see. There are manifest files for all of 11.x but they are mostly empty up until 11.4.2

Member

> There is nothing wrong with adding ? null if you handle it.

It is playing with fire. It is very easy to get things wrong and not even notice it.

FRidh (Member Author) commented Apr 19, 2022

Merged #168748.

FRidh closed this Apr 19, 2022
Labels

6.topic: cuda (Parallel computing platform and API)
10.rebuild-darwin: 1-10 (This PR causes between 1 and 10 packages to rebuild on Darwin.)
10.rebuild-linux: 11-100 (This PR causes between 11 and 100 packages to rebuild on Linux.)

4 participants