python3Packages.pytorch: migrate to cudaPackages #168745
SomeoneSerge wants to merge 8 commits into NixOS:master
samuela
left a comment
hell yeah! this looks dope. does it build and everything?
|
cudnn using libcublas instead of cudatoolkit if possible #168755 |
b6a7106 to 7cf6e3c
|
@FRidh same point arises: this assumes redist packages are present unconditionally, i.e. it only works for cuda>=11. I think we don't have to support cuda10 in downstream derivations |
7cf6e3c to 0abfff5
09ad034 to e415e1a
Use the redistributable cuda packages, instead of runfile-based cudatoolkit
|
I'm not sure I'm ready for this to be merged:
|
|
Now magma uses redist packages too, reducing the reported closure size of pytorch from |
|
...note that this change doesn't support building magma with cuda10 (pre-cuda11.4). We could account for those as well, but I thought it's OK to ignore because |
|
Interesting that |
Although I think it may technically be possible to hook up NVIDIA hardware to a macOS system, I've never heard of anyone actually doing this... except maybe for kicks? |
|
RE: I've actually just noticed a bug with the original expression.
❯ nix eval --impure --expr '((builtins.getFlake github:NixOS/nixpkgs/master).legacyPackages.x86_64-linux.python3Packages.pytorch).cudaSupport'
false
❯ nix eval --impure --expr '((builtins.getFlake github:NixOS/nixpkgs/master).legacyPackages.x86_64-linux.python3Packages.pytorchWithCuda).cudaSupport'
error: Package ‘cudatoolkit-11.6.1’ ... has an unfree license (‘unfree’), refusing to evaluate
This happens because of the following line in
assert !cudaSupport || magma.cudatoolkit == cudatoolkit;
So, just to verify, in this PR we don't seem to be repeating that same error, although the whole thing kind of smells
❯ nix eval --impure --expr '((builtins.getFlake (toString ./.)).legacyPackages.x86_64-linux.python3Packages.pytorch).cudaSupport'
false
❯ nix eval --impure --expr '((builtins.getFlake (toString ./.)).legacyPackages.x86_64-linux.python3Packages.pytorchWithCuda).cudaSupport'
true
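To illustrate why merely reading `cudaSupport` can fail, here is a minimal sketch in plain Nix. It assumes only that evaluating the toolkit raises an error, which is modelled with `throw`. The names `toolkit` and `mkPackage` are hypothetical stand-ins, not the real nixpkgs expressions. An `assert` at the head of a function body is evaluated before any attribute of the result can be read, so the comparison is forced whenever `cudaSupport` is true.

```nix
let
  # Hypothetical stand-in for a package whose evaluation is refused.
  toolkit = throw "unfree license, refusing to evaluate";

  # Hypothetical stand-in for the package function; the assert guards the whole body.
  mkPackage = { cudaSupport }:
    assert !cudaSupport || toolkit == toolkit;
    { inherit cudaSupport; };
in
{
  # `||` short-circuits, so the comparison is never evaluated here.
  withoutCuda = (mkPackage { cudaSupport = false; }).cudaSupport;

  # The comparison forces `toolkit`, so the attribute access fails;
  # tryEval catches the error and reports success = false.
  withCuda = builtins.tryEval (mkPackage { cudaSupport = true; }).cudaSupport;
}
```

Evaluating this with `nix eval --file` (or `nix-instantiate --eval --strict`) should give `withoutCuda = false` and `withCuda.success = false`.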
|
|
I suspect we might still run into similar evaluation issues if we make an overlay resulting in lack of pointer-equality 🤔 |
pytorch = callPackage ../development/python-modules/pytorch {
  # ...
  cudaPackages = pkgs.cudaPackages.overrideScope' (final: prev: { });
}
Then ...and |
|
Might want to write a function that takes a list of sets and a list of packages and tests for every package whether it is equal in every set. |
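A rough sketch of what such a function could look like, assuming the "packages" are given as attribute names and the "sets" as attribute sets. The name `sameInAll` and the toy sets are hypothetical and not part of nixpkgs. Equality is delegated to `==`, which is the implementation detail that would later move behind a dedicated comparison function.

```nix
let
  # Hypothetical helper: true when every attribute named in `names`
  # has an equal value in every set in `sets`.
  sameInAll = names: sets:
    builtins.all
      (name:
        let
          values = map (set: set.${name}) sets;
        in
        # An empty list of sets is trivially consistent.
        values == [ ] || builtins.all (v: v == builtins.head values) values)
      names;

  # Toy stand-ins for package sets; real use would pass e.g. cudaPackages sets.
  setA = { cudatoolkit = "toolkit-a"; cudnn = "cudnn-a"; };
  setB = { cudatoolkit = "toolkit-a"; cudnn = "cudnn-a"; extra = 1; };
  setC = { cudatoolkit = "toolkit-b"; cudnn = "cudnn-a"; };
in
{
  consistent = sameInAll [ "cudatoolkit" "cudnn" ] [ setA setB ];
  inconsistent = sameInAll [ "cudatoolkit" "cudnn" ] [ setA setC ];
}
```

Here `consistent` evaluates to `true` and `inconsistent` to `false`. Taking names rather than the packages themselves keeps the helper independent of how the values are compared.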
|
Just one thing I don't like about this is that the user cannot disable those asserts downstream, but I can live with that |
|
|
We definitely should write our own equality function just for isolation, regardless of whether it falls back to |
db047d2 to befe56a
assert !cudaSupport || magma.cudatoolkit == cudatoolkit;
# We expect referential equality of all cudaPackages used to ensure consistency
# You can make an overlay and pass the same cudaPackages to pytorch, mpi, and magma
# TODO: `==` is an implementation detail; move comparison logic to cudaPackages
|
@SomeoneSerge if you don't mind, I'd like to take this up and continue where you left off. |
|
@ConnorBaker this would be just great, please do! |

Use the redistributable cuda packages, instead of runfile-based
cudatoolkit
Build log: https://gist.github.com/SomeoneSerge/4e5b5e9d9ee82c410eebad60d0cdc8f7
All went awry, it used cudatoolkit through cudnn. Reworking
EDIT later on 2022-04-15: Now it actually builds with cuda-redist! Rebased on the cudnn PR with removed propagatedBuildInputs, adjusted, and pushed
CC @NixOS/cuda-maintainers