rPackages: compile r-torch from source #344593
PhDyellow wants to merge 3 commits into NixOS:master
Issues:
- Nixpkgs doesn't have pytorch 2.0.1
- liblantern fails to compile
Pinging @SomeoneSerge, as you suggested you are willing to help with this.
After posting this PR, I continued to test out different approaches to getting it to build. I am documenting what I have tried here, and what failed.
Searching online, it looks like the fbgemm issue is a pytorch issue in version 2.0.1. I added the patch and tweaked it to suit nixpkgs. Now
That's where this PR is at for now. Hopefully the
After patching
Tensorboard failed to build if I overrode
The current code in this PR does not override
The current issue is that the build for
UPDATE: while re-running the build to get a build log to attach, I saw the errors this time. It appeared that there were issues with not being included in a file again, so uint8_t was not defined, but because it was part of an
Adding
I see it is a GCC13 change, where some
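For readers hitting the same class of error, here is a minimal sketch of one common way to work around a missing `uint8_t` declaration from Nix. It rests on two assumptions that do not come from this thread: that the missing header is `<cstdint>` (the standard header that declares `uint8_t`), and that the affected file is the placeholder path shown. The function takes any derivation `pkg`; none of the names below are taken from the PR's code.

```nix
# Sketch only. <cstdint> is an assumption, and the file path is a
# hypothetical placeholder; substitute the file named in the build log.
pkg:
pkg.overrideAttrs (old: {
  postPatch = (old.postPatch or "") + ''
    # Insert the include as the first line of the file that uses uint8_t
    # without declaring it (GNU sed one-line insert syntax).
    sed -i '1i #include <cstdint>' path/to/affected_header.h
  '';
})
```

The appended `postPatch` keeps any existing patch steps intact, so the workaround can be dropped again once upstream adds the include.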
It does not run correctly,
Description of changes
The R package `torch` provides an R interface to `libtorch` that, for speed, does not call intermediate Python code. `torch` can be compiled with CUDA support, allowing R users to work on GPUs. Currently, `torch` is broken in master but not marked as such.
`torch` has a relatively complex setup for an R package. The R code includes an interface to a compiled library, `liblantern`, which interfaces with the API of `libtorch`. CUDA support can be compiled in. During installation, the R code is built, but if the env variable `BUILD_LANTERN` is set to `1`, `torch` will attempt to compile `liblantern` as well. Otherwise, during first use of the package, `torch` will attempt to either build or fetch binaries for `liblantern` and `libtorch`. The fetched binaries need to be patched to work with Nix and NixOS, and the default installation directory is probably in the nix store, so the binaries have to be present during the nix build.
In #271342, I succeeded in downloading the binaries during the build phase and patching them, and I was able to get work done on my local machine with GPU acceleration. I also managed to use the same PR code to run `torch` on an HPC cluster using nix-gl-host.
More recently, #328980 took another approach to downloading the binaries. When `torch` is downloaded as a precompiled binary from CRAN, rather than as source code, `liblantern` and `libtorch` are included. The binaries still need patching, but the nix code is cleaner, as the correct versions are already bundled together.
Using precompiled binaries in Nixpkgs is not ideal: it goes against the spirit of nixpkgs, and it duplicates `libtorch` and related CUDA libraries in the nix store. However, `torch` is packaged on CRAN, and therefore it is provided by nixpkgs. It would be a confusing user experience for one CRAN package to require an unusual installation, such as setting up an overlay. I recommend merging one of the binary solutions when they are ready.
This PR aims to compile `torch` and `liblantern` from source, making use of the existing CUDA libraries and `libtorch` in nixpkgs. It would make #328980 and #271342 redundant if we can succeed. This PR is a draft because `liblantern` fails to build, and I would like to work with @SomeoneSerge to fix it.
The current issues:
- `liblantern` fails to build. Build logs below.
- `libtorch` version 2.0.1, which `liblantern` expects
- `liblantern` does not appear to use, I have not addressed this yet.
Some surprises I encountered while getting this PR to its current state:
- `liblantern` seems to have a bug in the CMake configuration files. Fixed by `substituteInPlace`
- `liblantern` source stripped out. Fixed by overriding the `src` attribute to point to the git repo.
- `torchWithCuda.dev`
Things done
- `nix.conf`? (See Nix manual)
  - `sandbox = relaxed`
  - `sandbox = true`
- `nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see nixpkgs-review usage
- `./result/bin/`)
Add a 👍 reaction to pull requests you find important.
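To make the approach in the description concrete, here is a rough sketch of how the three pieces mentioned there (setting `BUILD_LANTERN`, pointing `src` at the git repo, and a `substituteInPlace` fix for the CMake files) could fit together in one override. It is an illustration under stated assumptions, not the code in this PR: the `rev` value, the CMake file path, and the replaced strings are hypothetical placeholders, the hash is deliberately fake, and the extra build inputs that `liblantern` needs (CMake, `libtorch`, CUDA libraries) are omitted.

```nix
# Sketch only, not the code in this PR.
{ pkgs ? import <nixpkgs> { } }:

pkgs.rPackages.torch.overrideAttrs (old: {
  # Build from the upstream git repository, which the description says
  # fixes the stripped-out liblantern source.
  src = pkgs.fetchFromGitHub {
    owner = "mlverse";
    repo = "torch";
    rev = "REPLACE_WITH_TAG_OR_COMMIT"; # hypothetical placeholder
    hash = pkgs.lib.fakeHash;           # replace with the hash Nix reports
  };

  # Exported as an environment variable during the build, so the R
  # package's install step attempts to compile liblantern as well.
  BUILD_LANTERN = "1";

  postPatch = (old.postPatch or "") + ''
    # Hypothetical file and strings: the actual CMake fix is not shown here.
    substituteInPlace src/lantern/CMakeLists.txt \
      --replace-fail "OLD_CMAKE_SNIPPET" "NEW_CMAKE_SNIPPET"
  '';
})
```

Using `pkgs.lib.fakeHash` makes the first build fail with the expected hash, which can then be pasted in; `--replace-fail` aborts the build if the pattern is not found, so a silent no-op patch is less likely after an upstream update.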