cudatoolkit: prune broken symlinks in postFixup #217322
|
Result of 2 packages marked as broken and skipped:
7 packages failed to build:
30 packages built:
|
654c9f4 to edd9beb
I couldn't figure out a way to test with If there's a better way to handle PRs which depend on each other, please let me know. My current plan is to continue to rebase this as the PRs it depends on are merged into master. |
|
Result of 2 packages marked as broken and skipped:
2 packages failed to build:
35 packages built:
|
|
cc @NixOS/cuda-maintainers |
As cudatoolkit is currently written, 11.8 introduces a broken symlink in `include` (also named `include`) and in `lib` (named `lib64`). This trips up some consumers, like `tensorflow-gpu`.
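For illustration, pruning dangling symlinks from an output tree could be sketched in shell roughly as below. This is a minimal sketch under stated assumptions, not the code from this PR: the function name `prune_broken_symlinks` is hypothetical, and the idea that it would be called on `$out` during `postFixup` is an assumption based on the PR title.

```shell
# Hypothetical helper (not taken from the PR): delete every symlink under
# the directory given as $1 whose target does not resolve.
prune_broken_symlinks() {
  # `test -e` follows symlinks, so it fails only for dangling links;
  # each dangling link is printed and then removed.
  find "$1" -type l ! -exec test -e {} \; -print -exec rm -f -- {} \;
}

# Example (assumes $out is the output directory being fixed up):
# prune_broken_symlinks "$out"
```

Links that resolve are left untouched, so only entries like the broken `include/include` and `lib/lib64` described above would be removed.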
edd9beb to 476de5c
|
Moved the comment out of the script and into Nix as requested. Also removed the PR stack information from the OP and rebased on master instead since it can be merged independently of the others. If you do run |
|
@samuela would you mind taking a look if you have a chance? |
|
running a nixpkgs-review run rn just to double check |
|
btw do TF/JAX build against CUDA 11.8 after this change? it would be great if we could finally upgrade our |
|
I believe JAX does, though I remember TensorFlow failing at the very end of its very, very long build with an error about GLIBCXX. I haven't looked too much at it since it's been a constant failure across all of my PRs, but I wonder if it has something to do with compiler versions gone awry? Might need to make sure the derivation is being built
EDIT: the error is linked in the OP: https://gist.github.com/ConnorBaker/06ceb965a933ae0659dfce58f9a8c654#file-kv64kshayamrvh5jsfzkx9hzi2dsq81l-tensorflow-gpu-2-11-0-drv-log-L288
ImportError: /nix/store/ps7an26cirhh0xy1wrlc2icvfhrd39cj-gcc-11.3.0-lib/lib/libstdc++.so.6: version `GLIBCXX_3.4.30' not found (required by /nix/store/jlx2nfpi73sjb0f3096cly5ik8arw9k9-icu4c-72.1/lib/libicuuc.so.72)
EDIT2: Seems like this has been reported elsewhere: #216361. |
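The `ImportError` above says that the `libstdc++.so.6` being loaded does not provide the `GLIBCXX_3.4.30` symbol version that `libicuuc.so.72` requires. One rough way to check whether a given library file mentions a version is to search the binary for the version string. The helper name `has_glibcxx_version` is hypothetical, and this is only a sketch; inspecting the version sections with `readelf -V` would be more precise.

```shell
# Hypothetical helper: succeed if the file at $1 contains the symbol
# version string $2 (e.g. GLIBCXX_3.4.30) as a whole word.
has_glibcxx_version() {
  # -a: treat the binary as text; -w: whole-word match, so GLIBCXX_3.4.3
  # does not match GLIBCXX_3.4.30; -F: fixed string; -q: exit status only.
  grep -a -q -w -F -- "$2" "$1"
}

# Example (the store path is the one from the error message above):
# has_glibcxx_version /nix/store/ps7an26cirhh0xy1wrlc2icvfhrd39cj-gcc-11.3.0-lib/lib/libstdc++.so.6 GLIBCXX_3.4.30
```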
|
ugh yeah TF is the worst to build... well that's not due to anything in this change. Seems like progress since IIRC before TF was failing at the beginning of the build with the same error that JAX had. So that's great news! |
|
All of the failures I see from running
Result of 2 packages marked as broken and skipped:
4 packages failed to build:
33 packages built:
|
|
Result of 2 packages marked as broken and skipped:
4 packages failed to build:
33 packages built:
|
|
LGTM thanks so much @ConnorBaker ! |
|
This pull request has been mentioned on NixOS Discourse. There might be relevant details there: https://discourse.nixos.org/t/tweag-nix-dev-update-45/26397/1 |
Description of changes
As cudatoolkit is currently written, 11.8 introduces a broken symlink in `include` (also named `include`) and in `lib` (named `lib64`).
This trips up some consumers (e.g., it causes the build of `tensorflow-gpu` to fail), and will be a problem when switching to 11.8 as the default.
Things done
`sandbox = true` set in `nix.conf`? (See Nix manual)
`nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD"`. Note: all changes have to be committed, also see nixpkgs-review usage
`./result/bin/`)
Failures
`master` after adding `TensorRT-8.4.0.6.Linux.x86_64-gnu.cuda-11.6.cudnn8.3.tar.gz` to my store (per the guidance in the first error message, https://gist.github.com/ConnorBaker/9459ce1c741984959e78296c15e52f1e).
`master`.