python3Packages.torch: 1.13.1 -> 2.0.0 #222273
|
Result of 58 packages failed to build:
96 packages built:
|
Result of 75 packages failed to build:
79 packages built:
|
Result of 75 packages failed to build:
83 packages built:
|
Ah, I forgot to rename |
|
@SomeoneSerge not sure how relevant these are to you (I'll probably throw them in issues at some point) but I was in the process of rewriting the 1.13 derivation to fix some stuff. You may be interested in some of the changes (I included a list of things I changed): https://gist.github.com/ConnorBaker/15df60d501eab2ca9a48072ea54d2786 |
|
@ConnorBaker Oh, this is so great! I was looking at some of the same annoyances and thinking "maybe we catch up with 2.0.0 before we refactor?" I'm glad to see you got these changes working. I think the refactor should go in a separate PR. I also have a suspicion the review for the current one may take a while (e.g. I'm touching llvmPackages_rocm in a somewhat un-canonical way, etc? nixpkgs-review runs are hefty too, and I'm waiting for another one), so should you feel like your change is ready and you want to take priority, I won't mind |
|
Result of 56 packages failed to build:
102 packages built:
|
❯ patchelf --print-rpath /nix/store/d6n4mdxssvf596n1rvmvmll0dxqixykw-python3.10-triton-2.0.0/lib/python3.10/site-packages/triton/_C/libtriton.so
/nix/store/rg9b3w1rqm7b5cwr9g8d0z6jii1g462n-zlib-1.2.13/lib:/nix/store/q75826qw95k82qkmqnj3haw63zp73w2g-ncurses-6.4/lib:/nix/store/8bmp6r3a0xfha3wj36phlc47clh9w81l-glibc-2.35-224/lib:/nix/store/yiflcg7zmirny3654g8l8f85sz958gqk-gcc-11.3.0-lib/lib
CC @katanallama can you give this another try? |
Can confirm that now matches. Though now |
|
@katanallama yes, the offending line is https://github.com/openai/triton/blob/3239c93a934d3e8d431da00d9076e7f9b6d7f69d/python/triton/compiler.py#L1513 I think I'm just going to patch it for now |
I believe fixes the ptxas issue as it's still executable for me. |
|
Ok, merging is currently blocked on https://discourse.nixos.org/t/prs-ready-for-review/3032/2037. If we fail to find someone with the right hardware in the next few days, I vote that we go ahead and merge. |
|
I suggest that we merge tomorrow. Even if we break rocm, we'll just get feedback faster and fix things. The release is not an issue since we can always backport.
…with cuda-compatible stdenv
catch up with pytorch 2.0.0 and updated interfaces
The drawback of this is that the comments now affect outPaths. Hopefully, though, we'll remove this preFixup soon anyway.
Co-authored-by: Sandro <sandro.jaeckel@gmail.com>
Looks like we'll need a rebase, but other than that I agree we're good to go! |
|
Thanks so much for your hard work here @SomeoneSerge ! This is an important update for the whole ML ecosystem. |
|
On Darwin: |
|
@Et7f3 Hmm, this looks to be a C++11 vs 17 issue:
Perhaps this is the sort of thing we should raise with upstream? |
Hydra can't cache it anyway at the moment because none of the workarounds for |
Thanks, I had no idea! |
Warnings
XXXX-XX-XX: After this change, torchWithRocm is going to require allowUnfree = true because python3Packages.openai-triton assumes a copy of cudatoolkit's ptxas in their site-packages. I'm preparing a follow-up PR that patches this around, but in the meantime hydra won't be building torchWithRocm
2023-04-06: Upstream uses a fork of triton: https://github.com/ROCmSoftwarePlatform/triton/releases/tag/pytorch-triton-rocm-v2.0.1, which follows LLVM 17. This PR uses OpenAI's triton repo and LLVM 15, so beware of possible deviations
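For anyone affected by the first warning, a minimal sketch of building torchWithRocm locally with unfree packages allowed might look like the following. The file name shell.nix and the use of mkShell are assumptions made for illustration, not part of this PR; only allowUnfree and the torchWithRocm attribute come from the warning itself.

```nix
# shell.nix (hypothetical file, not part of this PR)
let
  # Import nixpkgs with unfree packages allowed, since openai-triton
  # currently pulls in cudatoolkit's ptxas (see the warning above).
  pkgs = import <nixpkgs> {
    config.allowUnfree = true;
  };
in
pkgs.mkShell {
  packages = [
    # A Python environment containing the ROCm-enabled torch build.
    (pkgs.python3.withPackages (ps: [ ps.torchWithRocm ]))
  ];
}
```

Since hydra won't be building torchWithRocm in the meantime, expect this to compile from source rather than substitute from the binary cache.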
Description of changes
Things done
sandbox = true set in nix.conf? (See Nix manual)
nix-shell -p nixpkgs-review --run "nixpkgs-review rev HEAD". Note: all changes have to be committed, also see nixpkgs-review usage
./result/bin/)